Filling Missing Dates for SQL Server Query Output using CTE

Last week one of my colleagues requested me to help him write a query to fill in missing dates in query output. I came across couple of solutions, neither seemed convenient to me. So, I compiled my own using recursive CTE or Common Table Expression.

Problem Statement

Let's say we have a table which contains incoming call records of a customer care from 1st to 10th June, 2021. In some days, there is no call record. If we run the GROUP BY statement on datetime column, some days will be missing. Desired output is, missing dates will be 0 value. Sample output will be below:

Query

SELECT CONVERT(varchar(10),B.call_time,111) AS OriginalDate, COUNT(*) as total
FROM Test1 B
GROUP BY CONVERT(varchar(10),B.call_time,111)
ORDER BY CONVERT(varchar(10),B.call_time,111)

Sample Output

Desired Output

My Approach to Solution

Rather than using simple GROUP BY query, CTE and SUB QUERY are used. Recursive CTE is used to generate the date range and LEFT OUTER JOIN is used to combine the value with the date. Let's explain step-by-step.

CTE/Common Table Expression

CTE or Common Table Expression specifies a temporary named result set which is derived from a simple query and defined within the execution scope of a single SELECT/INSERT/UPDATE/DELETE/MERGE/CREATE VIEW statement. It can refer to itself also which is called recursive CTE.

Preparing Data

IF OBJECT_ID('Test1', 'U') IS NOT NULL
    DROP TABLE Test1;
GO

-- Create the table
CREATE TABLE Test1(
    call_time datetime,
    name    varchar(10) default ('Mehedi')
)
GO
-- Populate with sample data
INSERT INTO Test1 (call_time, name)
VALUES ('2021-06-01 08:00','A')
,('2021-06-01 09:05','C')
,('2021-06-01 12:50','E')
,('2021-06-01 16:17','D')
,('2021-06-01 18:53','G')

,('2021-06-03 11:07','F')
,('2021-06-03 13:09','A')
,('2021-06-03 16:26','E')
,('2021-06-03 19:56','C')
,('2021-06-03 21:24','A')

,('2021-06-04 19:13','A')
,('2021-06-04 11:45','B')
,('2021-06-04 15:02','C')

,('2021-06-08 23:02','A')
,('2021-06-09 03:04','E')

Build The Query

First, we will write a CTE which will generate all the dates within the date range.

DECLARE @StartDate DATE, @EndDate DATE
SET @StartDate = '2021-11-01'
SET @EndDate = '2021-11-08'

;WITH cte AS
(    SELECT @StartDate AS sDate
    UNION ALL
    SELECT DATEADD(DAY,1,sDate)
    FROM cte
    WHERE sDate < @EndDate
)

SELECT  sDate
FROM cte;

Now this CTE is will be refactored to make a sub query with LEFT OUTER JOIN so that the date which does not have the value appears and contains 0 value.

DECLARE @startdate DATETIME = '2021-06-01'
DECLARE @endDate DATETIME = '2021-06-10'

;WITH cte
    AS
    (
        SELECT @startdate as sDate
        UNION All
        SELECT DATEADD(day,1,sDate) From cte where DATEADD(day,1,sDate) <= @endDate
    )
    SELECT
    C.OriginalDate
    ,C.total
    FROM
    (
        SELECT CONVERT(varchar(10),A.sDate,111) AS OriginalDate, COUNT(B.call_time) as total
        FROM cte A
        LEFT OUTER JOIN Test1 B
        ON A.sDate = CONVERT(varchar(10),B.call_time,111)
        GROUP by CONVERT(varchar(10),A.sDate,111)
    ) C
   ORDER BY C.OriginalDate

Final Output

Conclusion

Hope, it will be helpful for you. Happy TSQLing!

Add comment