3 Practical Tips using TSQL Windows Functions

Windows functions are one of the most powerful T-SQL programming elements.  These functions return a single aggregated value for each row within a set of rows (or window).  I’ve spoken with several good programmers who have not used SQL extensively.  They are often surprised by the power of Windows functions and how performant they can be on larger datasets.  This post is not designed to teach T-SQL or the intricacies of Windows function.  It will provide 3 practical ways to effectively use them.

TSQL Windows Function to Calculate Running Totals

Being able to calculate running totals is super useful. In the following example, I have a project with a start and end month… and I’m curious to see how many workdays (total days less weekends and holidays) that represents.  If I know the current month, I can then easily work out how many workdays to complete and other useful things.

So, let’s assume there is a project table called DimProject that has the project name, project start and end months, and also the current month (Jul 2021 in this case).  And a yearMonth table that has each month and the number of workdays.  See the image below.

TSQL Windws Functions

I can now join the DimProject table to the YearMonth table to see the project by month, workdays each month and importantly, I use a windows function to calculate the running total of workdays each month.

Here is the TSQL:

SELECT P.yearMonthCurrent, YM.yearMonth, P.projectName, Ym.workDays,
    SUM(YM.workDays) OVER(
            PARTITION BY P.projectKey 
            ORDER BY YM.yearMonth 
            ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
        ) AS workDaysRunningTotal
FROM dbo.DimProject P
LEFT JOIN dbo.DimDateYearMonthCount YM on YM.yearMonth >= P.yearMonthStart AND YM.yearMonth <= P.yearMonthEnd

Here is the result:

TSQL Windows Function Running Totals

For each month a window is created between the first row and the current month and then we simply sum the WorkDays.  Perfect.

TSQL Windows Function to Calculate Rolling Averages

Calculating and visualising rolling averages can be very useful in understanding a given data set. Fortunately, calculating them is also straightforward.

Let's assume we have a simple table that contains sales by month.  See Below.

T-SQL Windows Functions - Rolling Average

Let's say we want to calculate a 3-month rolling average.  In Nov 2021 or 202111 using the YYYYMM format the 3 months are Sep = 11, Oct = 9, and Nov = 10... and the average is 10.   To calculate this for all months we can use the following T-SQL... and of course the output we could push to a table, or show in a view, or use in a larger code block (stored procedure) to help calculate other relevant information.

SELECT YearMonth, QuantitySold, 
		AVG(QuantitySold) OVER (
		        ORDER BY YearMonth 
		        ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
            ) as ThreeMthRollingAvg
FROM [dbo].[QuantityByMonth]

Here is the result, and if you review 202111 you'll see the 3 month rolling average is 10.

T-SQL Windows Function Rolling Average

Effectively the code is creating a window for each month that includes the current row and the 2 preceding rows.  This window provides the 3 quantities and the AVG function generates the desired result.  Simple.

TSQL Windows Function to Remove Duplicates

It would be remiss to not mention removing duplicates.  The ROW_NUMBER() windows function is perhaps the best-known windows function, and using it to identify and/or remove duplicates is a common use-case.

The following code block creates and populates a table with multiple Department records. Initially, the table contains 11 records but several of these records have the same Department code.

The code assumes we need to clean this up and display just the current (or most recent) updates.  As such it identifies duplicate codes and deletes all but the most recent.  It achieves this by creating a partition for each department ordered by update date with the most recent first.  The most recent is given the ROW_NUMBER value 1, the next 2 and so on.  The code deletes all records that have been given a value greater than 1.

DROP TABLE IF EXISTS dbo.DimDepartment_DeDupeExample
	CREATE TABLE dbo.DimDepartment_DeDupeExample (
		DepartmentCode nvarchar(10) NOT NULL,
		DepartmentDesc nvarchar(50) NOT NULL,
		UpdateDate datetime NOT NULL
	)
	INSERT dbo.DimDepartment_DeDupeExample (DepartmentCode, DepartmentDesc, UpdateDate) 
	VALUES (N'D002', N'Marketing', '2022-01-01')
	INSERT dbo.DimDepartment_DeDupeExample (DepartmentCode, DepartmentDesc, UpdateDate) 
	VALUES (N'D003', N'Sales', '2022-01-01')
	INSERT dbo.DimDepartment_DeDupeExample (DepartmentCode, DepartmentDesc, UpdateDate) 
	VALUES (N'D001', N'IT', '2022-01-01')
	INSERT dbo.DimDepartment_DeDupeExample (DepartmentCode, DepartmentDesc, UpdateDate) 
	VALUES (N'D004', N'Finance', '2022-01-01')
	INSERT dbo.DimDepartment_DeDupeExample (DepartmentCode, DepartmentDesc, UpdateDate) 
	VALUES (N'D001', N'Information Technology', '2022-03-20')
	INSERT dbo.DimDepartment_DeDupeExample (DepartmentCode, DepartmentDesc, UpdateDate) 
	VALUES (N'D005', N'Executive', '2022-03-22')
	INSERT dbo.DimDepartment_DeDupeExample (DepartmentCode, DepartmentDesc, UpdateDate) 
	VALUES (N'D003', N'Sales Direct', '2022-04-05')
	INSERT dbo.DimDepartment_DeDupeExample (DepartmentCode, DepartmentDesc, UpdateDate) 
	VALUES (N'D001', N'Technology', '2022-04-05')
	INSERT dbo.DimDepartment_DeDupeExample (DepartmentCode, DepartmentDesc, UpdateDate) 
	VALUES (N'D006', N'Sales Indirect', '2022-04-05')
	INSERT dbo.DimDepartment_DeDupeExample (DepartmentCode, DepartmentDesc, UpdateDate) 
	VALUES (N'D007', N'Support', '2022-04-05')
	INSERT dbo.DimDepartment_DeDupeExample (DepartmentCode, DepartmentDesc, UpdateDate) 
	VALUES (N'D007', N'Sales Support', '2022-04-06')

	SELECT * FROM dbo.DimDepartment_DeDupeExample

	DELETE DeDupe FROM (
		SELECT RN = ROW_NUMBER() OVER (
		        PARTITION BY DepartmentCode 
		        ORDER BY UpdateDate DESC)
		FROM dbo.DimDepartment_DeDupeExample 
	) Dedupe
	WHERE RN > 1

	SELECT * FROM dbo.DimDepartment_DeDupeExample 
	ORDER BY DepartmentCode

Initial Table

TSQL Windows Functions ROW_NUMBER Remove Duplicates Source

Result Table

TSQL Windows Functions ROW_NUMBER Remove Duplicates Result

1 thought on “3 Practical Tips using TSQL Windows Functions”

Comments are closed.