Introduction to SQL LAG()
Introduced in SQL server 2012, the LAG() function is used to access a row at a particular offset which is before the current row. You can access the data of the previous rows in the current row. So, it can be useful if you want to do some calculations or you need a reference to the previous values. It is basically an analytical function using which we can compare values of the current row with the previous row.
The following shows the common syntax of a LAG() in SQL Server:
LAG(return_value ,offset [,default])
[PARTITION BY partition_expression, ... ] ORDER BY sort_expression [ASC or DESC], ...
In the above expression:
Step 1: We use the LAG() function as a part of the SELECT statement. The LAG() function takes 3 parameters:
- Return_value: Based on the offset specified the value will be returned. It can be any expression that returns a scalar value or it can even be just the name of the column. Make sure it is not an analytical function. This is a required parameter
- Offset: This determines the number of rows you want to lag behind to fetch the data. This is an optional parameter and by default, it will lag behind one row. It can be a subquery, column, or any expression which is evaluated to be a positive integer or can be cast or converted to a bigint. Just make sure that you don’t use analytical functions or negative values as an offset.
- Default: The default value is returned if the offset goes out of the range of the partition or it is out of scope. It can be any expression that returns a scalar value or it can even be just the name of the column. Make sure it is not an analytical function. This is an optional parameter and by default NULL value will be returned if no data is found.
Step 2: The second part of the function in which we specify how do we present the previous rows data using OVER clause and it has two parts:
- PARTITION BY clause is optional and it is used to group the data obtained using FROM clause using If this is not specified then the entire result set is considered as a single group
- ORDER BY clause is required and it sorts the data using sort_expression in ASC or DESC format. If partition_expressionis specified then the ORDER BY clause determines the order of data within the partition.
Step 3: The return type of the expression is the data type of the specified return_value. NULL is returned if default value is set to NULL and the expression is nullable.
- LAG() function is non-deterministic as opposed to deterministic functions which return the same result at any time they are called this function may return a different results at the different times even if the database state remains the same.
Examples to Implement SQL LAG()
Below are the examples to Implement SQL LAG():
1. Simple example with LAG() function
SELECT name, gender, salary,
LAG(salary)OVER (ORDER BY salary)
Explanation: This is the most basic example in which we select name, gender, and salary from the Employee table and used the LAG function to get the salary of the previous person in the LAG column. As you can see Mark gets NULL because there is no previous salary and also, we have not specified any default value, and offset is 1 so we get the immediately previous value.
2. A bit complex example with LAG() function
SELECT name, gender, salary,
LAG(salary, 2,-1)OVER (PARTITION BY gender ORDERBY salary)AS LAG
Explanation: In this example, we select name, gender, and salary from the Employee table and used LAG function to get the salary of the previous person in LAG column as done in the previous example but also, we partition the data using the gender column.So, the LAG function works within the confines of the partition. What we have done is we have set the offset to 2 and default value to -1. As we can see in the example for Pam and Sara we cannot fetch the values of previous rows which is lags 2 rows so we get the default value -1 and similarly in the Male partition we get the result.
3. Comparison of values between years
SELECT BusinessEntityID,YEAR(QuotaDate)AS YearOfSale,SalesQuota AS CurrentYearQuota,
LAG(SalesQuota, 1,0)OVER (ORDERBY YEAR (QuotaDate)) AS PreviousYearQuota
WHERE BusinessEntityID = 275 and YEAR(QuotaDate)IN('2011','2012');
Explanation: As we can see in this example LAG function is used to compare the sales of an employee in the current year and the previous year and if there is no lag then we have returned 0.00.
4. Comparison of values within partitions
SELECT BusinessEntityID,TerritoryName,SalesYTD as SalesYearToDate,
LAG(SalesYTD, 1,0)OVER (PARTITION BY TerritoryName ORDER BY YEAR(SalesYTD))AS PreviousRepSales
WHERE TerritoryName IN(N'Northwest',N'Canada')
ORDER BY TerritoryName;
Explanation: In this example, we compare the year-to-date sales between employees using the PARTITION BY clause which is used to divide the result set by TerritoryName. The computation of each partition is done separately and then the ORDER BY clause is used to order the row in each partition. The ORDER BY clause in SELECT statement sorts the whole result set.
5. Using arbitrary expressions
CREATE TABLE A(p int, q int, r int);
INSERT INTO A VALUES (2, 1,-3),(3, 4, 4),(1, 3,NULL),(5, 7, 2),(4, 1,NULL),(6, 5, 3);
SELECT q, r,
LAG(2*r, q*(SELECT MIN(q)FROM A),-r/2.0)OVER (ORDER BY p)AS i
Explanation: In the above example, we have demonstrated a variety of arbitrary expressions using LAG function.So, we have created a table and inserted values and also used SELECT and LAG functions. As you can see the return value, offset and default value all are using expressions instead of just column names or scalar values.
Hopefully, now you know what LAG function is in the SQL server and how it is used to analyze previous rows in the query using physical offsets to give us useful insights.
This is a guide to SQL LAG(). Here we discuss an introduction, syntax, parameters and examples to implement with proper outputs. You can also go through our other related articles to learn more –