T-SQL : Average Interval Length
Introduction
I have been asked to design an algorithm which calculates “Average Interval Length” for any kind of transaction system, where user can define a Limit Value and condition to verify if the transaction is greater / less than the specified Limit Value
Problem Definition
User wanted to calculate “Average Interval Length” for any kind of transactional system, suppose we have table ‘Product’ & ‘Product_Sale’ which contains information regarding Product Catalog & Product Sales for each particular day respectively. Now user wants to calculate “Average Interval Length” from following formula;
Average Interval Length = Total No. Of Days / Total No. Of Intervals
Total No. Of Days Where Sales is Less or Greater Than the Particular Limit / Total No. Of Intervals (Interval is defined as No. Of Consecutive Days where Sales is Less / Greater Than the Particular Limit Value will consider as Single Interval, any break in Days will be counted as separate Interval)
Please find sample table structures which we would be working for this problem
CREATE TABLE #PRODUCT
(
PRODUCT_CODE INT
, PRODUCT_DESC VARCHAR(200)
)
CREATE TABLE #PRODUCT_SALE
(
PRODUCT_CODE INT
, INVOICE_DATE DATETIME
, PRODUCT_SELL_PRICE NUMERIC(18,8)
, PRODUCT_SELL_QTY INT
)
Please find the script to populate these sample tables with dummy data
DECLARE @PRODUCT INT
SET @PRODUCT = 1
WHILE (@PRODUCT <= 5)
BEGIN
INSERT INTO #PRODUCT (PRODUCT_CODE, PRODUCT_DESC)
SELECT @PRODUCT AS PRODUCT_CODE, 'Product' + TRY_CAST(@PRODUCT AS VARCHAR(10)) AS PRODUCT_DESC
DECLARE @INVOICE_DATE DATE
SET @INVOICE_DATE = DATEADD(dd, -9, CURRENT_TIMESTAMP)
WHILE (@INVOICE_DATE <= CURRENT_TIMESTAMP)
BEGIN
INSERT INTO #PRODUCT_SALE (PRODUCT_CODE, INVOICE_DATE, PRODUCT_SELL_PRICE, PRODUCT_SELL_QTY)
SELECT @PRODUCT AS PRODUCT_CODE, @INVOICE_DATE AS INVOICE_DATE, ROUND(100 * RAND() + 1, 2) AS PRODUCT_SELL_PRICE, ROUND(10 * RAND() + 1, 0) AS PRODUCT_SELL_QTY
SET @INVOICE_DATE = DATEADD(dd, 1, @INVOICE_DATE)
END
SET @PRODUCT = @PRODUCT + 1
END
Since this script will generate random sample data for each execution, here is a copy of the image of sample data for reference to this problem:
Solution
We use 'CTE' to first identify rows which passed the input criteria, either we want to evaluate it with Greater / Less Than Limit for particular Limit Value, in 'CTE2' we pick very first row of each Product by 'Invoice_Date' Ascending and then recurse through the 'CTE2' by identifying whether it is same interval or different and also increasing the 'No. Of Days' counter as well as managing the 'No. Of Intervals' counter, in final 'CTE3' we identify the final counter values for both 'No. Of Days' & 'No. Of Intervals' counter and using the formula above we calculate 'Average Interval Length'
Remember, 'Limit_Price' can be any numeric value which you want to set as your Limit Value while 'Limit_Condition' can be either '1' (for Less Than Equal to condition) or '2' (for Greater Than Equal to condition)
Provide these input parameters and you will be given the output, you can also use this code inside stored procedure as well
DECLARE @LIMIT_PRICE NUMERIC(18,8), @LIMIT_CONDITION INT
SELECT @LIMIT_PRICE = '72.00', @LIMIT_CONDITION = 1
--@LIMIT_CONDITION = 1 (Use 1 when you want to compare Limit Value Less Than or Equal To)
--@LIMIT_CONDITION = 2 (Use 2 when you want to compare Limit Value Greater Than or Equal To)
;WITH CTE AS
(
SELECT P.PRODUCT_CODE, P.PRODUCT_DESC, PS.INVOICE_DATE, PS.PRODUCT_SELL_PRICE, @LIMIT_PRICE AS LIMIT_PRICE
, ROW_NUMBER() OVER(PARTITION BY PS.PRODUCT_CODE ORDER BY INVOICE_DATE ASC) AS Rn
FROM #PRODUCT P
INNER JOIN #PRODUCT_SALE PS ON PS.PRODUCT_CODE = P.PRODUCT_CODE
WHERE 1 = CASE WHEN @LIMIT_CONDITION = 1 AND PS.PRODUCT_SELL_PRICE <= @LIMIT_PRICE THEN 1
WHEN @LIMIT_CONDITION = 2 AND PS.PRODUCT_SELL_PRICE >= @LIMIT_PRICE THEN 1
END
), CTE2 AS
(
SELECT C.PRODUCT_CODE, C.PRODUCT_DESC, C.INVOICE_DATE, C.PRODUCT_SELL_PRICE, C.LIMIT_PRICE, C.Rn, 1 AS DAY_COUNT, 1 AS INTERVAL
FROM CTE C
WHERE C.Rn = 1
UNION ALL
SELECT C.PRODUCT_CODE, C.PRODUCT_DESC, C.INVOICE_DATE, C.PRODUCT_SELL_PRICE, C.LIMIT_PRICE, C.Rn
, C2.DAY_COUNT + 1 AS DAY_COUNT
, CASE WHEN C.INVOICE_DATE = DATEADD(dd, 1, C2.INVOICE_DATE) THEN C2.INTERVAL ELSE C2.INTERVAL + 1 END AS INTERVAL
FROM CTE2 C2
INNER JOIN CTE C ON C.PRODUCT_CODE = C2.PRODUCT_CODE AND C.Rn = C2.Rn + 1
), CTE3 AS
(
SELECT *, ROW_NUMBER() OVER(PARTITION BY PRODUCT_CODE ORDER BY Rn DESC) AS FinalRn
FROM CTE2
)
SELECT C3.PRODUCT_CODE, C3.PRODUCT_DESC, C3.INVOICE_DATE, C3.PRODUCT_SELL_PRICE
, CASE WHEN @LIMIT_CONDITION = 1 THEN 'LESS THAN'
WHEN @LIMIT_CONDITION = 2 THEN 'GREATER THAN'
END AS LIMIT_CONDITION
, C3.LIMIT_PRICE, C3.DAY_COUNT, C3.INTERVAL, (C3.DAY_COUNT / CAST(C3.INTERVAL AS NUMERIC(8,2))) AS AVG_INTERVAL_LENGTH
FROM CTE3 C3
WHERE C3.FinalRn = 1
ORDER BY PRODUCT_CODE, INVOICE_DATE
Here is our desired result
Conclusion
This article demonstrated solution for a common problem of Average Interval Length using set-based approach, and provided good example for Recursive CTE.