SQLCLR: Percentage User-Defined Aggregate Functions
Introduction
In this article we are going to show the use of a User Defined Function (UDF), developed using SQLCLR, which calculates the aggregate value of percentage operator.
This article (and all the code) based on This blog.
Our Case Study
We have a table (named 'PayRoll') that stores the salaries that employees received, according to their base salary. Over time, employees received pay increments, which are determined by a percentage of their current salary. All employees received the same percentage at the same time. Yet the PayRoll table does not include the actual salary. Data about the dates that the salaries increased, and the percentage of the increment, is stored in a second table (named 'Increase').
We need to get the actual salary of each employee at any time that he received his salary, including any salary rises
The Case study based on a question at MSDN forum.
**For example: **
If the employee received at 2016-02-27 a base payment of 10,000$, according to the PayRoll table, but one month before that at 2016-01-27 there were increments of 10% to the salary, then the actual payment that he received was 11.000$ (10,000 + 10% of 10,000). If there was another increment of 10% before the payment date, then the employee got 10% of the new value as a second increment, which means that at 2016-02-27 the actual payment was 12,100$.
In order to calculate the actual payment, we need to aggregate all the percentage rises that the employee received before the payment date (not summarize them, but execute them one by one on the previously calculated salary).
Preparation - DDL+DML:
Let's create our tables and insert some sample data.
/**************************************************************************** DDL */
create database TestDB
GO
use TestDB
GO
CREATE TABLE [dbo].[Increase]
(
[IncreaseID] [INT] IDENTITY(1, 1) NOT NULL,
[IncreaseDate] [DATE] NOT NULL,
[IncreasePercent] [MONEY] NOT NULL,
CONSTRAINT [PK_Increase] PRIMARY KEY CLUSTERED ( [IncreaseID] ASC )
)
GO
CREATE TABLE [dbo].[PayRoll]
(
[PayRollID] [INT] IDENTITY(1, 1) NOT NULL,
[EmployeeNo] [INT] NOT NULL,
[EmployeeName] [VARCHAR](8) NOT NULL,
[Month] [DATE] NOT NULL,
[Salary] [MONEY] NOT NULL,
CONSTRAINT [PK_PayRoll] PRIMARY KEY CLUSTERED ( [PayRollID] ASC )
)
GO
/**************************************************************************** DDL */
TRUNCATE TABLE [Increase]
INSERT [dbo].[Increase] ([IncreaseDate], [IncreasePercent]) VALUES (CAST(N'2016-04-01' AS Date), 5.0000)
INSERT [dbo].[Increase] ([IncreaseDate], [IncreasePercent]) VALUES (CAST(N'2016-09-01' AS Date), 10.0000)
INSERT [dbo].[Increase] ([IncreaseDate], [IncreasePercent]) VALUES (CAST(N'2016-11-01' AS Date), 7.0000)
GO
INSERT [dbo].[PayRoll] ([EmployeeNo], [EmployeeName], [Month], [Salary]) VALUES (123, N'John Doe', CAST(N'2016-01-01' AS Date), 5000)
INSERT [dbo].[PayRoll] ([EmployeeNo], [EmployeeName], [Month], [Salary]) VALUES (123, N'John Doe', CAST(N'2016-02-01' AS Date), 5000)
INSERT [dbo].[PayRoll] ([EmployeeNo], [EmployeeName], [Month], [Salary]) VALUES (123, N'John Doe', CAST(N'2016-03-01' AS Date), 5000)
INSERT [dbo].[PayRoll] ([EmployeeNo], [EmployeeName], [Month], [Salary]) VALUES (123, N'John Doe', CAST(N'2016-04-01' AS Date), 5000)
INSERT [dbo].[PayRoll] ([EmployeeNo], [EmployeeName], [Month], [Salary]) VALUES (123, N'John Doe', CAST(N'2016-05-01' AS Date), 5000)
INSERT [dbo].[PayRoll] ([EmployeeNo], [EmployeeName], [Month], [Salary]) VALUES (123, N'John Doe', CAST(N'2016-06-01' AS Date), 5000)
INSERT [dbo].[PayRoll] ([EmployeeNo], [EmployeeName], [Month], [Salary]) VALUES (123, N'John Doe', CAST(N'2016-07-01' AS Date), 5000)
INSERT [dbo].[PayRoll] ([EmployeeNo], [EmployeeName], [Month], [Salary]) VALUES (123, N'John Doe', CAST(N'2016-08-01' AS Date), 5000)
INSERT [dbo].[PayRoll] ([EmployeeNo], [EmployeeName], [Month], [Salary]) VALUES (123, N'John Doe', CAST(N'2016-09-01' AS Date), 5000)
INSERT [dbo].[PayRoll] ([EmployeeNo], [EmployeeName], [Month], [Salary]) VALUES (123, N'John Doe', CAST(N'2016-10-01' AS Date), 5000)
INSERT [dbo].[PayRoll] ([EmployeeNo], [EmployeeName], [Month], [Salary]) VALUES (123, N'John Doe', CAST(N'2016-11-01' AS Date), 5000)
INSERT [dbo].[PayRoll] ([EmployeeNo], [EmployeeName], [Month], [Salary]) VALUES (123, N'John Doe', CAST(N'2016-12-01' AS Date), 5000)
INSERT [dbo].[PayRoll] ([EmployeeNo], [EmployeeName], [Month], [Salary]) VALUES (456, N'Jane Doe', CAST(N'2016-01-01' AS Date), 6000)
INSERT [dbo].[PayRoll] ([EmployeeNo], [EmployeeName], [Month], [Salary]) VALUES (456, N'Jane Doe', CAST(N'2016-02-01' AS Date), 6000)
INSERT [dbo].[PayRoll] ([EmployeeNo], [EmployeeName], [Month], [Salary]) VALUES (456, N'Jane Doe', CAST(N'2016-03-01' AS Date), 6000)
INSERT [dbo].[PayRoll] ([EmployeeNo], [EmployeeName], [Month], [Salary]) VALUES (456, N'Jane Doe', CAST(N'2016-04-01' AS Date), 6000)
INSERT [dbo].[PayRoll] ([EmployeeNo], [EmployeeName], [Month], [Salary]) VALUES (456, N'Jane Doe', CAST(N'2016-05-01' AS Date), 6000)
INSERT [dbo].[PayRoll] ([EmployeeNo], [EmployeeName], [Month], [Salary]) VALUES (456, N'Jane Doe', CAST(N'2016-06-01' AS Date), 6000)
INSERT [dbo].[PayRoll] ([EmployeeNo], [EmployeeName], [Month], [Salary]) VALUES (456, N'Jane Doe', CAST(N'2016-07-01' AS Date), 6000)
INSERT [dbo].[PayRoll] ([EmployeeNo], [EmployeeName], [Month], [Salary]) VALUES (456, N'Jane Doe', CAST(N'2016-08-01' AS Date), 6000)
INSERT [dbo].[PayRoll] ([EmployeeNo], [EmployeeName], [Month], [Salary]) VALUES (456, N'Jane Doe', CAST(N'2016-09-01' AS Date), 6000)
INSERT [dbo].[PayRoll] ([EmployeeNo], [EmployeeName], [Month], [Salary]) VALUES (456, N'Jane Doe', CAST(N'2016-10-01' AS Date), 6000)
INSERT [dbo].[PayRoll] ([EmployeeNo], [EmployeeName], [Month], [Salary]) VALUES (456, N'Jane Doe', CAST(N'2016-11-01' AS Date), 6000)
INSERT [dbo].[PayRoll] ([EmployeeNo], [EmployeeName], [Month], [Salary]) VALUES (456, N'Jane Doe', CAST(N'2016-12-01' AS Date), 6000)
GO
SELECT * FROM [Increase]
SELECT * FROM [PayRoll]
GO
Step-By-Step Solution
Step 1: SQLCLR code
Copy the code to new text file using simple text editor like notepad (Do not use document editor like ms-word!), and save it with the name TNWIKI_PercentageAgg.dll
01.using System;
02.using System.IO;
03.using System.Data.Sql;
04.using System.Data.SqlTypes;
05.using System.Text;
06.using Microsoft.SqlServer.Server;
07.using System.Reflection;
08.
09.[assembly: AssemblyVersion("0.0.0.1")]
10.[assembly: AssemblyFileVersion("0.0.0.1")]
11.[assembly: AssemblyDescription("Aggregate Percentage Function")]
12.[assembly: AssemblyCompany("Ronen Ariely")]
13.
14.namespace TNWIKI
15.{
16.
17. [Serializable]
18. [Microsoft.SqlServer.Server.SqlUserDefinedAggregate(
19. //use clr serialization to serialize the intermediate result
20. //Format.Native,
21. Format.UserDefined,
22. IsInvariantToNulls = true, //optimizer property
23. IsInvariantToDuplicates = false, //optimizer property
24. IsInvariantToOrder = true, //optimizer property
25. MaxByteSize = -1 //maximum size in bytes of persisted value
26. )]
27. public class TNWIKI_PercentageAggCls : Microsoft.SqlServer.Server.IBinarySerialize
28. {
29. private decimal IntermediateResult;
30.
31. public void Init()
32. {
33. this.IntermediateResult = 0;
34. }
35.
36. public void Accumulate(decimal _value)
37. {
38. IntermediateResult = (((IntermediateResult + 100) / 100) * _value); // Perc
39. }
40.
41. ///
42. public void Merge(TNWIKI_PercentageAggCls other)
43. {
44. this.IntermediateResult = (((this.IntermediateResult + 100) / 100) * other.IntermediateResult); // Perc
45. }
46.
47. public decimal Terminate()
48. {
49. return this.IntermediateResult;
50. }
51.
52. bool _isNull;
53. public bool IsNull
54. {
55. get { return _isNull; }
56. }
57.
58. void IBinarySerialize.Write(System.IO.BinaryWriter w)
59. {
60. w.Write(IsNull);
61.
62. if (!IsNull){
63. w.Write(IntermediateResult);
64. }
65. }
66.
67. void IBinarySerialize.Read(System.IO.BinaryReader r)
68.
69. _isNull = r.ReadBoolean();
70. if (!IsNull){
71. IntermediateResult = r.ReadDecimal();
72. }
73. }
74. }
75.}
Comments and explanation about the code above
Line 29: The variable IntermediateResult holds the intermediate value of the aggregation
Line 31: Initialize the internal data structures
Line 36: This method is the actual aggregation calculation. The input parameter is the current value (the column value in the current row).
Line 38: We need to calculate IntermediateResult percentage of the current value, and add it to the current value.
Line 42: Merge the partially computed aggregate, when there is parallel process, with this aggregate. We are using the same mathematical calculation as in the Accumulate method.
Line 47: At the end of the aggregation this method is called which returns the results of the aggregation.
Step 3: Create dynamic link library (DLL file)
- Open Command shell using the command cmd, Or open power shell and type cmd to use command shell
- run --> cmd
- Move to the Dont.Net framework directory according to the version that you want to use (the code works with any version from .Net 2.0)
- cd C:\Windows\Microsoft.NET\Framework64\v4.0.30319
- cd C:\Windows\Microsoft.NET\Framework64\v2.0.50727
- Compile the code using csc app:
csc.exe /target:library /out:"<path to the fcode file>\TNWIKI_PercentageAgg.dll" "<path to the fcode file>\TNWIKI_PercentageAgg.cs"
* This will create the DLL file in the same place that our source file is located.
Step 4: Enable CLR in the Server instance
Open SSMS and execute this statement
sp_configure 'clr enabled', 1
GO
RECONFIGURE
GO
Make sure that you fully understand the implication of enabling SQLCLR!
Step 5: Embed the DLL assembly code in the database.
Execute this statement using your path to the DLL file that you created
CREATE ASSEMBLY TNWIKI_PercentageAggAssembly
FROM '<path to the fcode file>\TNWIKI_PercentageAgg.dll'
WITH PERMISSION_SET = SAFE;
GO
Step 6: Create AGGREGATE function
Create the aggregate function, using the precision and scale that fit your needs (both in the input and in the return types).
CREATE AGGREGATE TNWIKI_PercentageAgg(@input decimal(32,2))
RETURNS decimal(32,2)
EXTERNAL NAME TNWIKI_PercentageAggAssembly.[TNWIKI.TNWIKI_PercentageAggCls];
GO
That is all :-)
Now we can use this function when ever we need it
Demonstration the use of our function
SELECT
o.PayRollID, o.EmployeeNo, o.EmployeeName, o.[Month], o.Salary
, finalSalary = o.Salary * (( (SELECT dbo.TNWIKI_PercentageAgg([IncreasePercent]) FROM [Increase] i where i.IncreaseDate <= o.[Month]) + 100 ) / 100 )
,increasePercentFromBase = (SELECT dbo.TNWIKI_PercentageAgg([IncreasePercent]) FROM [Increase] i where i.IncreaseDate <= o.[Month])
FROM PayRoll o
GO
Aggregate function can be achieved by using T-SQL query, but this solution will have to loop through the entire SET. It can be done using recursive CTE for example as can be seen in the original post in the forum, but using T-SQL will probably result is very poor performance and complex query!
Comments & Conclusions
Aggregate functions perform a calculation on a set of values and return a single value. Traditionally, Microsoft SQL Server has supported only built-in aggregate functions, but integration with Common Language Runtime (CLR) allows developers to create custom aggregate functions in managed code. SQL Server aggregates require four specific methods be implemented; Init, Accumulate, Merge, and Terminate. Aggregates created in managed code are called and executed like any other SQL Server aggregate.
Summary
In this article, we saw a simple and optimal solution to a complex problem, by creating new aggregate function. We explain the basic code of the CLR and how to implement it in SQL Server.
Resources and more information
- This article based on Ronen Ariely blog
- The case study for the article and the blog based on question at the MSDN forum
- Invoking CLR User-Defined Aggregate Functions
- Requirements for CLR User-Defined Aggregates
- SQL Server: Create Random String Using CLR