Soften the RBAR impact with Native Compiled UDFs in SQL Server 2016
Reviewers: Joe Sack, Denzil Ribeiro, Jos de Bruijn
Many of us are very familiar with the negative performance implications of using scalar UDFs on columns in queries: my colleagues have posted about issues here and here. Using UDFs in this manner is an anti-pattern most of us frown upon, because of the row-by-agonizing-row (RBAR) processing that this implies. In addition, scalar UDF usage also limits the optimizer to use serial plans. Overall, evil personified!
Native Compiled UDFs introduced
Though the problem with scalar UDFs is well-known, we still come across workloads where this problem is a serious detriment to the performance of the query. In some cases, it may be easy to refactor the UDF as an inline Table Valued Function, but in other cases, it may simply not be possible to refactor the UDF.
SQL Server 2016 offers natively compiled UDFs, which can be of interest where refactoring the UDF to a TVF is not possible, or where the number of referring T-SQL objects are simply too many. Natively compiled UDFs will NOT eliminate the RBAR agony, but they can make each iteration incrementally faster, thereby reducing the overall query execution time. The big question is how much?
Real-life results
We recently worked with an actual customer workload in the lab. In this workload, we had a query which invoked a scalar UDF in the output list. That means that the UDF was actually executing once per row – in this case a total of 75 million rows! The UDF has a simple CASE expression inside it. However, we wanted to improve query performance so we decided to try rewriting the UDF.
We found the following results with the trivial UDF being refactored as a TVF versus the same UDF being natively compiled (all timings are in milliseconds):
Interpreted (classic) |
Native compiled (new in SQL Server 2016) |
TVF |
|
CPU Time |
12734 |
8906 |
3735 |
Elapsed time |
13986 |
8906 |
3742 |
As can be expected, the TVF approach is the fastest, but it is encouraging that the native compiled UDF reduced execution time by solid 36% even though the logic in the UDF was very trivial!
Test Scenario
In order to take this further, I decided to do some testing with a neutral workload. For this neutral test I used the DB1BTicket table which is 230+ million rows and in my test database had a Clustered Columnstore Index created on it.
In this test, I used two different types of UDFs: a trivial one and another one which has more conditional statements in it. The scripts for these are at the end of the post. The results with these different iterations are summarized in the table below:
Interpreted (classic) |
Native compiled (new in SQL Server 2016) |
TVF |
|
Simple UDF |
1672.239 seconds |
796.427 seconds |
10.473 seconds |
Multi-condition UDF |
3763.584 seconds |
848.106 seconds |
Not attempted |
Side Note: Parallelism
It is well known that UDFs (even those which do not access data and just do computation) cause a serial plan to be used. Here is the plan with interpreted UDF – as you can see it is serial:
Here is the plan with native compiled UDF – it is still serial:
Lastly, here is the execution plan with TVF – as you can see it is a parallel plan:
Here’s the million-dollar question to you: how badly do you want SQL Server to support parallel plans when UDFs are used anywhere in the plan? Do send us your feedback as comments.
Conclusion
While refactoring the scalar UDF as a TVF ultimately provided the best results, in cases where it is not possible to do this, using native compiled UDFs provides a very useful reduction in query execution time. Therefore, native compiled UDFs can be used as a mitigation or even considered a solution to those thorny cases where RBAR is absolutely necessary.
Appendix: UDF Definitions
Here is the simple UDF, in the classic T-SQL interpreted form:
CREATE FUNCTION dbo.FarePerMile ( @Fare MONEY, @Miles INT )
RETURNS MONEY
WITH SCHEMABINDING
AS
BEGIN
DECLARE @retVal MONEY = ( @Fare / @Miles );
RETURN @retVal;
END;
GO
Here is the simple UDF written as a native compiled version:
CREATE FUNCTION dbo.FarePerMile_native (@Fare money, @Miles int)
RETURNS MONEY
WITH NATIVE_COMPILATION, SCHEMABINDING, EXECUTE AS OWNER
AS
BEGIN ATOMIC
WITH (TRANSACTION ISOLATION LEVEL = SNAPSHOT, LANGUAGE = N'us_english')
DECLARE @retVal money = ( @Fare / @Miles)
RETURN @retVal
END
Here is the simple UDF refactored as a TVF:
CREATE FUNCTION dbo.FarePerMile_TVF ( @Fare MONEY, @Miles INT )
RETURNS TABLE
AS
RETURN
SELECT ( @Fare / @Miles ) AS RetVal;
Now, the multiple condition UDF, in the classic T-SQL interpreted form:
CREATE FUNCTION dbo.FictionalPricingLogic
(
@RPCarrier VARCHAR(2) ,
@Origin VARCHAR(3) ,
@Fare MONEY ,
@Miles INT
)
RETURNS MONEY
WITH SCHEMABINDING
AS
BEGIN
DECLARE @retVal MONEY;
DECLARE @discount MONEY = 0; -- discount percentage
IF ( @RPCarrier = 'DL' )
SELECT @discount += 0.05;
IF ( @RPCarrier = 'AA' )
SELECT @discount += 0.05;
IF ( @Origin = 'DFW' )
SELECT @discount += 0.01;
IF ( @Origin = 'SEA' )
SELECT @discount += 0.009;
IF ( @Miles > 500 )
SELECT @discount += 0.01;
SELECT @retVal = @Fare * ( 1.0 - @discount );
RETURN @retVal;
END;
Here is the multiple condition UDF written as a native compiled version:
CREATE FUNCTION dbo.FictionalPricingLogic_Native
(
@RPCarrier VARCHAR(2) ,
@Origin VARCHAR(3) ,
@Fare MONEY ,
@Miles INT
)
RETURNS MONEY
WITH native_compilation
,schemabinding
,EXECUTE AS OWNER
AS
BEGIN
atomic
WITH (
TRANSACTION ISOLATION LEVEL = snapshot
,LANGUAGE = N'us_english'
)
DECLARE @retVal MONEY;
DECLARE @discount MONEY = 0; -- discount percentage
IF ( @RPCarrier = 'DL' )
SELECT @discount += 0.05;
IF ( @RPCarrier = 'AA' )
SELECT @discount += 0.05;
IF ( @Origin = 'DFW' )
SELECT @discount += 0.01;
IF ( @Origin = 'SEA' )
SELECT @discount += 0.009;
IF ( @Miles > 500 )
SELECT @discount += 0.01;
SELECT @retVal = @Fare * ( 1.0 - @discount );
RETURN @retVal;
END;
In this test, assume that it was not worth refactoring the multiple condition UDF as a TVF.
Appendix: Test Queries
Here are the sample queries used to test the performance of each of the above UDF variations:
SET STATISTICS TIME ON;
SET STATISTICS IO ON;
SELECT AVG(dbo.FarePerMile(ItinFare, MilesFlown))
FROM DB1BTicket;
GO
SELECT AVG(dbo.FarePerMile_Native(ItinFare, MilesFlown))
FROM DB1BTicket;
GO
The below query is the version with the TVF. Note the usage of CROSS APPLY:
SELECT AVG(myTVF.RetVal)
FROM DB1BTicket
CROSS APPLY dbo.FarePerMile_TVF(ItinFare, MilesFlown) AS myTVF;
GO
SELECT AVG(dbo.FictionalPricingLogic(RPCarrier, Origin, ItinFare, MilesFlown))
FROM DB1BTicket;
GO
SELECT AVG(dbo.FictionalPricingLogic_Native(RPCarrier, Origin, ItinFare,
MilesFlown))
FROM DB1BTicket;
GO
Comments
- Anonymous
February 17, 2016
"Here’s the million-dollar question to you: how badly do you want SQL Server to support parallel plans when UDFs are used anywhere in the plan?"Here's my two-bit question to you: how many years has Microsoft had to actually fix this terrible performance pitfall by now, or even just incrementally improve on it? Almost two decades, by my measure. Whole careers have peaked and declined while the optimizer remains powerless in the face of even the most trivial deterministic functions. Meanwhile, innocent programmers from outside the world of SQL still see a function and think "nice, a chance to factor out my logic" before they learn the sad truth the hard way: the implementation of scalar-valued UDFs is so terrible that they should never be used if more than one row is involved.I know all too well that products in the real world contend with resource limits and deadlines and the question "why hasn't this been fixed yet" is rarely meaningful (or polite!) when asked by an outsider, but in this case I feel there is just no choice: we have to ask "why hasn't this been fixed yet". Because by Jove, this is no way for an enterprise database engine to behave, it's a disgrace. - Anonymous
February 19, 2016
Adding parallel plans support when UDFs are used will be great and a good reason to upgrade SQL Server where UDFs are wide spread.In our customers we have to convert UDFs to TVFs very frequently to address performance problems. Typically, time/budget restrictions make you to focus just on the “most offending” UDFs. Users tend to encapsulate a lot of business/common logic into UDFs (the same way they do it in their applications) and they find frustrating how badly they perform when they use them in complex queries, in their reports and so on. It will be also very useful to have, as alternative to UDFs in some scenarios, something like C macros so you can encapsulate easily special formatting, data type conversions, some simple calculations and so on into a macro instead of recurring to UDFs (sacrificing performance) or to “copy&paste” the code everywhere (sacrificing maintainability). - Anonymous
February 20, 2016
A wrong code has been listed under" "Here is the multiple condition UDF written as a native compiled version:" (it is the same as the classic version) - Anonymous
February 24, 2016
In 'Appendix: UDF Definitions' the definition for FictionalPricingLogic_Native is identical to that for FictionalPricingLogic. It's missing the natively compiled decorations. - Anonymous
March 10, 2016
The comment has been removed - Anonymous
March 16, 2016
It's nice to see the improvement with native compiled UDFS. As for the question about parallelism, it is pretty important. But I have to agree with JM's comments; the far more important question is why not invest some resources in making scalar interpreted UDFs that have only a RETURN inline?Cheers,Itzik - Anonymous
September 28, 2016
Your article doesn't mention an example of when a scalar function cannot be re-written as an inline table function. I almost always use inline functions to rewrite re-usable code. As I already knew and your article demonstrates, it leaves the optimizer completely free to make the best access plan. They are just as macros. I've rewritten plenty of udf scalar functions into inline functions, and people just don't attempt to convert them as soon as they see variables in them. Here an example how FictionalPricingLogic scalar UDF could be translated to into an inline function. Every variable can be redefined as a column that is the output of a CTE and reused in the select statement that is going to produce the result. CREATE FUNCTION dbo.FictionalPricingLogic_TVF( @RPCarrier VARCHAR(2) , @Origin VARCHAR(3) , @Fare MONEY , @Miles INT)RETURNS TableASReturn(With Prm as (select @RpCarrier as RpCarrier, @Origin as Origin, @Fare as Fare, @Miles as Miles), Discounts as ( Select * From Prm Cross Join ( Values ('Carrier', 'DL', 0.05), ('Carrier', 'AA', 0.05) , ('Origin', 'DFW', 0.01), ('Origin', 'SEA', 0.009) , ('Distance', NULL, 0.01) ) as DiscountTable (DiscountType, Code, discount) ), IndividualDiscounts As ( Select discount From Discounts Where DiscountType = 'Carrier' And Code = RpCarrier UNION ALL Select discount From Discounts Where DiscountType = 'Origin' And Code = Origin UNION ALL Select discount From Discounts Where DiscountType = 'Distance' And Miles > 500 ), SumOfDiscounts (TotalDiscount) As (Select ISNULL(Sum(Discount),0.0) From IndividualDiscounts)Select Fare * (1.0 - TotalDiscount) as DiscountedPriceFrom Prm Cross Join SumOfDiscounts )goSELECT AVG(myTVF.DiscountedPrice)FROM DB1BTicket CROSS APPLY dbo.FictionalPricingLogic_TVF(ItinFare, MilesFlown) AS myTVF;GO