Performance Impact of Server-side Generated GUIDs in EF
The information in this post is out of date.
Visit msdn.com/data/ef for the latest information on current and past releases of EF.
For Performance Considerations see https://msdn.com/data/hh949853
Background
EF4 supports the use of server-generated GUIDs, which has been a popular request since EF 3.5. Lee Dumond summarized his experience on how to use this feature very well. But you may find that the performance of server-generated GUIDs is not as fast as client-side generated GUIDs or even the server side Identity columns of other types like int, smallint etc. I would like to take a few moments to explain why that is and how we are going to improve it.
Performance Analysis
First of all, we need to take a look at the SQL script that is generated by EF. Based on the types of server side generated key column, there are two variations of SQL as listed below,
1. Identity type supported by Scope_Identity(), such as int, bigint and smallint, etc. For example in the statement below, the Id is an int.
insert [dbo].[Employee]([Name]) values ('Li') select [Id] from [dbo].[Employee] where @@ROWCOUNT > 0 and [Id] = scope_identity() |
2. Identity type not supported by Scope_Identity(), such as uniqueidentifier. For instance the Id in the statement below is a uniqueidentifier.
declare @generated_keys table([Id] uniqueidentifier) insert [dbo].[Customer]([Name]) output inserted.[Id] into @generated_keys values ('Lei') select t.[Id] from @generated_keys as g join [dbo].[Customer] as t on g.[Id] = t.[Id] where @@ROWCOUNT > 0 |
To measure the relative execution time, we run those two insert statements in one batch with Actual Execution Plan. As you can see in the diagram below, the second batch of SQL runs twice slower than the first batch.
Reason
You may think why not just get rid of the table variable in the second SQL script (as shown in the table below) to make it faster.
insert [dbo].[Customer]([Name]) output inserted.[Id] values ('Lei') |
The reason is that OUTPUT clause without INTO in update statements can be problematic when you have triggers defined on the table. You can read more about this here and here. Basically we want to make sure when the table has triggers on insert, our generated SQL will still work.
Workaround and Future Improvements
If you want to have better performance on the insert, you can always generate the GUID on the client side.
We would like to improve the performance in server-generated GUID scenarios by supporting Function in the DefaultValue attribute on Property definition in SSDL, so that user can define a store function “newid()” in SSDL and reference it in the DefaultValue of the Id Property on the Customer table. EF will then generate the following SQL which improves the performance considerably (50% faster).
declare @generated_key uniqueidentifier set @generated_key = newid() insert [dbo].[Customer]([Id],[Name]) values (@generated_key, 'Lei') select @generated_key as [Id] |
Here is the Actual Execution Plan from running both the new SQL script and the original SQL script.
Lei Li
Developer, Entity Framework Team
Comments
Anonymous
July 05, 2010
But you have not taken into account the fact that using newid() will generate page splits, which will also hurt performance. If you use the OUTPUT clause with a newsequentialid() in a DEFAULT you would prevent these splits.Anonymous
July 06, 2010
I tried to write this comment a week ago but it didn't go through for some reason....
- As Nils mentioned, using newid() causes fragmentation and is generally bad for db performance. if linq supported database defaults, we could use newsequentialid() with it which solves this problem.
- I don't understand why there's a join to the destination table when you use identity or uniqueidentifier - you get the data from either the output clause or the scope_identity() function. if you eliminate the join, you can get a little extra in performance.
- I know it's not the prettiest solution, but if we had a switch that tells linq to produce code with output into clause and support for triggers, or to produce code with a simple output clause (and no support for triggers) - wouldn't it just solve this problem? Cheers, Saggi
Anonymous
July 21, 2010
Guids are indeed a problem, but when using EF this implies you do model driven design and the reason why guids are used is to be able to create links (e.g. 1:n parent child) at client-side or the server-side business logic before saving the object graph. SQLServer generated guids will not help you here. What we use is an algorith that simulates the SQLServer logic generating 'sequential guids' thus eliminating page splits and yet allow pre-save model composition.Anonymous
July 28, 2010
Will the new feature allow the use of newsequentialid ? We already use this in all our EF insert functions to generate server side GUIDs.