Benchmarking Index Fragmentation With Popular Sequential GUID Algorithms
In a multi-tenant data insert scenario, GUID is a great data type for primary key. Since it is unique globally, you will not need to rely on any range or custom logic. However, if you extend these columns as index, you will end up with frequent index fragmentation and hence bad performance. In this post, I will be benchmarking index fragmentation with popular sequential Guid algorithms.
Option 1 (GUID using NewGuid): In this option, we will create a GUID from the application and then insert the same into the table. We will use this as the base benchmark for the comparison. We are already aware of the fact that this option ends up in huge defragmentation. This option is no different from calling NewId() function at SQL.
Option 2 (Sequential GUID using SQL’s NewSequentialId): In this option, we will let the SQL to create the sequential GUID using the inbuilt function.
Option 3 (Sequential GUID based on UuidCreateSequential): In this option, we will write an algorithm which is equivalent to NewSequentialId. This will use the UuidCreateSequential; a COM system method. NewSequentialId also shares the same COM method. This algorithm is posted by David Browne (fellow MS employee) and his original post could be found at https://blogs.msdn.microsoft.com/dbrowne/2012/07/03/how-to-generate-sequential-guids-for-sql-server-in-net/
Option 4 (GUID based on NHibernate’s GuidCombGenerator): In this option, we will use the algorithm NHibernate uses to generate a GUID for certain features. Many developers have claimed to use this algorithm to overcome the issue we are discussing. The source code for the same could be found at https://github.com/nhibernate/nhibernate-core/blob/3087d48640eb64f573c0011e9a9b1567afe6adde/src/NHibernate/Id/GuidCombGenerator.cs
Output
There was latency of 220ms to the Azure data center from my machine.
Option 1(Guid using NewGuid) | Option 2(Sequential Guid using SQL’s NewSequentialId) | Option 3(Sequential Guid based on UuidCreateSequential) | Option 4(Guid based on NHibernate’s GuidCombGenerator) | |
---|---|---|---|---|
Duration | 346s | 300s | 343s | 359s |
Application Memory | 22 – 23MB (No Leak) | 17 – 19MB (No Leak) | 21 – 23MB (No Leak) | 21 – 23MB (No Leak) |
Application CPU (8 Core) | < 1% | < 1% | < 2% | < 2% |
DTU Consumption (DTU / CPU / Data / Log) | 62 / 44 / 0 / 62 | 56 / 29 / 0 / 56 | 47 / 42 / 0 / 47 | 59 / 41 / 7 / 59 |
Pages (Clustered Index / Non-Clustered Index) | 14335 / 13467 | 9901 / 10527 | 9901 / 10527 | 15868 / 14994 |
Page Density % (Clustered Index / Non-Clustered Index) | 68.9 / 68.75 | 99.8 / 99.73 | 99.8 / 99.73 | 66.16 / 65.89 |
Fragmentation % (Clustered Index / Non-Clustered Index) | 99.27 / 99.03 | 1.87 / 0.76 | 1.88 / 0.77 | 91.66 / 91.49 |
If you are interested in the complete benchmark process, code I have used and my suggestion, please visit my full article posted at https://microsoftprogrammers.jebarson.com/benchmarking-index-fragmentation-with-popular-sequential-guid-algorithms/