Preliminary findings on getting good performance out of SqlBulkCopy
We are developing a C# application that needs to load data into a local DB quickly, and found that the initial SqlBulkCopy implementation had poor performance (only 14 MB/s). Here are my findings from trying to improve it.
Test Baseline:
- Implemented a mock data reader. Its pure read throughput is several GB/s, so reading will not be the bottleneck for DB insertion.
- The schema is 4 string columns, and each field holds a GUID.
- The data size is 1.6 GB, around 10 million rows.
- Use SqlBulkCopy.WriteToServerAsync(reader, token) to insert data, with BatchSize hard-coded to 1000 (a sketch follows this list).
- The throughput is 14 MB/s.
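Below is a minimal sketch of this baseline (not our production code). The MockGuidReader class, the connection string, the BulkTest database, and the dbo.GuidRows table are illustrative assumptions. The reader pre-generates one row of GUID strings and replays it, so reading stays far faster than the insert; SqlBulkCopy maps columns by ordinal here, so the remaining IDataReader members are only stubbed out.

```csharp
using System;
using System.Data;
using System.Data.SqlClient;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical mock data reader: 4 string columns, each holding a GUID.
// One row is pre-generated and replayed so the reader itself stays in the GB/s range.
sealed class MockGuidReader : IDataReader
{
    private readonly long _totalRows;
    private long _rowsReturned;
    private readonly string[] _row =
        { Guid.NewGuid().ToString(), Guid.NewGuid().ToString(),
          Guid.NewGuid().ToString(), Guid.NewGuid().ToString() };

    public MockGuidReader(long totalRows) => _totalRows = totalRows;

    public int FieldCount => 4;
    public bool Read() => _rowsReturned++ < _totalRows;
    public object GetValue(int i) => _row[i];
    public int GetValues(object[] values) { _row.CopyTo(values, 0); return 4; }
    public string GetName(int i) => "Col" + i;
    public int GetOrdinal(string name) => int.Parse(name.Substring(3));
    public Type GetFieldType(int i) => typeof(string);
    public string GetString(int i) => _row[i];
    public bool IsDBNull(int i) => false;

    // The members below are not expected to be hit when SqlBulkCopy maps columns by ordinal.
    public void Dispose() { }
    public void Close() { }
    public bool IsClosed => false;
    public int Depth => 0;
    public int RecordsAffected => -1;
    public bool NextResult() => false;
    public DataTable GetSchemaTable() => throw new NotSupportedException();
    public object this[int i] => GetValue(i);
    public object this[string name] => GetValue(GetOrdinal(name));
    public string GetDataTypeName(int i) => "nvarchar";
    public bool GetBoolean(int i) => throw new NotSupportedException();
    public byte GetByte(int i) => throw new NotSupportedException();
    public long GetBytes(int i, long fo, byte[] buf, int off, int len) => throw new NotSupportedException();
    public char GetChar(int i) => throw new NotSupportedException();
    public long GetChars(int i, long fo, char[] buf, int off, int len) => throw new NotSupportedException();
    public IDataReader GetData(int i) => throw new NotSupportedException();
    public DateTime GetDateTime(int i) => throw new NotSupportedException();
    public decimal GetDecimal(int i) => throw new NotSupportedException();
    public double GetDouble(int i) => throw new NotSupportedException();
    public float GetFloat(int i) => throw new NotSupportedException();
    public Guid GetGuid(int i) => throw new NotSupportedException();
    public short GetInt16(int i) => throw new NotSupportedException();
    public int GetInt32(int i) => throw new NotSupportedException();
    public long GetInt64(int i) => throw new NotSupportedException();
}

static class BaselineLoad
{
    // Hypothetical connection string and table name; replace with your own.
    internal const string ConnectionString = "Server=.;Database=BulkTest;Integrated Security=true";

    public static async Task RunAsync(CancellationToken token)
    {
        using (var reader = new MockGuidReader(10_000_000))
        using (var bulk = new SqlBulkCopy(ConnectionString))
        {
            bulk.DestinationTableName = "dbo.GuidRows";
            bulk.BatchSize = 1000;        // hard-coded batch size from the baseline
            bulk.BulkCopyTimeout = 0;     // no timeout for a long-running load
            await bulk.WriteToServerAsync(reader, token);
        }
    }
}
```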
Improvement 1:
- Use the synchronous SqlBulkCopy API (WriteToServer) instead of the async one (a sketch follows this list).
- The improved throughput is 30 MB/s.
- The sync API performs better than the async one, which matches my earlier finding with SqlDataReader.
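A sketch of the same load with the synchronous call; it reuses the hypothetical MockGuidReader, ConnectionString, and table from the baseline sketch and can sit alongside RunAsync in the same class.

```csharp
// Same setup as the baseline sketch, but with the synchronous WriteToServer call.
public static void RunSync()
{
    using (var reader = new MockGuidReader(10_000_000))
    using (var bulk = new SqlBulkCopy(BaselineLoad.ConnectionString))
    {
        bulk.DestinationTableName = "dbo.GuidRows";
        bulk.BatchSize = 1000;
        bulk.BulkCopyTimeout = 0;
        bulk.WriteToServer(reader);   // synchronous API
    }
}
```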
Improvement 2:
- Use adaptive BatchSize logic based on row size instead of a hard-coded value (a sketch follows this list).
- The rule is BatchSize = (8 MB) / (row size), where the row size can be estimated in our application.
- The further improved throughput is 42 MB/s.
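A sketch of the adaptive rule. The 8 MB target comes from the list above; the 288-byte row estimate in the usage comment is an assumption (4 GUID strings of 36 characters stored as nvarchar).

```csharp
// Adaptive batch size: aim for roughly 8 MB of data per batch.
static int ComputeBatchSize(int estimatedRowSizeBytes)
{
    const int targetBatchBytes = 8 * 1024 * 1024;                        // 8 MB per batch
    return Math.Max(1, targetBatchBytes / Math.Max(1, estimatedRowSizeBytes));
}

// Usage in our scenario (estimated row size ~288 bytes):
//   bulk.BatchSize = ComputeBatchSize(288);   // roughly 29,000 rows per batch
```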
Improvement 3:
- Set the target database to the bulk-logged recovery model, and use a table lock for SqlBulkCopy (a sketch follows this list). Refer to this article: https://msdn.microsoft.com/en-us/library/ms190422%28v=sql.105%29.aspx
- The further improved throughput is 60 MB/s.
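A sketch of the final configuration, assuming the target database is named BulkTest as in the earlier sketches. The recovery-model change is a one-time T-SQL setting (shown in the comment); SqlBulkCopyOptions.TableLock requests a bulk update lock on the destination table, which together with the bulk-logged recovery model can allow minimally logged inserts.

```csharp
// One-time database setting (run as T-SQL, e.g. from SSMS):
//   ALTER DATABASE BulkTest SET RECOVERY BULK_LOGGED;
public static void RunWithTableLock(int estimatedRowSizeBytes)
{
    using (var reader = new MockGuidReader(10_000_000))
    using (var bulk = new SqlBulkCopy(BaselineLoad.ConnectionString, SqlBulkCopyOptions.TableLock))
    {
        bulk.DestinationTableName = "dbo.GuidRows";
        bulk.BatchSize = ComputeBatchSize(estimatedRowSizeBytes);  // adaptive batch size from Improvement 2
        bulk.BulkCopyTimeout = 0;
        bulk.WriteToServer(reader);                                // sync API from Improvement 1
    }
}
```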
Row Size and Performance:
- Ran some tests with different schemas. The throughput can reach 60 MB/s when the row size is larger than 100 bytes.
- For very small rows (e.g. only 4 int values), the throughput drops to around 35 MB/s.