Why massive inserts using SubmitChanges lack in performance.

Bulk inserting using LINQ to SQL, why is it so slow?

Well, the short answer is that it is not designed to do bulk inserts.

But out of curiosity, why is it slow?

Well, let us do it by example (my preferred way). First create a table in SQL Server like so (this will basically represent a person):

create table OurLinqPerson (cid int primary key, fname nvarchar(10), lname nvarchar(10), age int)

--drop table OurLinqPerson

Then create a new C# console project in Visual Studio.

Right click the project and add a new item, “LINQ to SQL Classes”, call it Person.dbml.

Then use the Server Explorer to find the table (OurLinqPerson) and drag it to the designer surface and “Save”.

Then add the following code:

        static void Main(string[] args)

        {

            int noOfPersonsToInsert = 10000;

            long calc = 0;

            try

            {

                PersonDataContext pdc = new PersonDataContext();

                Stopwatch sw = new Stopwatch();

                sw.Start();

                for (int i = 0; i < noOfPersonsToInsert; i++)

                {

                    OurLinqPerson olp = new OurLinqPerson { cid = i, fname = "Peter", lname = "Peterson", age = 33 };

                    // Add the object to the datacontext.

                    pdc.OurLinqPersons.InsertOnSubmit(olp);

                    if (i % 500 == 0)

                    {

                        Console.WriteLine("Time (ms): Total / This batch {0, 3} / {1, 3}, Objects using InsertOnSubmit: {2}", sw.ElapsedMilliseconds.ToString(), (sw.ElapsedMilliseconds - calc).ToString(), i);

       calc = sw.ElapsedMilliseconds;

                    }

                }

                Console.WriteLine("All InsertOnSubmits done. Total elapsed time (ms): {0}. \n\nNow calling SubmitChanges...", sw.ElapsedMilliseconds.ToString());

      calc = sw.ElapsedMilliseconds;

               

                // Now submit changes to the database.

                pdc.SubmitChanges();

               

                Console.WriteLine("SubmitChanges done, time for this (ms): {0}", (sw.ElapsedMilliseconds - calc).ToString());

                sw.Stop();

                Console.WriteLine("\n\nTotal elapsed time (ms): {0}", sw.ElapsedMilliseconds.ToString());

            }

            catch (Exception ex)

            {

                Console.WriteLine(ex);

            }

        }

The code hopefully is self-explanatory. We want to insert 10000 rows into the OurLinqPerson table, we do this by creating an instance of the OurLinqPerson object.

(Basically, a row in the table in the database is an instance of a class representing a row, the columns are properties on that class).

We then add that instance to the table in the DataContext by using it as an argument to the InsertOnSubmit method.

We write out the time elapsed and the time it took for doing this for every 500 rows.

Finally we submit the changes to the database by calling SubmitChanges.

So, run the application and the output should be something like this:

Time (ms): Total / This batch 23 / 23, Objects using InsertOnSubmit: 0

Time (ms): Total / This batch 27 / 4, Objects using InsertOnSubmit: 500

Time (ms): Total / This batch 30 / 3, Objects using InsertOnSubmit: 1000

Time (ms): Total / This batch 33 / 3, Objects using InsertOnSubmit: 1500

Time (ms): Total / This batch 44 / 11, Objects using InsertOnSubmit: 2000

Time (ms): Total / This batch 47 / 3, Objects using InsertOnSubmit: 2500

Time (ms): Total / This batch 50 / 3, Objects using InsertOnSubmit: 3000

Time (ms): Total / This batch 53 / 3, Objects using InsertOnSubmit: 3500

Time (ms): Total / This batch 58 / 5, Objects using InsertOnSubmit: 4000

Time (ms): Total / This batch 61 / 3, Objects using InsertOnSubmit: 4500

Time (ms): Total / This batch 65 / 3, Objects using InsertOnSubmit: 5000

Time (ms): Total / This batch 73 / 8, Objects using InsertOnSubmit: 5500

Time (ms): Total / This batch 76 / 3, Objects using InsertOnSubmit: 6000

Time (ms): Total / This batch 80 / 4, Objects using InsertOnSubmit: 6500

Time (ms): Total / This batch 83 / 3, Objects using InsertOnSubmit: 7000

Time (ms): Total / This batch 87 / 4, Objects using InsertOnSubmit: 7500

Time (ms): Total / This batch 90 / 3, Objects using InsertOnSubmit: 8000

Time (ms): Total / This batch 94 / 4, Objects using InsertOnSubmit: 8500

Time (ms): Total / This batch 97 / 3, Objects using InsertOnSubmit: 9000

Time (ms): Total / This batch 100 / 3, Objects using InsertOnSubmit: 9500

All InsertOnSubmits done. Total elapsed time (ms): 104.

Now calling SubmitChanges...

SubmitChanges done, time for this (ms): 3396

Total elapsed time (ms): 3500

So it is pretty obvious that adding the rows/objects performs quickly, around 5 ms to insert 500 of them, however, the call to SubmitChanges takes 3396 ms out of a 3500 total.

So, back to the original questions, why is it slow?

Well, when calling SubmitChanges every object/row in the table (OurLinqPersons) has to be inspected in order to decide what action to take on that particular row.

Once the Object State has been determined for the instance it will create the appropriate SQL for that state and then call the database with this SQL.

The rows can be in several states, for example, ToBeInserted, Unchanged, Deleted etc.

In this case the row is in the ToBeInserted state and so the following SQL is created and ran for each row:

INSERT INTO [dbo].[OurLinqPerson]([cid], [fname], [lname], [age]) VALUES (@p0, @p1, @p2, @p3)

So in short, the reason it takes time is because every row/object has to be inspected in order to decide what action to take on it and then that action has to be executed.

It doesn’t matter if the row/object is unchanged; it still has to be inspected.

If you want to insert a large number of row, pure ADO.Net (for example SqlBulkCopy is a better option).

"Table<TEntity>.InsertOnSubmit Method"

https://msdn.microsoft.com/en-us/library/bb763516.aspx

"DataContext.SubmitChanges Method "

https://msdn.microsoft.com/en-us/library/system.data.linq.datacontext.submitchanges.aspx

"Object States and Change-Tracking (LINQ to SQL)"

https://msdn.microsoft.com/en-us/library/bb386982.aspx

"Insert, Update, and Delete Operations (LINQ to SQL)"

https://msdn.microsoft.com/en-us/library/bb386931.aspx

Comments

  • Anonymous
    November 14, 2010
    Why is the inspection for every row/object so expensive? Is a call to the SQL server made?  It would be nice to be able to turn this inspection off when necessary.

  • Anonymous
    April 22, 2011
    I short - just use transaction and performance will skyrocket.