Be aware of the difference in isolation levels if porting an application from Windows Azure SQL DB to SQL Server in Windows Azure Virtual Machine

[One more post recovered from SQLCAT.com archives.]

 

Author: Sanjay Mishra, Tim Wieman

Technical Reviewers: Mike Weiner, Shep Sheppard, Prem Mehra, Chuck Heinzelman, James Podgorski, Silvano Coriani, Madhan Arumugam Ramakrishnan, Sunil Agarwal, Guy Bowerman

 

Windows Azure SQL Database and SQL Server in Windows Azure Virtual Machine are two different data storage options and are best suited for slightly different application scenarios. Refer to the published articles on MSDN for a comparison and how to choose between these options.

There are certain differences between the two storage options. One important different is the isolation level of the database. Depending upon application design, difference in isolation level can impact the application concurrency, and ultimately, application performance and throughput.

It is not a common practice to port databases from Windows Azure SQL Database to SQL Server in a Windows Azure VM environment. However, for some specific unique reasons, not covered here, one of our customers performed such a migration and the only change made was to change the connection string.

During the load test after the porting, we noticed that the performance of the application on SQL Server in Windows Azure VM did not match up to the performance of the application on Windows Azure SQL Database. Upon further investigation, we found much higher volume of lock waits under SQL Server in Windows Azure VM.

The key difference that impacted the application behavior is the isolation level of the two databases. On Windows Azure SQL Database, the isolation level settings for READ COMMITTED SNAPSHOT and SNAPSHOT ISOLATION are turned on:

select name, snapshot_isolation_state, is_read_committed_snapshot_on from sys.databases

name  snapshot_isolation_state  is_read_committed_snapshot_on
CustomerDB  1  1

 

On a SQL Server database (running on Windows Azure Virtual Machine, or on-premises), the isolation level settings for READ COMMITTED SNAPSHOT and SNAPSHOT ISOLATION are turned off, by default:

select name, snapshot_isolation_state, is_read_committed_snapshot_on from sys.databases

name  snapshot_isolation_state  is_read_committed_snapshot_on
CustomerDB  0  0

The customer application was designed and developed on Windows Azure SQL Database and it did make use of READ COMMITTED SNAPSHOT ISOLATION (RCSI).

The performance comparison of the two data storage options, under default settings:

Data Platform  Application Response Time (sec)  Application Throughput (web pages/sec)
Windows Azure SQL Database  0.83  38
SQL Server in Windows Azure VM (default)  2.94  13.9

As you can see, the application throughput for SQL Server in Windows Azure VM was almost one-third of the throughput with the database on Windows Azure SQL Database.  One of the important goals of the load testing was to ensure application response time under 2 seconds. For SQL Server in Windows Azure VM, under default settings, the application response time was more than the 2 second threshold the customer had set. Analyzing waitstats perfmon counters (counter “Lock Waits” under object => “SQL Server:Wait Statistics”, instance => “Average wait time (ms)”), we observed about 1 sec average wait time for locks.

With RCSI turned on, the average lock wait time came down to a few milliseconds, application throughput significantly improved, and the application response time significantly improved.

Data Platform  Application Response Time (sec)  Application Throughput (web pages/sec)
SQL Server in Windows Azure VM (default)  2.94 13.9
SQL Server in Windows Azure VM (with RCSI)  1.14  34.8

 

Important Note:  When turning on RCSI, be aware that SQL Server maintains the version store in the TEMPDB database, and you may observe significant activity in TEMPDB.  Therefore, appropriate capacity and throughput must be planned for TEMPDB.
For example, for this specific scenario, after enabling RCSI, the increased throughput demands of TEMPDB necessitated scaling out TEMPDB data files to multiple data disks.  Following is the TEMPDB disk I/O throughput comparison before and after RCSI:

Data Platform  Disk Read Bytes / sec  Disk Write Bytes / sec
SQL Server in Windows Azure VM (default)  403,031  1,504,410
SQL Server in Windows Azure VM (with RCSI)  26,570,531  58,219,559

 

Further Reading:

Comments

  • Anonymous
    June 04, 2014
    Nice Article explaining the fine print w.r.t Windows Azure SQL Database & the Windows Azure VM SQL DB. Thanks Sanjay!