Querying a Database Snapshot
Authors: Sanjay Mishra, Michael Thomassy, Peter Byrne
Database Snapshots (https://msdn2.microsoft.com/en-us/library/ms175158.aspx) in SQL Server 2005 Enterprise Edition allow a read-only point-in-time view of data. Recently we encountered an interesting scenario with a customer implementation. The customer creates a database snapshot. Then a query is executed on the database snapshot. The first time a set of queries are executed, they take 17 minutes. Every subsequent execution of the same queries takes only 20 seconds. Obviously, the first execution required reading the data from the disk (physical reads) and takes a longer time, whereas, the second and subsequent execution of the query finds the data in the buffer pool (logical reads), and therefore takes less amount of time. There is nothing unusual here.
What is interesting is that right before querying the database snapshot, the source database was queried for exactly the same data, so essentially, the source database pages required to satisfy the query are already in the buffer pool, by the time we execute the first query on the database snapshot. So, the customer expected that, since the query against the database snapshot actually reads data from the source database (in this case, since the data has not changed after the database snapshot was created), it should have used the pages in the buffer pool, instead of doing the physical reads. But it didn’t.
The reason lies in the way the queries work on a database snapshot. In this blog we will discuss some of the internals that will throw some light on this. This blog assumes an understanding of database snapshots.
Creating the Database Snapshot
We will analyze the behavior by creating a database snapshot on the AdventureWorks database:
CREATE DATABASE [AdventureWorksSnapshot] ON
(NAME=[AdventureWorks_Data],
FILENAME=N'C:AdventureWorks_DataSnapshot.mdf')
AS SNAPSHOT OF [AdventureWorks];
So, for the example discussed here, AdventureWorks is the “source database” and AdventureWorksSnapshot is the “database snapshot”. Note that the database snapshot has its own data file (AdventureWorks_DataSnapshot.mdf), which is a Windows sparse file (refer to Books Online for more details).
For our test, no changes have been done to the source database after the database snapshot was created. Therefore, the sparse file of the database snapshot is empty. At this time, if you run a query on the database snapshot, SQL Server will read the data from the data file of the source database.
Querying the Source Database
Now, let’s execute a query on the source database. But before we do that we need to clear the buffer cache by issuing dbcc dropcleanbuffers.
dbcc dropcleanbuffers
Let’s also turn on the io and time statistics.
set statistics io on
set statistics time on
Now, let’s execute the following query one time:
select * from AdventureWorks.Sales.SalesOrderDetail d
inner join AdventureWorks.Sales.SalesOrderHeader h
on d.SalesOrderID=h.SalesOrderID
where customerid=676;
The io and time statistics gives us the following output:
SQL Server parse and compile time:
CPU time = 62 ms, elapsed time = 364 ms.
(359 row(s) affected)
Table 'SalesOrderDetail'. Scan count 12, logical reads 41, physical reads 18, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'SalesOrderHeader'. Scan count 1, logical reads 38, physical reads 15, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 16 ms, elapsed time = 813 ms.
The above results indicate that physical reads were performed on the database files to retrieve the result of the query. Now let’s execute the same query one more time:
select * from AdventureWorks.Sales.SalesOrderDetail d
inner join AdventureWorks.Sales.SalesOrderHeader h
on d.SalesOrderID=h.SalesOrderID
where customerid=676;
The io and time statistics gives us the following output:
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 1 ms.
(359 row(s) affected)
Table 'SalesOrderDetail'. Scan count 12, logical reads 41, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'SalesOrderHeader'. Scan count 1, logical reads 38, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 207 ms.
The second execution of the query didn’t need any physical reads because after the first execution, the required pages are already cached in the buffer pool.
Querying the Database Snapshot
Now, let’s execute the same query on the database snapshot AdventureWorksSnapshot, and observe the io and time statistics.
select * from AdventureWorksSnapshot.Sales.SalesOrderDetail d
inner join AdventureWorksSnapshot.Sales.SalesOrderHeader h
on d.SalesOrderID=h.SalesOrderID
where customerid=676;
The io and time statistics gives us the following output:
SQL Server parse and compile time:
CPU time = 31 ms, elapsed time = 398 ms.
(359 row(s) affected)
Table 'SalesOrderDetail'. Scan count 12, logical reads 41, physical reads 22, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'SalesOrderHeader'. Scan count 1, logical reads 38, physical reads 16, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 15 ms, elapsed time = 300 ms.
Note that the first query on the database snapshot required physical reads. Since no data in the source database was changed after the database snapshot was created, the sparse file is empty, and therefore, the data is retrieved from the source database. And, since, the required pages of the source database are already cached in the buffer pool, the customer expected that there is no need to perform any physical reads. But there were physical reads, why?
The Mystery Resolved
A peek into the internals of how database snapshots work will reveal the mystery.
When a database snapshot is created, an in-memory bitmap is maintained to indicate whether a page has been changed in the source database or not. If the page is changed, then the original page has been copied to the database snapshot’s sparse file. If the page is not changed, the original page is still there in the source database. While executing a query, this bitmap is referenced to find out whether a particular page is to be retrieved from the database snapshot’s sparse file or from the source database’s data file.
The database snapshot is just another database from the SQL Server engine point of view. Each database maintains its own cache in the buffer pool. Data pages cached in the buffer pool for one database cannot be copied or transferred to the cache managed by another database (or database snapshot) in the same SQL Server instance. Therefore, when we execute a query the first time on the database snapshot, the database snapshot has no pages in the buffer pool, and therefore decides to read from disk. Even though the source database has already read these pages into the buffer pool it still performs physical reads against the data files of the source database, and places the pages in the buffer pool associated with the database snapshot.
The number of data pages for each database read into the buffer pool is found by querying the sys.dm_os_buffer_descriptors.
select db_name(database_id) [DB Name], database_id, count(*) [cached pages count] from sys.dm_os_buffer_descriptors
where database_id=db_id('AdventureWorks')
or database_id=db_id('AdventureWorksSnapshot')
group by database_id
set statistics io
Comments
- Anonymous
June 07, 2007
The comment has been removed - Anonymous
June 08, 2007
I can imagine how DB007's suggestions would add complexity to the logic to such a level that bugs are introduced. However, I think that there's an easier,cleaner and safer way to optimize this kind of situation.
- Query executed in snapshot DB needs a page, so a logical I/O is initiated.
- Page is not in cache for snapshot DB, so a request for physical I/O is mage.
- Now, the new logic kicks in. The request for physical I/O is intercepted and a check is made to see if the same page happens to be in the cache of some other DB.
- If that is the case (and the page in cache is not dirty), the physical I/O is carried out by copying the 8K bytes of that page from Db1's cache to Db2's cache. Otherwise, the physical I/O is performed as it's currently done.