แชร์ผ่าน


About the transaction-level snapshot "start" time (for statement reconciliation)

A few days ago I received an email from my good friend Kimberly telling me about an interesting discussion she was having with the attendees of one week of training sessions herself and other members from the SQL Skills Superb Gang were delivering in the UK to a group of SQL Server professionals who are seeking to attain the MCM certification and extend our ever growing family.

This is what Kimberly described:

With transaction-level snapshot isolation (allow_snapshot_isolation) the point in time to which the statements of the transaction reconcile is NOT the BEGIN TRANSACTION statement but instead the first statement that performs data access.

Kimberly already documented that behavior back in 2006, inside the technical paper she wrote for the product group called “SQL Server 2005 Row Versioning-Based Transaction Isolation” which is available at http://msdn.microsoft.com/en-us/library/ms345124(v=sql.90).aspx.

In that paper she commented it as follows:

SET TRANSACTION ISOLATION LEVEL SNAPSHOT;

BEGIN TRAN;

SELECT getdate(); -- (T1) transaction has not "officially begun"

SELECT * FROM <tablename>; -- (T2) transaction t has begun

SELECT... -- will see all committed changes as of (t)

SELECT... -- will see all committed changes as of (t)

COMMIT TRAN;

With all that in mind, her question was this:

Can we determine that point in time (with an actual date/time value) that the statements reconcile? For example, what if I want to write something like – these reports are accurate as of XYZ time. I can certainly do a SYSDATETIME() right before the first data access statement; however, that’s not precise. I could grab the transaction starttime but that’s not accurate and even worse it’s tickcount based. I could use sysdatetime() after the begin and immediately before the first statement that does data access and I’m certainly closer. But, there are still [potentially] other statements that might be after that time and before our transaction. Correct, again, but I’m just looking for precision.

While trying to answer the question herself, she built a repro and attempted to retrieve that information from every single corner she thought it could be stored. Following is her repro accompanied with annotations that briefly reveal her thoughts and conclusions throughout the journey.

So, here’s my repro:

Database Credit

ALLOW_SNAPSHOT_ISOLATION ON

Session 1

SET TRANSACTION ISOLATION LEVEL SNAPSHOT

BEGIN TRAN;

SELECT SYSDATETIME(); -- (T1) transaction has not officially begun

WAITFOR DELAY '00:00:10.000' -- (T1) transaction has not officially begun

-- WHILE this is waiting I modify some of the rows in member from another session:

UPDATE member

    SET firstname = convert(varchar(15), GETDATE(), 114)

        , lastname = 'transaction'

WHERE member_no < 100

    AND member_no %2 = 0

SELECT * FROM member where member_no between 1 and 100;

-- Of course, this transaction sees the changes (because neither the sysdatetime or the waitfor delay “started” the transaction but now THIS is the start for reconciliation). The problem though is that it’s a select and therefore the transaction is not active (no rows in fn_dblog()) so I can’t get an “active” starting point. Ultimately, what I want to be able to do is return the time to which the reporting transaction reconciles (with precision). I’m just not sure it’s possible!

I’ve checked:

select * from sys.dm_tran_active_snapshot_database_transactions

                (this gives me my transaction id)

select * from sys.dm_tran_active_transactions

                (this gives me my BEGIN TRAN time… in timeticks so it’s not ideal anyway)

select * from sys.dm_tran_active_database_transactions

                (this gives me nothing because this isn’t an active transaction)

SELECT * FROM fn_dblog (NULL, NULL) where SPID = @@SPID

                (this transaction doesn’t even have any log rows…)

To be honest, I was not familiarized at all with the internals of this part of the product. In fact, I’ve rarely used some of the DMVs Kimberly mentioned in her repro. It required me to spend some time peeping into that part of the source code of the product to try to get an understanding of the design and implementation of how SQL manages all this row versioning stuff.

Kimberly explicitly asked me to not spend a single minute working on this if nothing came to my mind immediately. That was very kind of her, and I appreciate it. She probably did so because she knows me pretty well already, and knows that I can hardly sleep if I’m not able to understand something to the point I’m able to explain it. Smile

Unfortunately, that was the case (i.e. nothing came to mind instantly Smile). So, it took me a while to understand the whole thing and here’s what I’ve found:

Every user transaction (which is represented internally as an object) has another object attached to it (it’s the XVB or transaction version information block) which is used to track the “level of dependency” of this transaction with the row version manager.

What is internally referred as a timestamp has no correlation at all to the clock time. It consists of a 64bit unsigned integer starting at one, which is global to the instance of SQL Server and managed by a singleton class (XTSMgr) representing the Transaction Timestamp Manager. Therefore, this number we refer to as timestamp is of minimum value to calculate the time Kimberly is interested in.

As a side note, let me tell you a bit about the XTSMgr. The Transaction Timestamp Manager is responsible for giving out monotonically increasing timestamp for transactions. The timestamps are used to establish ordering of write and read transactions for versioning. Timestamp is reset to 1 after server crashes or shutdowns. First timestamp value is 1. Rough estimation is that with an 8-byte timestamp, it would be sufficient for 584.942 years if transaction rate would be 1 million per second. In the practically impossible event that server would wrap the timestamp around, the server would shut down.

However, every time you run,  in the context of a transaction , a statement which needs the commitment from version store manager to keep around the version of rows stored in whatever objects that statement references until the transaction finishes, the following method (XVB::Enqueue) is invoked. The first time that happens in any given transaction, XVB has no timestamp (i.e. HasTimeStamp() returns FALSE), so m_startTickCount is populated with current tick count. That is the precise time Kimberly was asking for.

XVB::Enqueue ()

{

     if (HasTimeStamp ())

     {

           return; <-- If this XVB already has a timestamp stored in it, it just returns. So, for example, m_startTickCount won’t be overwritten with a more recent value. The time when the first statement in this transaction caused the transaction to be “controlled” by the transaction version manager is the time that remains in m_startTickCount until the transaction completes (commits/rollbacks).

     }

     m_startTickCount = GetTickCount ();

.

.

.

}

Now, the only way we can retrieve the value stored in m_startTickCount is using the elapsed_time_seconds column exposed by sys.dm_tran_active_snapshot_database_transactions. Unfortunately though, its precision is truncated to the seconds.

The following query, run in the scope of a snapshot transaction, would retrieve that value for you:

select dateadd (s, -elapsed_time_seconds, getdate ()) as tx_snapshot_time

from sys.dm_tran_active_snapshot_database_transactions ast join sys.dm_tran_current_transaction ct

on ast.transaction_sequence_num = ct. transaction_sequence_num

If you need further precision than the second, that’s not possible to obtain without peeping into SQL’s VAS (i.e. live debugging or memory dump analysis).

And the reason why precision is not so important in that value is because it is not used for reconciliation. Only the monotonically increasing discrete number (what I referred before as the timestamp) is what is used for reconciliation.

The clock time value is only exposed via the elapsed_time_seconds column of the DMV or the “Longest Transaction Running Time” performance counter of the “Transactions” performance object. Expressing the elapsed time in seconds, was considered an accurate enough unit so that the user has a very approximate idea of the amount of time each transaction has been running for.