How to: Use a Custom Change Tracking System

Article
12/03/2008

Many applications require that changes be tracked in the server database so that these changes can be delivered to clients during a subsequent synchronization session. This topic describes the requirements for a change tracking system, and shows you how to create a custom system that Microsoft Synchronization Services for ADO.NET can use. Custom change tracking is appropriate in some cases. However, be aware that it does introduce complexity and can affect server database performance. If you are using SQL Server 2008, we recommend that you use the SQL Server change tracking feature. For more information, see How to: Use SQL Server Change Tracking.

Server Requirements for Synchronization Scenarios

Microsoft Synchronization Services for ADO.NET is designed to have minimal effect on the server database. Therefore, modifications that are required for change tracking in the server database are proportional to the level of functionality that you want in an application. Keep in mind the following considerations:

On one end of the spectrum, there is a download-only snapshot of data. This requires no changes.
On the other end, there is bidirectional synchronization that uses full change tracking and conflict detection.

The following table summarizes the ways in which you can use Synchronization Services, and identifies the corresponding requirements for the server database.

Scenario	Primary key or unique column1	Track update time	Track insert time	Track delete time	Track client ID for updates	Track client ID for inserts	Track client ID for deletes
Download a snapshot of data to the client.	No	No	No	No	No	No	No
Download incremental inserts and updates to the client.	Yes	Yes	Yes2	No	No	No	No
Download incremental inserts, updates, and deletes to the client.	Yes	Yes	Yes2	Yes	No	No	No
Upload inserts to the server.	Yes	No	No	No	No	No3	No
Upload inserts and updates to the server.	Yes	No	No	No	No3	No3	No
Upload inserts, updates, and deletes to the server.	Yes	No	No	No	No3	No3	No3
Bidirectional inserts and updates with conflict detection.	Yes	Yes	Yes2	No	Yes4	Yes4	No
Bidirectional inserts, updates, and deletes with conflict detection.	Yes	Yes	Yes2	Yes	Yes4	Yes4	Yes4

1 Primary keys must be unique across all nodes, and must not be reused: If a row is deleted, the primary key of that row should not be used for another row. Identity columns are typically not an appropriate choice for distributed environments. For more information about primary keys, see Selecting an Appropriate Primary Key for a Distributed Environment (Synchronization Services).

2 Required if you want to distinguish between inserts and updates. For more information, see "Determining Which Data Changes to Download to a Client" later in this topic.

3 Required if more than one client might change a row and you want to identify which client made the change. For more information, see "Identifying Which Client Made a Data Change" in this topic.

4 Required if you do not want to echo changes back to the client that made them. For more information, see "Identifying Which Client Made a Data Change" in this topic.

Note

In addition to the changes that are described earlier, you will likely create stored procedures for data access. Most of the examples in this documentation use inline SQL because it is easier to show what is occurring in the code. In production applications, stored procedures should be used for the following reasons: they encapsulate code, typically perform better, and can provide increased security over inline SQL if they are written correctly.

Determining Which Data Changes to Download to a Client

In download-only and bidirectional synchronization, you must track changes on the server so that Synchronization Services can determine which changes to download to clients. Although Synchronization Services does not define specifically how to support change tracking, there is a common way to approach it. For each table to be synchronized, you can take the following approach:

Add a column that tracks when a row was inserted in the server database.
Add a column, and in some cases a trigger, that tracks when a row was last updated in the server database.
Add a tombstone table and a trigger that tracks when a row was deleted from the server database. If you do not want to delete data from the server but have to send deletes to the client, logical deletes can be tracked in the base table: Use a column, typically of type bit, to indicate that a row is deleted and another column to track when the delete occurred.

These columns and tombstone tables are used together with anchors to determine which inserts, updates, and deletes to download. An anchor is just a point in time that is used to define a set of changes to synchronize. Consider the following queries:

The query that you specify for the SelectIncrementalInsertsCommand property. This query downloads incremental inserts from the Sales.Customer table in the Synchronization Services sample database, as follows:
```
SELECT CustomerId, CustomerName, SalesPerson, CustomerType FROM
Sales.Customer WHERE InsertTimestamp > @sync_last_received_anchor
AND InsertTimestamp <= @sync_new_received_anchor
```
For more information about this property and other properties that are related to synchronization commands, see How to: Specify Snapshot, Download, Upload, and Bidirectional Synchronization.
The query that you specify for the SelectNewAnchorCommand property. This query retrieves a point-in-time value. The InsertTimestamp column stores timestamp values. Therefore, the query uses the Transact-SQL MIN_ACTIVE_ROWVERSION function, introduced in SQL Server 2005 Service Pack 2, to retrieve a timestamp value from the server database, as follows:
```
SELECT @sync_new_received_anchor = MIN_ACTIVE_ROWVERSION - 1
```
MIN_ACTIVE_ROWVERSION returns the lowest active timestamp (also known as rowversion) value in the current database. A timestamp value is active if it is used in a transaction that has not yet been committed. If there are no active values in the database, MIN_ACTIVE_ROWVERSION returns the same value as @@DBTS + 1. MIN_ACTIVE_ROWVERSION is useful for scenarios such as data synchronization that use timestamp values to group sets of changes together. If an application uses @@DBTS in its anchor commands instead of MIN_ACTIVE_ROWVERSION, it can miss changes that are active when synchronization occurs.

When the Sales.Customer table is first synchronized, the following process occurs:

The new anchor command is executed. The command returns the value 0x0000000000000D49. This value is stored in the client database. The table has never been synchronized. Therefore, there is no anchor value that was stored in the client database from a previous synchronization. In this case, Synchronization Services uses the lowest value that is available for the SQL Server timestamp data type: 0x0000000000000000. The query that Synchronization Services executes is as follows. This query downloads the schema and all rows from the table.
```
exec sp_executesql N'SELECT CustomerId, CustomerName, SalesPerson,
CustomerType FROM Sales.Customer WHERE (InsertTimestamp >
@sync_last_received_anchor AND InsertTimestamp <=
@sync_new_received_anchor)',N'@sync_last_received_anchor timestamp,
@sync_new_received_anchor timestamp',
@sync_last_received_anchor=0x0000000000000000,
@sync_new_received_anchor=0x0000000000000D49
```
During the second synchronization, the new anchor command is executed. Rows have been inserted since the last synchronization. Therefore, the command returns the value 0x0000000000000D4C. The table has been synchronized before; therefore, Synchronization Services can retrieve the anchor value 0x0000000000000D49. This value is stored in the client database from the previous synchronization. The query that is executed is as follows. The query downloads only those rows from the table that were inserted between the two anchor values.
```
exec sp_executesql N'SELECT CustomerId, CustomerName, SalesPerson,
CustomerType FROM Sales.Customer WHERE (InsertTimestamp >
@sync_last_received_anchor AND InsertTimestamp <=
@sync_new_received_anchor)', N'@sync_last_received_anchor timestamp,
@sync_new_received_anchor timestamp',
@sync_last_received_anchor=0x0000000000000D49,
@sync_new_received_anchor=0x0000000000000D4C
```

For examples of update and delete commands, see How to: Download Incremental Data Changes to a Client and How to: Exchange Bidirectional Incremental Data Changes Between a Client and Server.

As noted earlier, the command that is used to retrieve anchor values depends on the data type of the tracking columns in the server database. The examples in this documentation use SQL Server timestamp, also known as rowversion. To use a SQL Server datetime column, the query for new anchor command would be similar to the following:

SELECT @sync_new_received_anchor = GETUTCDATE()

To determine which data type to use for an anchor, you should weigh the application requirements and consider how much flexibility you have to change the server database schema. If the database is in development, you could specify exactly which columns and triggers to add. If the database is in production, your options might be more limited. Consider the following guidelines:

All tables in a synchronization group should use the same data type and new anchor command. If you can, use the same data type and command for all groups.
The datetime data type is easy to understand, and tables frequently already have a column that tracks when a row was modified. However, this data type can be problematic if clients are in different time zones. If you use this data type, transactions could be missed when incremental changes are selected.
The timestamp data type is precise and does not depend on clock time. However, each table in a SQL Server database can contain only one column of this data type. Therefore, if you must distinguish between inserts and updates, you can add a column of a different data type, such as binary(8), and store timestamp values in that column. For an example, see Setup Scripts for Synchronization Services How-to Topics. The timestamp data type could be an issue if the server database is restored from a backup. For more information, see Database Objects Supported by Synchronization Services. As noted earlier, we recommend that you use MIN_ACTIVE_ROWVERSION in the command that selects a new anchor.

Identifying Which Client Made a Data Change

There are two main reasons to identify which client made a data change:

To support conflict detection and resolution in upload-only and bidirectional synchronization.
If the server and client or more than one client can change a given row, you might want to identify who made the change. This information lets you write code, for example, that prioritizes one change over another. Without this information, the last change made to the row is persisted.
To prevent echoing of changes back to the client during bidirectional synchronization.
Synchronization Services first uploads changes to the server, and then downloads changes to the client. If you do not track the identity of the client that made a change, the change will be uploaded to the server and then downloaded back to the client during the same synchronization session. In some cases, this echoing of changes is acceptable, but in other cases it is not.

As with change tracking, Synchronization Services does not specifically define how to support identity tracking; however, there is a common way to approach it. For each table to be synchronized, you can take the following approach:

Add a column to the base table that tracks who made each insert.
Add a column to the base table that tracks who made each update.
Add a column to the tombstone table that tracks who made each delete.

These columns and tables are used with the ClientId property to determine which client made each insert, update, or delete. The first time any table is synchronized by using a method other than snapshot synchronization, Synchronization Services stores a GUID value on the client that identifies that client. This ID is passed to the DbServerSyncProvider so that it can be used by the select and update queries in each SyncAdapter. The ID value is available through the ClientId property. Consider the following Transact-SQL query:

SELECT CustomerId, CustomerName, SalesPerson, CustomerType FROM
Sales.Customer WHERE InsertTimestamp > @sync_last_received_anchor AND
InsertTimestamp <= @sync_new_received_anchor AND InsertId <>
@sync_client_id

This query is similar to the earlier query that tracks inserts that are made on the server. The statement in the WHERE clause ensures that the only inserts that are downloaded are those inserts that were not made by the client that is currently synchronizing.

Synchronization Services also enables applications to identify clients by using an integer at the server instead of a GUID value. For more information, see How to: Use Session Variables.

Server Preparation Examples

The following examples show how to set up the Sales.Customer table from the Synchronization Services sample database with the tracking infrastructure to handle the most complex application scenario: bidirectional insert, update, and delete operations with conflict detection. Less complex scenarios do not require the whole infrastructure. For more information, see "Server Requirements for Synchronization Scenarios" earlier in this topic. For a complete script that creates the objects in this example and additional objects, see Setup Scripts for Synchronization Services How-to Topics. For more information about how to use these objects, see How to: Specify Snapshot, Download, Upload, and Bidirectional Synchronization.

The examples in this section perform the following steps in preparing a server:

Verify the Sales.Customer schema. Determine whether the table has a primary key and any columns that could be used for change-tracking.
Add columns to track when and where inserts and updates are made.
Create a tombstone table, and add a trigger to the Sales.Customer table to populate the tombstone table.

Verifying the Sales.Customer Schema

The following code example shows the schema of the Sales.Customer table. The table has a primary key on the CustomerId column and has no columns that could be used for change tracking.

CREATE TABLE SyncSamplesDb.Sales.Customer(
    CustomerId uniqueidentifier NOT NULL PRIMARY KEY DEFAULT NEWID(), 
    CustomerName nvarchar(100) NOT NULL,
    SalesPerson nvarchar(100) NOT NULL,
    CustomerType nvarchar(100) NOT NULL)

Adding Columns to Track Insert and Update Operations

The following code example adds four columns: UpdateTimestamp, InsertTimestamp, UpdateId, and InsertId. The UpdateTimestamp column is a SQL Server timestamp column. This column is automatically updated when the row is updated. As it has been noted, a table can have only one timestamp column. Therefore, the InsertTimestamp column is a binary(8) column that has a default of @@DBTS + 1. The example adds to the value that is returned by @@DBTS so that the UpdateTimestamp and InsertTimestamp columns have the same value after an insert is performed. If this is not done, it will look as if each row was updated after it was inserted.

The ID that Synchronization Services creates for each client is a GUID; therefore, the two ID columns are uniqueidentifier columns. The columns have a default of 00000000-0000-0000-0000-000000000000. This value indicates that the server performed the update or insert. A later example includes a DeleteId column in the tombstone table.

ALTER TABLE SyncSamplesDb.Sales.Customer 
    ADD UpdateTimestamp timestamp
ALTER TABLE SyncSamplesDb.Sales.Customer 
    ADD InsertTimestamp binary(8) DEFAULT @@DBTS + 1
ALTER TABLE SyncSamplesDb.Sales.Customer 
    ADD UpdateId uniqueidentifier NOT NULL DEFAULT '00000000-0000-0000-0000-000000000000'
ALTER TABLE SyncSamplesDb.Sales.Customer 
    ADD InsertId uniqueidentifier NOT NULL DEFAULT '00000000-0000-0000-0000-000000000000'

Now that the columns have been added, the following example code adds indexes. These and other indexes in the example code are created on columns that are queried during synchronization. The indexes are added to emphasize that you should consider indexes when you implement change-tracking in the server database. Make sure that you balance server performance against synchronization performance.

CREATE NONCLUSTERED INDEX IX_Customer_UpdateTimestamp
ON Sales.Customer(UpdateTimestamp)

CREATE NONCLUSTERED INDEX IX_Customer_InsertTimestamp
ON Sales.Customer(InsertTimestamp)

CREATE NONCLUSTERED INDEX IX_Customer_UpdateId
ON Sales.Customer(UpdateId)

CREATE NONCLUSTERED INDEX IX_Customer_InsertId
ON Sales.Customer(InsertId)

Adding a Tombstone Table to Track Delete Operations

The following code example creates a tombstone table that has a clustered index and a trigger to populate the table. When a delete operation occurs in the Sales.Customer table, the trigger inserts a row in the Sales.Customer_Tombstone table. Before the trigger performs an insert operation, the trigger checks whether the Sales.Customer_Tombstone table already contains a row that has the primary key of a deleted row. This occurs when a row has been deleted from Sales.Customer, reinserted, and deleted again. If such a row is detected in Sales.Customer_Tombstone, the trigger deletes the row and reinserts it. The DeleteTimestamp column in Sales.Customer_Tombstone could also be updated.

CREATE TABLE SyncSamplesDb.Sales.Customer_Tombstone(
    CustomerId uniqueidentifier NOT NULL PRIMARY KEY NONCLUSTERED, 
    CustomerName nvarchar(100) NOT NULL,
    SalesPerson nvarchar(100) NOT NULL,
    CustomerType nvarchar(100) NOT NULL,
    DeleteId uniqueidentifier NOT NULL DEFAULT '00000000-0000-0000-0000-000000000000',
    DeleteTimestamp timestamp)

CREATE TRIGGER Customer_DeleteTrigger 
ON SyncSamplesDb.Sales.Customer FOR DELETE 
AS 
BEGIN 
    SET NOCOUNT ON
    DELETE FROM SyncSamplesDb.Sales.Customer_Tombstone 
        WHERE CustomerId IN (SELECT CustomerId FROM deleted)
    INSERT INTO SyncSamplesDb.Sales.Customer_Tombstone (CustomerId, CustomerName, SalesPerson, CustomerType) 
    SELECT CustomerId, CustomerName, SalesPerson, CustomerType FROM deleted
    SET NOCOUNT OFF
END

CREATE CLUSTERED INDEX IX_Customer_Tombstone_DeleteTimestamp
ON Sales.Customer_Tombstone(DeleteTimestamp)

CREATE NONCLUSTERED INDEX IX_Customer_Tombstone_DeleteId
ON Sales.Customer_Tombstone(DeleteId)

Partager via