SQL Azure Data Sync - Synchronization Loops
A synchronization loop results from an overlapping of databases in two or more sync groups such that a change in a database in one sync group is re-written to the same database by another sync group. Synchronization loops are self perpetuating and can result in large amounts of data repeatedly overwriting identical data in two of more databases. This can significantly degrade performance and increase costs as you pay for unnecessary data movement.
Important! This wiki topic may be obsolete. This wiki topic is no longer updated by Microsoft. We moved the content to the MSDN Library where we keep it current. To find the latest version of this topic on MSDN click here. |
This article discusses the causes of synchronization loops and how to prevent them.
Sync Group Design
When you design your synchronization topologies it is perfectly fine to have a single database in multiple sync groups. Figure 1 is an illustration of this. Even though database A is shared by Sync Group 1 and Sync Group 2 there is no feedback loop that perpetuates the same data being written and re-written to the same database. It is also possible to share more than one database between two sync groups and not have a loop. We will discuss those configurations later in this topic.
How Figure 1 works
Let's walk through the sequence of events that take place when a change is made to any database in Figure 1. We'll assume the change is made to database C here.
Row 1 in database C is changed.
In database C's metadata a record is created indicating that row 1 was changed and where the change originated.
Sync Group 1 synchronizes.
- It notes that row 1 in database C is changed by somone other than SG1.
- Row 1 from Datbase C is written to SG1 Hub.
- A record is created in SG1's metadata indicating that row 1 was changed by SG1.
- Row 1 from SG1 is written to database A.
- A record is created in database A's metadata indicating that row 1 was changed by SG1.
Sync Group 1 is finished synchronizing.
Sync Group 2 Synchronizes.
- It notes that row 1 in database A is changed by somone other than SG2.
- Row 1 from Datbase A is written to SG2 Hub.
- A record is created in SG2's metadata indicating that row 1 was changed by SG2.
- Row 1 from SG2 is written to database D.
- A record is created in database D's metadata indicating that row 1 was changed by SG2.
Sync Group 2 is finished synchronizing.
If no new changes are made to any of the databases then the next time they sync no data is written to any of the databases.
Figure 1: Two sync groups sharing a database. No synchronization loop
How Figure 2 works (or doesn't work)
Let's walk through the sequence of events that take place when a change is made to any database in Figure 2. Again we'll assume the change is made to database C.
Row 1 in database C is changed.
In database C's metadata a record is created indicating that row 1 was changed and where the change originated.
Sync Group 1 synchronizes.
- It notes that row 1 in database C is changed by somone other than SG1.
- Row 1 from Datbase C is written to SG1 Hub.
- A record is created in SG1's metadata indicating that row 1 was changed by SG1.
- Row 1 from SG1 is written to databases A and B.
- A record is created in database A and B's metadata indicating that row 1 was changed by SG1.
Sync Group 1 is finished synchronizing.
Sync Group 2 Synchronizes.
- It notes that row 1 in database A (or B) is changed by somone other than SG2.
- Row 1 from Datbase A is written to SG2 Hub.
- A record is created in SG2's metadata indicating that row 1 was changed by SG2.
- Row 1 from SG2 is written to databases B (or A) and D.
- A record is created in database B (or A) and D's metadata indicating that row 1 was changed by SG2.
Sync Group 2 is finished synchronizing.
If no new changes are made to any of the databases the next time that SG1 synchronizes it notes a change in database A (or B) that it didn't make (it was made by SG2 in the last cycle). Even though the data is no different from what is in all the databases in both sync groups it is treated as changed data and the above synchronization cycle is run again, and again, and again without end.
It is worth noting that while our synchronization loop example involves only a single row, what is true for one row is true for any number of rows. Imagine a million row database where 1% of the unchanged rows are changed in each synchronization cycle. After one cycle you are infinitely looping 10,000 rows. After two cycles you are looping that 10,000 plus an addiitonal 9,900 rows for a total of 19,900 rows. And the number of rows in the loop continues to increase, never decrease.
Figure 2: Two sync groups with a synchronization loop
How Figure 3 works (big loops are still loops)
Figure 3 illustrates a more complex situation where the loop may not be as easy to identify as in Figure 2.
In Figure 3 there are three sync groups. Each sync group has a hub and two other databases. Each database is shared by two sync groups but no pair of sync groups share the same two databases.
- SG1: A and B
- SG2: B and C
- SG3: C and A
Let's walk through the sequence of events that take place when a change is made to any database in Figure 3. We'll assume the change is made to database A.
Row 1 in database A is changed.
In database A's metadata a record is created indicating that row 1 was changed and where the change originated.
Sync Group 1 synchronizes.
- It notes that row 1 in database A is changed by somone other than SG1.
- Row 1 from Datbase A is written to SG1 Hub.
- A record is created in SG1's metadata indicating that row 1 was changed by SG1.
- Row 1 from SG1 is written to database B.
- A record is created in database B's metadata indicating that row 1 was changed by SG1.
Sync Group 1 is finished synchronizing.
Sync Group 2 Synchronizes.
- It notes that row 1 in database B is changed by somone other than SG2.
- Row 1 from Datbase B is written to SG2 Hub.
- A record is created in SG2's metadata indicating that row 1 was changed by SG2.
- Row 1 from SG2 is written to database C.
- A record is created in database C's metadata indicating that row 1 was changed by SG2.
Sync Group 2 is finished synchronizing.
Sync Group 3 Synchronizes.
- It notes that row 1 in database C is changed by somone other than SG3.
- Row 1 from Datbase C is written to SG3 Hub.
- A record is created in SG3's metadata indicating that row 1 was changed by SG3.
- Row 1 from SG3 is written to database A.
- A record is created in database A's metadata indicating that row 1 was changed by SG3.
Sync Group 3 is finished synchronizing.
If no new changes are made in any of the databases the next time SG1 synchronizes it notes that row 1 in database A was changed by someone other than SG1. This is treated by SG1 as a completely different change even though the data is identical to that in all the databases and hubs. Thus the entire synchronization cycle is repeated infinitely.
Figure 3: Three sync groups with a synchronization loop
How to avoid synchronization loops
The key to avoiding synchronization loops is to avoid having any path whereby a data change in one sync group is ultimately written back to the originating database by another sync group. There are three ways to accomplish this.
Design your synchronization system so the loops cannot exist.
For example:
In Figure 2 you could remove either database A or B from the shared area.
In Figure 3 you could remove any database from any sync group and you would break the loop.Use row filtering.
If you create mutually exclusive filters when you configure each sync group then no two sync groups will synchronize the same data.
For example:
In Figure 2 you could add the row filter Region = "NA" to one sync group and Region = "EU" to the other.Use synchronization direction
For example:
If you set the sync directions as illustrated in Figure 4 you no longer have a loop.
Feedback
This release was provided in order to gather feedback from our customers. Now that you have previewed what the SQL Azure Data Sync team is doing, please let us know what you think of our direction, and tell us about your experiences.
- Send us a note on http://twitter.com/syncfx.
- Post a comment to our blog at http://blogs.msdn.com/sync.
- Post a thread to our forum at http://social.msdn.microsoft.com/forums/en-US/ssdsgetstarted/threads/
See Also
- [[SQL Azure Data Sync Overview]]
- [[SQL Azure Overview]]