SQL Server Replication: Synchronize From Backup

Synchronize from Backup is the first of the "No Sync" scenarios and is typically the next scenario people look at when automatic synchronization becomes impractical

Synchronize From Backup

This method cannot be done from the GUI and must be done through script. The basic process involves making a backup of the publisher database, restoring it to the subscriber, then create the subscription specifying the backup file. SQL will read the Log Sequence Number (lsn) from the header and apply all transactions that have occurred since that point.

Advantages

  • Much faster than automatic for large databases
  • Much faster than automatic for long distances
  • No blocking on the publisher database

Disadvantages

  • Restoring the database overwrites the current database on the subscriber
  • Must manually correct any incompatible features on the subscriber
  • Database permissions are replaces with the publisher permissions
  • Orphaned users

Best Used When

  • Database is very large
  • Publisher and subscriber are geographically distant over a WAN
  • Most or all of the database is published
  • Publisher schema works as a subscriber schema with little modification

Notes

  1. Must use a publisher backup: One of the caveats to sync with backup is that the backup must be from the publisher database. You cannot take a backup from another subscriber and sync with it, though there are some tricks to make it work.

  2. Database overwritten: Since the entire database is overwritten, all objects in it are replaced with whatever was on the publisher database. The usual fix for this is to script everything on the subscriber that is needed and replace them after restoring the backup. Some items include:

    1. Permissions
    2. Roles
    3. Users
    4. Objects (functions, stored procedures, etc)
    5. Indexes
  3. Orphaned SQL Users: Any SQL logins created with a different SID won't map to user in the database when restored, and you'll see errors like this:

    Error: 18456, Severity: 14, State: 38. Login failed for user 'foo'.
    Reason: Failed to open the explicitly specified database. [CLIENT: x.x.x.x] 

    The ways to fix these is to recreate the login using the SID from the publisher, or remap the user to the login using either sp_change_users_login or ALTER USER

  4. Not For Replication Option: As a general rule all objects in the publisher database that can have the 'NOT FOR REPLICATION' option should be created with it. What the option does is ignore the specific feature if a replication agent is doing the work. There are four of these:

    1. Foreign Keys: The constraint was already checked on the publisher and does not need to be rechecked on an identical subscriber.
    2. Check Constraints: Same reason as foreign key
    3. Identity Columns: The publisher will insert the identity value directly into the table to maintain the same values between both servers, so the replication agent would encounter an insert error if not set. One side effect of this is that the identity value will not update, so if inserts need to be made on the subscriber (peer-to-peer replication, for example) then update the identity seed using a range that won't conflict with the publisher using DBCC CHECKIDENT (RESEED)
    4. Triggers: Generally it's better to publish the base tables that a trigger modifies instead of letting it fire on the subscriber. Replicating both the trigger with 'for replication' and the base table will cause a conflict.
  5. Long Copy / Restore Time: With extremely large databases it can take several hours to copy it across the network and restore. The way to work around this problem is to do the initial restore with the NORECOVERY option, then create and apply a differential backup to catch up on recent transactions.

  6. Required transactions missing: When you try to create the subscription you get the message below.

Msg 21397
The transactions required for synchronizing the nosync subscription created from the specified backup are unavailable at the Distributor. Retry the operation again with a more up-to-date log, differential, or full database backup.

This is usually caused by not having a minimum retention time set on the distributor and the distribution cleanup agent removes the transactions since they have already been distributed to all existing subscribers. The solution is to disable the distribution cleanup agent before taking the backup. 

If you already see the message it's too late and the best way out is to restore the database again with NORECOVERY, then disable the distribution cleanup agent, and finally backup a differential and restore it. Use the differential to synchronize.

 
/*===============================================================================================*/
/*Add Transactional snapshot subscription */
/*Restore and synchronize using backup */
/*SQL 2005 or SQL 2008 */
/* */
/*Use Ctrl-Shift-<M> to replace template parameters */

/*=============================================================================================================================*/
--1. Run on Distributor: <Distributor, varchar, DistServer>
--Stop distribution cleanup agent
--
--You'll see an error if the job isn't running, it's normal
USE <Distribution_DB, varchar, distribution>
GO
EXEC msdb..sp_update_job @job_name = 'Distribution clean up: <Distribution_DB, varchar, distribution>', @enabled = 0;
EXEC msdb..sp_stop_job @job_name = 'Distribution clean up: <Distribution_DB, varchar, distribution>'

/*===============================================================================================*/
--2. Run on Publisher: <Publisher, varchar, PubServer>
--In the publisher DB: <PublisherDB, varchar, PubDB>
--
--Steps 1 and 2 manually delete the subscription from the publisher and subscriber.
--If you drop the subscription through the GUI, and select the option to connect to
--the subscriber and delete the record, you can skip steps 1 and 2.
--This way you don't have to bother with opening the new query window for step 2

USE <PublisherDB, varchar, PubDB>
GO

--Need to set option: 'allow Initialization from backup' on the Publication
EXEC sp_changepublication
@publication = N'<Publication, varchar, PubName>',
@property = N'allow_initialize_from_backup',
@value = 'true'
--Drop the existing subscription if present
--
IF EXISTS (
SELECT 1
FROM syspublications sp
JOIN sysarticles sa ON (sp.pubid = sa.pubid)
JOIN syssubscriptions ss ON (sa.artid = ss.artid)
WHERE sp.Name = N'<Publication, varchar, PubName>'
AND ss.srvname = N'<NewSubscriber, varchar, NewSubServer>'
AND ss.dest_db = N'<NewSubDB, varchar, PubDB>'
)
EXEC sp_dropsubscription
@publication = N'<Publication, varchar, PubName>',
@subscriber = N'<NewSubscriber, varchar, NewSubServer>',
@destination_db = N'<NewSubDB, varchar, PubDB>',
@article='all'
--Note: Full backup only in this script.
--If you want to do a diff, I recommend doing your full backup
--and restore w/ NORECOVERY before running this script.
--then add , WITH DIFFERENTIAL to the below command.
--When the full is mostly restored on the subscriber, use this script to finish
--
BACKUP DATABASE <PublisherDB, varchar, PubDB>
TO DISK = '<Publisher_Backup_Path, varchar, E:\MSSQL\Backup>\<PublisherDB, varchar, PubDB>.BAK'
WITH STATS, INIT

/*=============================================================================================================================*/
--3. Run on New Subscriber: <NewSubscriber, varchar, NewSubServer>
--Copy the file to the new server
--Run this from a command prompt, and change the copy command depending on your OS version
--Lots of options here, especially if you decided to stripe backups, etc.
--
--***Run this from a command prompt. Change the copy command depending on your OS version and what utilities you have available
ROBOCOPY "<Publisher_Remote_Backup_Path, varchar, <\\PubServer\E$\MSSQL\Backup>>" "<NewSubscriber_Backup_Path, varchar, E:\MSSQL\Backup>" "<PublisherDB, varchar, PubDB>.BAK"

--4. Run on New Subscriber: <NewSubscriber, varchar, NewSubServer>
--Restore the database
USE master
GO
IF EXISTS (SELECT 1 FROM master.sys.databases WHERE name = '<NewSubDB, varchar, PubDB>')
ALTER DATABASE <NewSubDB, varchar, PubDB>
SET OFFLINE WITH ROLLBACK IMMEDIATE
GO
RESTORE DATABASE <NewSubDB, varchar, PubDB>
FROM DISK = '<NewSubscriber_Backup_Path, varchar, E:\MSSQL\Backup>\<PublisherDB, varchar, PubDB>.BAK'
WITH STATS, REPLACE
GO

/*============================================================================*/
--5. Run on Publisher: <Publisher, varchar, PubServer>
--In the publisher DB: <PublisherDB, varchar, PubDB>
--Initialize the subscription with a backup file.
--The other thread from the subscriber should still be running to update the lsn
--As soon as the subscription is created
USE <PublisherDB, varchar, PubDB>
GO
exec sp_addsubscription
@publication = N'<Publication, varchar, PubName>',
@subscriber = N'<NewSubscriber, varchar, NewSubServer>',
@destination_db = N'<NewSubDB, varchar, PubDB>',
@subscription_type = N'Push',
@sync_type = N'initialize with backup',
@backupDeviceType = N'Disk',
@backupDeviceName = N'<NewSubscriber_Backup_Path, varchar, E:\MSSQL\Backup>\<PublisherDB, varchar, PubDB>.BAK',
@update_mode = N'read only',
@subscriber_type = 0
exec sp_addpushsubscription_agent
@publication = N'<Publication, varchar, PubName>',
@subscriber = N'<NewSubscriber, varchar, NewSubServer>',
@subscriber_db = N'<NewSubDB, varchar, PubDB>',
@job_login = null,
@job_password = null,
@subscriber_security_mode = 1,
@frequency_type = 64,
@frequency_interval = 1,
@frequency_relative_interval = 1,
@frequency_recurrence_factor = 0,
@frequency_subday = 4,
@frequency_subday_interval = 5,
@active_start_time_of_day = 0,
@active_end_time_of_day = 235959,
@active_start_date = 0,
@active_end_date = 0,
@dts_package_location = N'Distributor'

/*=============================================================================================================================*/
--6. After ALL other steps are completed, run these to enable to cleanup job:
--Start distribution cleanup agent
USE <Distribution_DB, varchar, distribution>
GO
EXEC msdb..sp_update_job @job_name = 'Distribution clean up: <Distribution_DB, varchar, distribution>', @enabled = 1;
EXEC msdb..sp_start_job @job_name = 'Distribution clean up: <Distribution_DB, varchar, distribution>'

Summary

Initialization from backup solves a lot of problems when trying to synchronize large databases but creates a few of its own. These problems can be mitigated with some planning and hopefully this template helps to simplify the process.