How to deal with Scheduled Maintenance Mode and SQL Always On
UPDATE: November 2018 the 9th – For SCOM 2016 only this issue has been fixed in UR6
Hi everyone,
Bruno here. Today I will cover a topic that came to my attention several times, without being previously discussed broadly and deeply.
The topic:
Have you ever asked yourself what happen when you create a Scheduled Maintenance mode entry in your System Center 2016 - Operations Manager and above with its database hosted on SQL Server Always ON? Have you ever checked if a given Scheduled Maintenance Mode run after the Availability Group was moved over to a secondary replica? How can I make sure that it works as expected?
The issue:
When using SQL Server Always On, all the changes occurring on databases which are part of the Availability Group are replicated to secondary replica(s). This is obviously not true for system databases such as (Master, Model, MSDB and TempDB) which can't be part of any Availability Group by design and hence cannot replicate.
To better explain the scenario, let's think about the new SQL Login creation. You know for sure that, as part of the Always On configuration, you need to replicate the Logins which are stored on the Master database, from the primary replica to all secondary replicas which belong to a given Availability Group (see Logins and Jobs for Availability Group Databases).
Going back to our topic, when you create a Scheduled Maintenance Mode entry in System Center Operations Manager, the operation takes places into 2 different places: The OperationsManager database and the MSDB database. As I said before, the Operations Manager gets replicated over the secondary replicas, whilst the MSDB gets not.
Given that, as you can imagine, the Scheduled Maintenance Mode will not work anymore until you switch the Availability Group back to the original node.
The prove and repro:
For those of you which are not yet convinced, you can simple repro the issue doing the following:
In the Operations Console, create a new scheduled MM entry in SCOM if don't have any.
On the primary replica (the active instance), using Microsoft SQL Server management Studio, run the following query against the OperationsManager database to see Scheduled Maintenance mode entries and take note of (at least) one ID from the ScheduleId column
SELECT*FROM [OperationsManager].[dbo].[MaintenanceModeSchedule]
On the primary replica (the active instance), using Microsoft SQL Server management Studio, run the following query against the MSDB and take note of the result:
SELECT *FROM [msdb].dbo.sysjobs WHERE [name] ='<<Value_of_ScheduledId_field_from_previous_query>>'
On one of the secondary replicas (the passive instances), run the same as step 3. You will see no lines returned (as expected).
The solution:
The solution to this issue is simple, but because of the problem's nature, cannot be considered a one-time solution. In fact, you need to apply it each and every time you create a new Scheduled Maintenance Mode.
But let's get this done now :
- On the primary replica, open Microsoft SQL Server Management Studio and connect to it (you can use the listener name or the instance name as you prefer)
- Expand SQL Server Agent
à
Jobs - Find the job named as the ScheduledId you got from step 3 in the previous section, right-click on it and select Script Jos as à
CREATE To à
File
- Give the file a name and click Save
- Now, connect to one of the secondary replica
- Expand Databases à
System Databases and select the MSDB database - Open the file you saved in step 4 and click Execute (make sure you selected the MSDB database and that its name is appearing in the box on the left, right above Object Explorer)
- Right-click on Jobs and select Refresh. You will see the newly created job appearing.
- Repeat the above steps for any affected job on any Secondary replica
- If you want, you can failover the Availability Group over one of the secondary and run the Scheduled Maintenance Mode.
Of course, this is just a quick and dirty workaround. I am pretty sure that working with your DBA you can find a way to make the Job definition aligned between replicas, automatically
Thank you all; I really hope that this post will save your time as well as you from headaches