What is new: OpsMgr 2007 R2 - How to reset monitor state with recovery?
Cameron had a nice real-life example of using the new R2 process monitoring feature, but it raised a question about a feature he wanted to use. The following is a report of his issue and of how he can address it using an existing feature of OpsMgr 2007.
Scenario: Monitoring a system with a process monitor. Define a recovery to reboot the system if it is not running the required process. Run this recovery automatically on critical state.
Problem: In OpsMgr prior to R2, a recovery had a “Reset monitor” option, which would put the monitor into a healthy state. In R2, this option now says “Recalculate State Monitor”. This presents a challenge, as described below:
Challenge: Recalculating the state may keep the monitor in a critical state until the system has been rebooted successfully and is in fact running the process. If the process does not start correctly after reboot, it gets stuck in the critical state and the recovery will not run again. With a Reset of this monitor to a Healthy state, this would work properly, but without that option available I am not seeing an effective way to make this work.
Workaround: A recovery is no different from other workflows loaded by OpsMgr and is rather similar to a task. It consists of modules that are chained together and should perform some corrective action so that the monitor can fix its state. For that reason, the first module in the chain can be a module that resets the state of the monitor.
The following module can be used directly in a recovery. It resets the state of the monitor specified in its configuration.
<WriteActionModuleType ID="Microsoft.SystemCenter.Community.Health.ResetTargetStateAction" Accessibility="Public" Batching="false">
  <Configuration>
    <xsd:element minOccurs="1" name="MonitorId" type="xsd:string" />
  </Configuration>
  <OverrideableParameters>
    <OverrideableParameter ID="MonitorId" Selector="$Config/MonitorId$" ParameterType="string" />
  </OverrideableParameters>
  <ModuleImplementation Isolation="Any">
    <Composite>
      <MemberModules>
        <WriteAction ID="Health.ResetStateAction" TypeID="Microsoft.SystemCenter.Community.Health.ResetStateAction">
          <ManagementGroupId>$Target/ManagementGroup/Id$</ManagementGroupId>
          <ManagedEntityId>$Target/Id$</ManagedEntityId>
          <MonitorId>$Config/MonitorId$</MonitorId>
        </WriteAction>
      </MemberModules>
      <Composition>
        <Node ID="Health.ResetStateAction" />
      </Composition>
    </Composite>
  </ModuleImplementation>
  <OutputType>System!System.BaseData</OutputType>
  <InputType>System!System.BaseData</InputType>
</WriteActionModuleType>
The next module can be used as well. It resets the state of the monitor first and then executes a command. In the Composition, the nested Reset node runs first and passes its output to the enclosing Command node.
<WriteActionModuleType ID="Microsoft.SystemCenter.Community.Health.ResetTargetStateCommandExecuterAction" Accessibility="Public" Batching="false">
  <Configuration>
    <IncludeSchemaTypes>
      <SchemaType>System!System.CommandExecuterSchema</SchemaType>
    </IncludeSchemaTypes>
    <xsd:element minOccurs="1" name="ApplicationName" type="xsd:string" />
    <xsd:element minOccurs="1" name="WorkingDirectory" type="xsd:string" />
    <xsd:element minOccurs="1" name="CommandLine" type="xsd:string" />
    <xsd:element minOccurs="1" name="TimeoutSeconds" type="xsd:integer" />
    <xsd:element minOccurs="1" name="RequireOutput" type="xsd:boolean" />
    <xsd:element minOccurs="1" name="MonitorId" type="xsd:string" />
  </Configuration>
  <ModuleImplementation Isolation="Any">
    <Composite>
      <MemberModules>
        <WriteAction ID="Command" TypeID="System!System.CommandExecuter">
          <ApplicationName>$Config/ApplicationName$</ApplicationName>
          <WorkingDirectory>$Config/WorkingDirectory$</WorkingDirectory>
          <CommandLine>$Config/CommandLine$</CommandLine>
          <TimeoutSeconds>$Config/TimeoutSeconds$</TimeoutSeconds>
          <RequireOutput>$Config/RequireOutput$</RequireOutput>
        </WriteAction>
        <WriteAction ID="Reset" TypeID="Microsoft.SystemCenter.Community.Health.ResetTargetStateAction">
          <MonitorId>$Config/MonitorId$</MonitorId>
        </WriteAction>
      </MemberModules>
      <Composition>
        <Node ID="Command">
          <Node ID="Reset" />
        </Node>
      </Composition>
    </Composite>
  </ModuleImplementation>
  <OutputType>System!System.BaseData</OutputType>
  <InputType>System!System.BaseData</InputType>
</WriteActionModuleType>
A sealed MP containing both modules is attached to this post.
Sample: Also attached is an example that uses the modules with a simple event-based monitor. The monitor targets the “Root Management Server” instance, which is why the management pack also defines a state view for this entity. When you open “Health Explorer” for it, you should easily be able to locate the sample monitor.
One of the recoveries in the attached MP runs automatically on the WARNING state. The MonitorId value is an MPElement replacement that identifies the monitor you want to reset (it should be the same as the value of the Monitor attribute). Also note that when only the reset module is used, its output is displayed on the “Context” tab, and the two state changes appear to have the “same” time of change in Health Explorer.
<Recovery ID="Microsoft.SystemCenter.Community.Monitors.RecoverySample.StateWarningResetRecovery" Accessibility="Internal" Enabled="onStandardMonitoring" Target="SC!Microsoft.SystemCenter.RootManagementServer" Monitor="Microsoft.SystemCenter.Community.Monitors.RecoverySample.EventBasedMonitor" RecalculateMonitor="false" ExecuteOnState="Warning" Remotable="true" Timeout="300">
  <Category>Maintenance</Category>
  <WriteAction ID="Reset" TypeID="MicrosoftSystemCenterCommunityMonitorsExtensions!Microsoft.SystemCenter.Community.Health.ResetTargetStateAction">
    <MonitorId>$MPElement[Name="Microsoft.SystemCenter.Community.Monitors.RecoverySample.EventBasedMonitor"]$</MonitorId>
  </WriteAction>
</Recovery>
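For the reboot scenario described at the top of this post, the second module (reset, then run a command) could be wired into a recovery in much the same way. The following is only a sketch and is not part of the attached MP: the `Contoso.ProcessMonitoring.*` IDs, the target class, the timeout values, and the `shutdown.exe` arguments are all assumptions chosen for illustration. The attribute set simply mirrors the sample recovery above, with `ExecuteOnState="Error"` assumed to correspond to the critical state.

```xml
<!-- Hypothetical recovery: resets the process monitor, then schedules a reboot. -->
<!-- All Contoso.* IDs are placeholders; replace them with your own monitor and target. -->
<Recovery ID="Contoso.ProcessMonitoring.ResetAndRebootRecovery" Accessibility="Internal" Enabled="onStandardMonitoring" Target="Contoso.ProcessMonitoring.MonitoredApplication" Monitor="Contoso.ProcessMonitoring.ProcessMonitor" RecalculateMonitor="false" ExecuteOnState="Error" Remotable="true" Timeout="300">
  <Category>Maintenance</Category>
  <WriteAction ID="ResetAndReboot" TypeID="MicrosoftSystemCenterCommunityMonitorsExtensions!Microsoft.SystemCenter.Community.Health.ResetTargetStateCommandExecuterAction">
    <!-- Elements appear in the order declared in the module's Configuration. -->
    <ApplicationName>%windir%\system32\shutdown.exe</ApplicationName>
    <WorkingDirectory />
    <!-- Assumed arguments: restart after a 60 second delay. -->
    <CommandLine>/r /t 60</CommandLine>
    <TimeoutSeconds>120</TimeoutSeconds>
    <RequireOutput>false</RequireOutput>
    <!-- Should match the value of the Monitor attribute above. -->
    <MonitorId>$MPElement[Name="Contoso.ProcessMonitoring.ProcessMonitor"]$</MonitorId>
  </WriteAction>
</Recovery>
```

Because the monitor is reset to healthy before the reboot, a process that still fails to start afterwards should drive the monitor back to critical and trigger the recovery again. That is the behavior asked for in the challenge, but it also means a system whose process never starts could reboot repeatedly, so evaluate this carefully in a test environment.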
DISCLAIMER:
Please evaluate in your test environment first! As expected, this solution is provided AS-IS, with no warranties and confers no rights. Use is subject to the terms specified at Microsoft.
Microsoft.SystemCenter.Community.Monitors.MPs.zip
Comments
Anonymous
January 29, 2010
I was looking at this monitor and the connector-based one you wrote http://blogs.msdn.com/mariussutara/archive/2009/02/02/how-to-reset-monitor-when-closing-alert.aspx. I was wondering if the connector-based one is still applicable with R2 and, if it is, what the difference is and in what situations you would use each. My problem is that certain monitors (for example, the disk space or service down monitors) will not re-alert if the service is not started or if the disk space is still critical. I wanted to see if it is possible to reset the health of a monitor programmatically so that it will send out a notification. Specifically, I would like to create a monitor that I can target at specific monitors to reset their state.