Issue with the Tune Management Packs view in SCOM 2016
With SCOM 2016 we introduced a very cool and helpful new feature: data-driven alert management, provided by the “Tune Management Packs” view. You can find more details about this feature here.
Black box and empty lines in Tune Management Packs view
Unfortunately, under certain circumstances there is an issue with this view. When you access the view, you might see a moving black box and some empty lines that move as you scroll down the list of Management Packs to tune:
When you double-click one of the blank lines, for example, you get a .NET exception complaining about a missing Management Pack, like this:
After closing the exception window, the whole console might crash:
As far as I have tested, all SCOM 2016 versions up to UR3 are affected by this issue if you are in the situation described below.
Cause
The query behind the “Tune Management Packs” view queries (amongst other things) the OpsMgr data warehouse. Unfortunately, this query (or the rendering of its result) doesn’t handle one quite common scenario very well:
If you delete a Management Pack from your Management Group and the workflows of this Management Pack created alerts that are still available in the data warehouse, the view will show the described behavior.
Given the default retention period of 180 days for raw alert data in the OpsMgr data warehouse, it is highly likely that you are, or will be, impacted by this issue once you delete a Management Pack from your Management Group.
In my case, it was a community MP that the customer had deleted from the Management Group. You can verify this by searching the “ManagementPackHistory” table in the OpsMgr database for the Management Pack ID given in the .NET exception:
Workaround
Once no alerts from deleted Management Packs remain in your OpsMgr data warehouse, the view immediately returns to normal operation and works as expected.
Unfortunately, you cannot just delete the alerts causing the issue from the data warehouse. You have to use our built-in grooming feature to remove these alerts.
If you are not familiar with grooming in Operations Manager, please read these two excellent posts from Kevin Holman here and here.
So currently you have essentially two options to work around this issue:
Wait for the default grooming to remove all affected alerts
This is only for the very patient, and probably not a good solution if you have a long alert data retention time.
Advantage of this option: You will not lose any alert data from your data warehouse.
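As a rough sketch of what waiting means in practice, the remaining time can be estimated by subtracting the days since the Management Pack was removed from the retention period. The function name below is hypothetical, and the estimate assumes that the last affected alert was raised no later than the day the Management Pack was removed; allow about one extra day for the grooming job to run.

```python
def days_until_groomed(retention_days: int, days_since_removal: int) -> int:
    """Estimate how many more days the affected alerts stay in the data warehouse.

    Assumption: no affected alert is newer than the removal of the
    Management Pack, so the newest one is `days_since_removal` days old.
    """
    if retention_days < 0 or days_since_removal < 0:
        raise ValueError("day counts must not be negative")
    # Alerts older than the retention period are eligible for grooming.
    return max(0, retention_days - days_since_removal)


# Default retention of 180 days, Management Pack removed 30 days ago.
print(days_until_groomed(180, 30))  # 150
```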
Reduce the current alert retention time
If you reduce the current alert retention time to fewer days than have passed since the Management Pack was removed, and then wait one day for the standard grooming to kick in, you should be fine. What does that mean?
In my example, the customer removed the Management Pack 30 days ago, so you have to set the retention time to a value < 30 days. But of course, you can always use the "sledgehammer" and set the retention time to 1 day.
You can modify the retention time with the tool dwdatarp.exe, as Kevin explains in the posts linked above.
dwdatarp.exe -s OMDW\i01 -d OperationsManagerDW -ds "Alert data set" -a "Raw data" -m 1
Advantage of this option: You don’t have to wait more than a day to get the view up and running again.
Disadvantage: You will lose all alert data older than the new retention time from the data warehouse.
Once the view is working again, you can increase the retention time back to its original value!
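The retention arithmetic for this option can be sketched in a few lines. This is only an illustration: both function names are hypothetical, and the command string simply reuses the dwdatarp.exe switches and the server and database names from the example above without running anything.

```python
def max_retention_days(days_since_removal: int) -> int:
    """Largest retention value that is still below the age of the removal.

    Never returns less than 1, which corresponds to the "sledgehammer" setting.
    """
    if days_since_removal < 0:
        raise ValueError("days_since_removal must not be negative")
    return max(1, days_since_removal - 1)


def dwdatarp_command(server: str, database: str, retention_days: int) -> str:
    """Build the dwdatarp.exe command line for the raw alert data set."""
    return (
        f"dwdatarp.exe -s {server} -d {database} "
        f'-ds "Alert data set" -a "Raw data" -m {retention_days}'
    )


# Management Pack removed 30 days ago: any value below 30 works, 29 keeps the most data.
print(dwdatarp_command(r"OMDW\i01", "OperationsManagerDW", max_retention_days(30)))
```

Choosing the largest value that still qualifies keeps as much alert history as possible; setting 1 day is simpler but discards more data.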
So far, these are the only workarounds for this issue known to me. If a general fix for this issue ships in a future update rollup, I will let you know by updating this post.