Caching on the Brain
This week I have been working a lot on caching in the SDK, trying to optimize some code paths and improve performance as much as I can, so I decided to share a bit about how the cache works, and some insights into its implementation, while it's fresh in my mind.
SCOM 2007 relies heavily on configuration data to function. Class and relationship type definitions become especially important when dealing with discovered objects. We found that it was very common to want to move up and down these class hierarchies, which would prove very costly from a performance standpoint if each operation required a roundtrip to the server and database. We also recognized that not all applications require this type of functionality, and for those, incurring the additional memory hit was not desired (this was especially true for modules that need the SDK). Given this, the SDK has been designed with three different cache modes: Configuration, ManagementPacks and None. The cache mode you want to use can be specified using the ManagementGroupConnectionSettings object.
First, let's go over what objects each cache mode will actually cache:
Configuration: ManagementPack, MonitoringClass, MonitoringRelationshipClass, MonitoringViewType, UnitMonitorType, ModuleType derived classes, MonitoringDataType, MonitoringPage and MonitoringOverrideableParameter
ManagementPacks: ManagementPack
None: None =)
For the first two modes there are also events on ManagementGroup that notify users of changes. OnTypeCacheRefresh is only fired in Configuration cache mode and indicates that something in the cache other than ManagementPack objects changed. This means that the data in the cache is actually different. Many things can trigger a ManagementPack change, but not all of them change anything other than the ManagementPack object's LastModified property (for instance, creating a new view, or renaming one). OnManagementPackCacheRefresh is triggered when any ManagementPack object changes for whatever reason, even if nothing else in the cache changed. This event is available in both Configuration and ManagementPacks mode.
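To make the event rules above concrete, here is a small Python model of them. It is only a sketch of the behavior described in this post, not SDK code: the CacheMode enum and the events_fired function are names made up for illustration, and only the two event names come from the SDK.

```python
from enum import Enum


class CacheMode(Enum):
    """Illustrative stand-in for the three cache modes described above."""
    CONFIGURATION = "Configuration"
    MANAGEMENT_PACKS = "ManagementPacks"
    NONE = "None"


def events_fired(mode, other_cached_data_changed):
    """Return the events a client in `mode` sees after a management pack changes.

    `other_cached_data_changed` is True when the change touched cached data
    other than the ManagementPack objects themselves (hypothetical flag).
    """
    events = []
    if mode in (CacheMode.CONFIGURATION, CacheMode.MANAGEMENT_PACKS):
        # Fires for any ManagementPack change, even a LastModified-only one.
        events.append("OnManagementPackCacheRefresh")
    if mode is CacheMode.CONFIGURATION and other_cached_data_changed:
        # Fires only when the rest of the cached data is actually different.
        events.append("OnTypeCacheRefresh")
    return events
```

For example, renaming a view would correspond to events_fired(CacheMode.CONFIGURATION, False), which yields only OnManagementPackCacheRefresh.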
So, when do you want to use each mode? Configuration is great if you are doing lots of operations in the configuration space, especially moving up and down the various type hierarchies. It is also useful when working extensively with MonitoringObject (not PartialMonitoringObject) and needing to access the property values of many instances of different class types. Our UI runs in this mode.
ManagementPacks is useful when configuration-related operations are used, but not extensively. This is actually a good mode to do MP authoring in, which involves fetching management packs very often, but not necessarily other objects. One thing that is important to note here is that every single object that exists in a management pack (rule, class, task, etc.) requires a ManagementPack that is not returned in the initial call. If you call ManagementGroup.GetMonitoringRules(), every rule that comes back will make another call to the server to get its ManagementPack object if you are in cache mode None. If you are doing this, run in at least ManagementPacks cache mode; that's what it's for.
None is a great mode for operations on operational data. If you are mostly working with alerts or performance data, or even simply submitting a single task, this mode is for you. (None was not available until recently, and is not in the bits that are currently available for download.)
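The cost of fetching a ManagementPack per rule can be illustrated with a toy Python model. This is a sketch under stated assumptions, not the SDK: FakeServer and load_rules_with_packs are hypothetical names, and the cached branch fetches each management pack lazily on first use, which may not match how the real cache is populated.

```python
class FakeServer:
    """Hypothetical stand-in for the SDK service; it only counts roundtrips."""

    def __init__(self, rule_to_mp):
        self.rule_to_mp = rule_to_mp  # rule name -> management pack name
        self.roundtrips = 0

    def get_rules(self):
        self.roundtrips += 1  # one call returns all rules, without their packs
        return list(self.rule_to_mp)

    def get_management_pack(self, name):
        self.roundtrips += 1  # each management pack fetch is its own call
        return {"name": name}


def load_rules_with_packs(server, cache_management_packs):
    """Pair every rule with its management pack, optionally caching the packs."""
    mp_cache = {}
    result = []
    for rule in server.get_rules():
        mp_name = server.rule_to_mp[rule]
        if cache_management_packs:
            if mp_name not in mp_cache:
                mp_cache[mp_name] = server.get_management_pack(mp_name)
            mp = mp_cache[mp_name]
        else:
            # Models cache mode None: every rule triggers another server call.
            mp = server.get_management_pack(mp_name)
        result.append((rule, mp))
    return result
```

With 100 rules spread over two management packs, this model makes 101 roundtrips without a cache (one for the rules plus one per rule) and 3 with one (one for the rules plus one per management pack).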
One more thing I want to mention. ManagementPacks, when cached, will always maintain the exact same instance of a ManagementPack object in memory, even if properties change. Other objects are actually purged and recreated. This is extremely useful for authoring, as you can guarantee that when you have an instance of a management pack that may have been retrieved in different ways, it is always the same instance in memory. A practical example: you get a management pack by calling ManagementGroup.GetManagementPack(Guid) and then you get a rule by calling ManagementGroup.GetMonitoringRule(Guid). The rule is conceptually in the same management pack that the GetManagementPack call returned, but who is to say it is the same instance? When you edit the rule, you will want to call ManagementPack.AcceptChanges(), which (if the instances were not the same) would not change your rule, since the rule's internal ManagementPack may have been a different instance, and that's what maintains any changed state. This problem does not arise in Configuration and ManagementPacks cache mode: the instance that represents a certain management pack will always be the exact same instance and maintain the same state about what is being edited across the board. Now, that makes multi-threading and working with the same management pack across threads trickier, but there are public locking mechanisms exposed on the ManagementPack object to help with that.
Lastly, a quick note about how the cache works. The definitive copy of the cache is actually maintained in memory in the SDK service. The service registers a couple of query notifications with SQL Server to be notified when things of interest change in the database, and that triggers a cache update on the server. When this update completes, the service loops through and notifies all clients that, when they connected, requested to be notified of cache changes. Here we see another benefit of None cache mode: less server load, in that fewer clients need to be notified of changes.
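The same-instance guarantee is essentially an identity map. The Python sketch below models why it matters for AcceptChanges; every class and method here is a simplified, hypothetical stand-in for illustration and not the SDK's object model.

```python
class ManagementPack:
    """Toy management pack that tracks edits made to its elements."""

    def __init__(self, mp_id):
        self.id = mp_id
        self.pending_changes = []  # edits waiting for accept_changes()

    def accept_changes(self):
        # Return the edits that would be committed and clear the pending list.
        committed, self.pending_changes = self.pending_changes, []
        return committed


class Rule:
    def __init__(self, name, management_pack):
        self.name = name
        self.management_pack = management_pack  # the rule's internal pack

    def edit(self, description):
        # Changed state is recorded on the rule's own management pack instance.
        self.management_pack.pending_changes.append((self.name, description))


class Client:
    """Hands out management packs, with or without an identity-preserving cache."""

    def __init__(self, cached):
        self.cached = cached
        self._packs = {}

    def get_management_pack(self, mp_id):
        if not self.cached:
            return ManagementPack(mp_id)  # fresh copy on every call
        if mp_id not in self._packs:
            self._packs[mp_id] = ManagementPack(mp_id)
        return self._packs[mp_id]  # always the same instance

    def get_rule(self, name, mp_id):
        return Rule(name, self.get_management_pack(mp_id))
```

In this model, a cached client returns a rule whose management_pack is the very object returned by get_management_pack, so accept_changes() on that object sees the rule's edit. An uncached client returns two different objects, and accept_changes() on the pack you fetched directly commits nothing.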
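The notification flow can be modeled in a few lines of Python. This is a rough sketch of the sequence described above, assuming a simple version counter for the server-side cache; CacheService and its methods are invented names, and the real service's SQL Server query notifications are reduced to a plain method call.

```python
class CacheService:
    """Toy model of the service holding the definitive copy of the cache."""

    def __init__(self):
        self.version = 0        # bumped on every server-side cache update
        self.subscribers = []   # clients that asked for cache notifications

    def connect(self, client_name, cache_mode):
        # Clients in cache mode None never register for notifications.
        if cache_mode != "None":
            self.subscribers.append(client_name)

    def on_database_change(self):
        """Called when the database reports a change of interest."""
        self.version += 1  # update the server-side cache first
        # Then loop through and notify every subscribed client.
        return [(name, self.version) for name in self.subscribers]
```

If a console connects in Configuration mode, an authoring tool in ManagementPacks mode and a task submitter in None mode, a database change produces two notifications rather than three.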
Comments
Anonymous
October 12, 2006
It seems quite interesting to understand what's going on inside SCOM's brain. It would also be helpful if you could explain in a bit more detail how the SDM works in SCOM, and maybe give some details on the "distributed application" and how entities and relationships all work inside. I was also curious whether it is possible to drill down from the model diagram to show what calculations/formulas are involved in deriving the final value, down to the root cause of the problem.
Anonymous
October 13, 2006
I'll work on a post to answer your first few questions, but what in particular did you mean by your last one, regarding the model-diagram drill-down?
Anonymous
October 15, 2006
The "Distributed application designer" helps to create a model with relationships, so the status of individual child entities will be rolled up to the top. Normally the operator will drill down from the top to see what is really affecting his service. What I meant was, how do you show the formulas involved in the diagram itself? For example, I have the health measure of 10 hard disks, and I also want to show the average health of all my hard disks. So in my model, the average health figure will be at the top, and the individual health of each hard disk will be a child attached to the average figure. I would like to show the calculation involved in getting the parent in the diagram itself as well. Let me know if this explains what I meant.
Anonymous
October 16, 2006
Makes sense. Currently when you set up a distributed application, it's basically a group that can contain anything, and you configure how the state rolls up to the distributed application level. The choices for rollup are "BestOf", "WorstOf" and "Percentage"; however, this information is not available via the diagram view. It is stored as configuration of a monitor that was created for the distributed application to roll up state. This can be viewed again by either going back to the distributed application definition, or by locating the monitor in question in the monitors view. If this is a feature you would like implemented, please submit the appropriate feedback here: http://www.microsoft.com/mom/feedback
Anonymous
October 16, 2006
Yes, I've noticed the "Groups" in the distributed application designer. But if we just group them without any formula, then the model does not really show much of the value, right? You've also mentioned "BestOf", "WorstOf" and "Percentage" from the monitors. But let us consider a slightly more complex scenario by adding a few more child nodes such as "SAN Health", so my parent will be a combination of "Avg HardDisk Health" + "SAN Health". I could even add a weight to one of those values to say which has higher priority for me, so that my quality of service value for storage services is calculated more precisely. I'm not sure whether "Monitors" can do this type of calculation. Anyway, I've posted about it in the feedback form, at https://connect.microsoft.com/feedback/ViewFeedback.aspx?FeedbackID=228197&SiteID=209
Anonymous
October 17, 2006
Currently monitors only support the three modes I mentioned. You could potentially get a bit more complex by creating subgroups of a more specialized nature with their own rollup logic. So in your example, you would create one group that has a Percentage-based rollup for hard disk health, then another group representing SAN health, and roll those two groups up to the root with WorstOf logic.
Anonymous
October 31, 2007
Based on tests I've run, it seems to me that caching uses the most recent version of any loaded MP, regardless of where it comes from, which is unexpected to me. So for example, if I'm working with 6.0.5000.0 versions of the MPs standalone on the file system, and I'm creating new MPs, all references to any added MPs reference the 6.0.5000.0 versions of the management packs. Then if I load version 6.0.5000.28 of some of the referenced MPs from a management server and repeat my previous attempt to create a new MP using only the file-based MPs that are version 6.0.5000.0, the referenced MP that is added becomes the 6.0.5000.28 version and not the 6.0.5000.0 version, even though 6.0.5000.0 is the only version located in the areas that I have told the SDK methods to search for MPs. It appears to me that if the cache has a later version of an MP, it will always use that for any implicit references, even if the cached copy came from a location that is not part of the search path that you specify for MPs in the SDK methods. Is there some way of turning off this behavior for the cache, so that the cache only returns information about MPs in the search paths that I specify for any SDK methods that I call, without losing the performance advantages of your cache?
Anonymous
October 31, 2007
Well, the cache only works on data that comes from the server. If you are working with a management pack that you created from a file (or a new one you created in memory) and you specified the reference store to be a file path, then the cache does not come into play. If using this management pack, however, you subsequently set something in that management pack to reference an object you got from the online store, the reference to the higher versioned management pack would be added (since you referenced an object from the online store). Does this sound accurate based on what you are doing?
Anonymous
October 31, 2007
Thanks for the response. Your description of how things should work is how I thought things should work as well. However, the behavior I think I'm seeing is different. I guess the next logical step is for me to try to write a simple PowerShell script to try to demonstrate this behavior.
Anonymous
January 19, 2009
Jakub, I'm using the OM SP1 SDK to connect to an OM 2007 SP1 management server. I'm using C# code written against the SDK to create a new MP. At each step in the process I am saving any changes I make back to the management server by calling AcceptChanges. I run into problems close to the beginning of my code, when I update the References property of my ManagementPack instance to reference another management pack and then ensure that the changes are saved to the server. I then continue working with the instance of my ManagementPack object, which involves reading other properties from the object. I have written code to output the members of the References property, so I can determine what was causing the References property to be deleted - and at the same time I output the LastModified field of my ManagementPack instance - to ensure that no other code has written to the instance during my testing (my testing is on a dedicated test server I have running on a virtual machine which no one else has access to). I can tell from my debug output that reading information from the ManagementPack instance via the server will result in the References property being deleted - even though the LastModified field of the ManagementPack instance is not changed when it is output by my code (it is the same value as when the References member held data). This behavior occurs only when the CacheMode for the configuration is set to Configuration or ManagementPacks - if I set the CacheMode to None everything works as expected. What version of the OM 2007 SDK has included your cache code changes - and where can I get it from? Do you have any suggestions for anything that I could do to work around the problem (other than setting CacheMode to None, which causes me too much of a performance hit to be usable)?
Anonymous
January 19, 2009
I'm guessing we actually have a bug where a reference-only change doesn't trigger a cache refresh. Can you try combining the reference change with changing something else and see if that works? Also, it shouldn't be necessary to explicitly add references, as the object model will do it for you when you reference something for which a reference doesn't exist.
Anonymous
January 20, 2009
Thanks for the quick response, Jakub. I tried changing the MP's Version at the same time I changed the references for the MP - but this too failed. Is there something else that you think I should be changing that might work better? As well, though I didn't think we had to explicitly add references, there are places where I can't get away without adding them - otherwise I would get an exception thrown such as the following, indicating to me that they had to have been added as a reference explicitly - or am I missing something:
The requested ManagementPackElement [Type=ManagementPackClass, ID=Microsoft.Windows.LocalApplication] was found in ManagementPack [ManagementPack:[Name=Microsoft.Windows.Library, KeyToken=31bf3856ad364e35, Version=6.1.6407.0]] but this ManagementPack is not listed as a reference to Host ManagementPack [[Bug249.2]].
And were your June 2008 cache changes ever released into a version of OM 2007? Is there any way I can try testing with them?
Anonymous
January 22, 2009
Sorry for the delay. Try changing an actual element at the same time, rather than just properties on the MP itself. Normally the version doesn't change if something else doesn't change too. If you add the references explicitly while doing whatever requires that operation, you should be ok.