SCCM state messaging–in depth
State messaging is a new mechanism in SCCM which reflects point in time conditions on the client. Status messages, by contrast, work to help administrators track the flow of data through various SCCM components. There is even a status message viewer built right into the console so these messages can be viewed and tracked. There is no such equivalent for state messages. The result of state messages is largely only seen in reports, various data in the console (such as number of systems needing a patch) or the client logs themselves.
State messages are very small – communicating very succinct information regarding conditions in place on the client. We will see state messages in action shortly.
The state messaging system is used by specific components of SCCM – the most well known being software updates, desired configuration management, network access protection and the client. There are a few other places as well but these are the major components.
Understanding state messaging in SCCM is important – critical software update data relies on this system – but in v.next of SCCM understanding state messaging is even more important as more and more components take advantage of it. That said, lets dig into how the state messaging system works. To summarize I pulled together the general state message flow below.
Let’s walk through the graphic. The green box represents the state messaging system and the components inside are those that feed information into the state messaging system. When state messages are received two things happen – first, state messages are stored in WMI. Then, on a 15 minute cycle (by default – this can be changed), the state system will scrape WMI for any state message that has not yet been sent and forward up to the MP.
Note in the graphic I have pulled the client installation piece out separately. During client install and before installation has proceeded sufficiently to be able to find or access the MP – or if the MP is down for some reason – state messages about client install will be forwarded to the FSP if configured. For everything else, traffic goes directly to the MP.
State messages arriving at the MP will be processed into .SMX files by the MP_Relay component and placed in the auth\statesys.box\incoming folder on the site server. From there they are processed into the database to complete the flow.
The graphic hopefully does a good job of communicating the general flow – but lets dig deeper for a better understanding of what happens in detail. To get started we need to make sure verbose logging is enabled for the client, MP and state messaging component on the site server.
To set verbose/debug logging on an SCCM client or MP simply edit (or create) the following registry keys and additional logging should kick in.
HKLM/Software/Wow6432Node/Microsoft/CCM/Logging/@Global/LogLevel to 0
HKLM/Software/Wow6432Node/Microsoft/CCM/Logging/DebugLogging/Enabled to True
On the site server edit the following registry key to enable verbose logging and then restart the SMS_Executive (or the state system component)
HKLM/Software/Wow6432Node/Microsoft/SMS/Components/SMS_STATE_SYSTEM/Verbose Logging – Reg_DWORD set to 1
Typically, tracing SQL commands requires enabling SQL tracing for SCCM components – in this case you won’t see much. Even with Profiler you won’t see a great deal – more on SQL stuff in a bit.
With all of this configured, lets start tracing. I’ve pulled some log snips (and at this point thanks go to my colleague – Vinay Pamnani – who had done some of the log gathering leg work for me) that line up the updatesdeployment log and statemessage log on the client – showing how between the two you can get a good idea of what is happening at the various steps along the way simply by looking at the state messages.
Before we actually look at the log snips a bit about the state message format. The state message itself is made up of a few parts – two of the key parts being the Topic Type and State ID. As you look at the logs you will notice in some places the Topic Type seems to already be interpreted for us and in the logs you won’t always see both the Topic Type and state ID together. Between the two logs, however, you generally can piece together what is needed. Further, having just the Topic Type without the state ID doesn’t really help – and even with both the information is not helpful unless there is a way to interpret it. Thankfully, there is a technet link with a chart showing how using a combination of the Topic Type and state ID we can get meaningful data. The link and a portion of the data from the second chart is reproduced below and is useful as we walk through our log samples.
https://technet.microsoft.com/en-us/library/bb932203.aspx
Topic Type |
State Message ID |
State Message Description |
300 |
0 |
Compliance state unknown |
300 |
1 |
Compliant |
300 |
2 |
Non-compliant |
300 |
3 |
Conflict detected |
301 |
0 |
Enforcement state unknown |
301 |
1 |
Installing update(s) |
301 |
2 |
Waiting for restart |
301 |
3 |
Waiting for another installation to complete |
301 |
4 |
Successfully installed update(s) |
301 |
5 |
Pending system restart |
301 |
6 |
Failed to install update(s) |
301 |
7 |
Downloading update(s) |
301 |
8 |
Downloaded update(s) |
301 |
9 |
Failed to download update(s) |
301 |
10 |
Waiting for maintenance window before installing |
302 |
0 |
Evaluation state unknown |
302 |
1 |
Evaluation activated |
302 |
2 |
Evaluation succeeded |
302 |
3 |
Evaluation failed |
400 |
0 |
Detection state unknown |
400 |
1 |
Not Required |
400 |
2 |
Not Detected |
400 |
3 |
Detected |
401 |
0 |
Compliance state unknown |
401 |
1 |
Compliant |
401 |
2 |
Non-Compliant |
401 |
3 |
Conflict Detected |
401 |
4 |
Error |
402 |
0 |
Enforcement state unknown |
402 |
1 |
Enforcement started |
402 |
2 |
Enforcement waiting for content |
402 |
3 |
Waiting for another installation to complete |
402 |
4 |
Waiting for maintenance window before installing |
402 |
5 |
Restart required before installing |
402 |
6 |
General failure |
402 |
7 |
Pending installation |
402 |
8 |
Installing update |
402 |
9 |
Pending system restart |
402 |
10 |
Successfully installed update |
402 |
11 |
Failed to install update |
402 |
12 |
Downloading update |
402 |
13 |
Downloaded update |
402 |
14 |
Failed to download update |
On to the log snips. The graphic may be a bit hard to read so you may want to enlarge it. A few comments about the log snips.
First, I have tried to tie the updatesdeployment and statemessage log together. One way you can easily track to make sure the logs are referring to the same state message is by topicID which is listed in green. Knowing the two logs are referring to the same state message will help keep things lined up correctly. The Topic Type is highlighted in light blue. Note between the logs one might show the actual number which can be interpreted from the table above and the other may show a generic value (which is already interpreted). The state ID is in purple. Using a combination of the Topic Type and State ID along with the table above we can exactly track what is happening in the logs. In this case the log snips show update evaluation, the result of evaluation, the update being downloaded, the update being installed and finally a pending system restart. Take a few minutes and examine the log entries against the table above.
The graphic above is just one example of how state messages are sent into the state system. Lets go through an actual example now showing state messaging data making it’s way to the MP and being processed to the database. We will use a test client and start by forcing the client to scrape WMI for all state message information and then send that information to the MP on it’s next polling cycle.
Lets first take a look at WMI. State messages are stored in the root\ccm\statemsg namespace and in that namespace we have two classes of interest. The first is the CCM_StateMsg_SerialNum class. This class is simply used to record the last serial number used for a state message. Like hardware inventory, every state message has it’s own unique serial number so that the site server can keep track of whether it is actually missing any state messages from the system. This is important because missing state messages may lead to incomplete or inaccurate state reporting. More on that in a bit.
The other class in the statemsg namespace – CCM_StateMsg - contains the state messages themselves. If we look at the class instance we will see all of the state message recorded.
And opening one up will show the details of the state message and some data we have already discussed.
OK, cool – so lets resend this data to the MP and track it’s progress. I will use a simple state message resync script to make this happen.
RefreshServerComplianceState()
Sub RefreshServerComplianceState()
dim newCCMUpdatesStore
set newCCMUpdatesStore = CreateObject ("Microsoft.CCM.UpdatesStore")
newCCMUpdatesStore.RefreshServerComplianceState
wscript.echo "Ran RefreshServerComplianceState."
End Sub
This script can be found on the web in various places and simply makes use of the SCCM SDK to cause the client to resend it’s data to the MP. It’s a convenient way to force some state messages up so we don’t have to wait on components to insert them.
Typically, a vb script is executed simply by using cscript as show below. But note that this fails. I show this to illustrate the problem you might see if trying to run the script yourself. SCCM is a 32 bit application but, in this case, running on a 64 bit server. The default version of cscript is the 64 bit version – and generally this works fine with any vbscript. In this case, however, the call being made requires the 32 bit version of cscript which you must run out of the syswow64 folder as shown in the second example.
At this point we have to wait for the next state message polling cycle which will cause all state messages to be sent back up.
Our polling cycle has now fired and if we look at the statemessage.log we see that the state information has been pulled, formatted (into XML) and sent to the MP.
<![LOG[StateMessage body: <?xml version="1.0" encoding="UTF-16"?>
<Report><ReportHeader><Identification><Machine><ClientInstalled>1</ClientInstalled><ClientType>1</ClientType><ClientID>GUID:F1B03089-9EA1-4A64-83DC-5F8E77BDDEB4</ClientID><ClientVersion>4.00.6487.2000</ClientVersion><NetBIOSName>STEVERAC64-2</NetBIOSName><CodePage>437</CodePage><SystemDefaultLCID>1033</SystemDefaultLCID></Machine></Identification><ReportDetails><ReportContent>State Message Data</ReportContent><ReportType>Full</ReportType><Date>20110107184030.662000+000</Date><Version>1.0</Version><Format>1.0</Format></ReportDetails></ReportHeader><ReportBody><StateMessage MessageTime="20110107183046.153000+000" SerialNumber="1431"><Topic ID="21e49ac6-a273-4a61-9794-eb675bc743e5" Type="500" IDType="3"/><State ID="2" Criticality="0"/><UserParameters Flags="0" Count="1"><Param>102</Param></UserParameters></StateMessageserParameters></StateMessage></ReportBody></Report>
]LOG<![LOG[CStateMsgManager::GetSignEncyptMode]LOG]!><time="12:40:31.037+360" date="01-07-2011" component="StateMessage" context="" type="1" thread="3592" file="statemsg.cpp:1820">
<![LOG[Successfully forwarded State Messages to the MP]LOG]!><time="12:40:31.099+360" date="01-07-2011" component="StateMessage" context="" type="1" thread="3592" file="statemsg.cpp:1527">
I’ve truncated the XML due to size – leaving only a single state message – we will come back to the XML shortly as it really isn’t very friendly in this format.
Note that although the statemessage.log records that the messages have been dispatched to the MP, the statemessage system doesn’t actually do the movement of data to the MP – CCMExec does that as we see from the small snip below. There is actually a bit more that goes on behind the scenes at this point but it’s sufficient to know that CCMExec sends data to the MP – in this case the MP_Relay component.
<![LOG[OutgoingMessage(Queue='mp_mp_relayendpoint', ID={A9E7A07D-223D-4F5D-93D5-15AF5B72E05C}): Delivered successfully to host 'STEVERACPRI.STARTREKNG.COM'.]LOG]!>
Switching to the MP now, when data arrives for processing at MP_Relay (again, a bit more behind the scenes stuff happens but not worth diving into here since most folks will never come across a need to dig in) we see the data processed and translated to .SMX file format and placed in auth\statesys.box\incoming.
Inv-Relay Task: Processing message body
Relay: FileType= SMX
Relay: Outbox dir: C:\Program Files (x86)\Microsoft Configuration Manager\inboxes\auth\statesys.box\incoming
Relay: Received 0 attachments
Relay: 0 of 0 attachments succesfully processed
Inv-Relay: Task completed successfully
If we take a look at the auth\statesys.box\incoming directory – and if we are fast enough – we will see the .SMX files there and being processed. Typically you won’t see much here – if you do it’s worth investigating what messages are here and not being processed. If we find a .SMX and crack it open with notepad we can see the detail but the formatting leaves much to be desired.
A little trick here. If you rename the .SMX and simply add a .XML extension – so <filename>.SMX.XML and double click it you get a much nicer formatting of the data through IE. I’ve highlighted the info in the file that is key – we can easily see the machine details and also state message details we have become familiar with – and also the unique state message ID.
Note: If you are going to rename these files then you should copy them out first so you aren’t impacting the statesys.box folder itself.
From here all that remains is to process these state messages into the database. You can see these messages processed in the statesys.log – something similar to the following.
Found new state messages to process, starting processing thread
Thread "State Message Processing Thread #0" id:5076 started
CMessageProcessor - Detected parent site 'CEN'
CMessageProcessor - Processing file: mdlbp169.SMW
CMessageProcessor - Processed 1 records with 0 invalid records.
CMessageProcessor - Successfully replicated file "mdlbp169.SMW" to parent site CEN.
CMessageProcessor - Processed 1 message files in this batch, with 0 bad files.
Thread "State Message Processing Thread #0" id:5076 terminated normally
The DB processing component often can be seen by enabling SQL tracing. That doesn’t help much here so we turn to SQL profiler – which does give us a hint at what is happening but still doesn’t’ completely pull back the covers. Suffice it to say that there are a number of SQL stored procedures responsible for processing state messages – and also a number of tables in the database that store state message data. The stored procedures that do the state message processing generally start with the name spProcess – there are a number of them. Those interested should be able to dig in further from here to see the tables involved, etc.
OK, with processing into the DB complete we have finished the state message process flow – well, almost. The last thing to discuss is state message resyncs.
I have already alluded to the fact that the site server keeps track of state messages as they come in – so that if any state messages are missing they can be flagged and a resync requested from time to time. State messages and their data are important so we definitely don’t want to miss any.
As state messages come in the unique ID is read and stored in the database. As processing continues this data is constantly updated and if a gap is detected we flag it and store any missing state messages in the SR_MissingMessageRanges table. One would hope that this table will be empty but in production we may see data in the table here and there. This table is used to help keep track of state messages needing resync.
Any needed resyncs are evaluated hourly – but that does not mean that systems will be running a resync hourly. The site control file specifies a few settings of interest (with the site control file always apply the caution – look, don’t touch! )
BEGIN_COMPONENT
<SMS_STATE_SYSTEM>
<6>
<STEVERACPRI>
PROPERTY <Loader Threads><><><4>
PROPERTY <Inbox Polling Interval><><><900>
PROPERTY <Loader Chunk Size><><><256>
PROPERTY <Max Chunks Fetched><><><100>
PROPERTY <Resync Check Interval><><><60>
PROPERTY <Min Missing Message Age><><><2880>
PROPERTY <Heartbeat Msg Interval><><><15>
END_COMPONENT
PROPERTY <Resync Merge Interval In Hours><><><72>
Resync Check Interval is set to 60 minutes – this is the schedule at which we check for any systems needing state message resyncs.
Min Missing Message Age is set to 2880 – this is the amount of time a message has to be missing before a resync is requested.
Resync Merge Interval in Hours is set to 72 – this is the amount of time between resyncs of a client. This interval helps ensure that clients won’t resync over and over again. You should not see a client resync more frequently that the interval set here.
Comments
Anonymous
February 16, 2011
"Suffice it to say that there are a number of SQL stored procedures responsible for processing state messages..." Really? All I could ever find was CLR compiled code, which you can't open like a regular SP.Anonymous
March 09, 2011
Yes - quite a few actually. Trust me on this! :)Anonymous
March 23, 2011
How long are state messages stored in the database? Is there any delete cycle on state messages? Thanks.Anonymous
September 20, 2012
Do you know how to create SCCM Collection based on state messages that are in DB? (I do not preffer to extend sms_def.mof for it). If you will ask why.. here is the example answer. I would like to have dynamic collection with clients that have known issue with "waiting for another installation to complete", that prevent software updates to install. I woult like to repair them by script. i dont ask for reports - where I can get these information. Any idea ? RobertAnonymous
February 19, 2013
Hi, Is there a difference between these two ? Topic Type State Message ID State Message Description 301 6 Failed to install update(s) 402 11 Failed to install updateAnonymous
November 27, 2013
The comment has been removedAnonymous
December 17, 2013
The comment has been removedAnonymous
September 17, 2014
VERY nice write up. Have a follow up question. I have followed your outline in detail. I was not able to identify the step or element to check for a condition I'm seeing. I see the client reporting the state messages to the MP for a deployment. All the inboxes (including bad) are clear. The enforcement state for the deployment still shows "State Unknown". It appears the state has not been processed/updated yet by the DB. Where would I check to manually verify the state message processing in the DB?Anonymous
October 29, 2014
This is a very useful blog post and still very relevant to SCCM 2012 R2 state messaging. I was seeing an issue where a client had reported (in WMI) that it sent the correct state message (for software update compliance) to the MP however this was not reflected in the DB. Running the above vb script to refresh the compliance state fixed the issue. I also uncovered 1777 records in the SR_MissingMessages Ranges table and will investigate this backlog.Anonymous
June 08, 2015
Hi Steve, Thanks for this wonderful post. I discovered one problem, which many other CM admins are also facing (as per google result :)). The problem is, when I run report 'Computers is specific compliance state for an update', the value in 'Last Message Received' and 'Last State Changed' is very old (almost a month old) for most of the machines, while I can see that those machines are healthy and communicating with MP. Also, the State Messages cycle is set at 15 Minutes. I checked the inboxes and there were no backlogs (which I was expecting).Anonymous
June 08, 2015
Hi Steve, Thanks for this wonderful post. I discovered one problem, which many other CM admins are also facing (as per google result :)). The problem is, when I run report 'Computers is specific compliance state for an update', the value in 'Last Message Received' and 'Last State Changed' is very old (almost a month old) for most of the machines, while I can see that those machines are healthy and communicating with MP. Also, the State Messages cycle is set at 15 Minutes. I checked the inboxes and there were no backlogs (which I was expecting). I executed the script to forcefully generate state message on one machine and found that it was updated successfully in reports. This indicates that there is no issue at server end. Can you please help me here, how to proceed further. Shall I deploy the script to all the machines and set that deployments to run each time system is logged in. Thanks in Advance.