MSMQ and Session Acks - what's the story?
I was reading up on the MaxUnackedPacket registry value in the online Resource Kit and found the documentation for it really unhelpful, so I thought I should explain what's really going on in case people try to rely on the ResKit's information.
"Specifies the maximum size of the message queue to each network session. The value of this entry limits the number and size of messages that can be stored between internal acknowledgments from the receiving server. If the queue fills before an acknowledgment is received, additional messages are discarded.
If the queue is too small, messages might be lost. But because messages in the queue are released again when a session fails, a large queue can flood the system with duplicate messages."
Gah! How confusing is it to refer to a "message queue" in an article about MSMQ when you are talking about a different sort of message queue? And messages won't be discarded or lost. Who wrote this stuff? Must get it fixed.
The principle is as follows:
MachineA sends messages to MachineB.
MachineB responds with Session Acknowledgement packets to let MachineA know what messages have arrived successfully.
While MachineA continues to receive acknowledgements, it will keep sending messages to MachineB.
The mechanism for keeping track of unacknowledged messages is controlled by the MaxUnackedPacket registry value. The default is 64, which means that MachineA can store the MessageIds of up to 64 messages it has sent. Once an acknowledgement packet arrives, MachineA removes from the list those messages that have been confirmed as safely arrived. Should the list become full, MSMQ on MachineA stops sending new messages until a Session Acknowledgement arrives.
To make sure this doesn't happen, MachineB sends an acknowledgement at the half-way point (by default, after 32 messages), which should arrive back at MachineA long before the list is full.
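To make the principle concrete, here is a toy model in Python. It is only a sketch of the behaviour described above, not how MSMQ is implemented: the Sender and Receiver classes are hypothetical, and it assumes MessageIds are increasing integers and that one acknowledgement confirms everything up to a given MessageId.

```python
from collections import deque

MAX_UNACKED_PACKET = 64  # default value quoted in the text


class Sender:
    """Toy model of MachineA's list of unacknowledged MessageIds."""

    def __init__(self, max_unacked=MAX_UNACKED_PACKET):
        self.max_unacked = max_unacked
        self.unacked = deque()  # MessageIds sent but not yet confirmed

    def send(self, message_id):
        """Record a sent message; return False if the list is full."""
        if len(self.unacked) >= self.max_unacked:
            return False  # stall until a Session Acknowledgement arrives
        self.unacked.append(message_id)
        return True

    def on_session_ack(self, last_confirmed_id):
        """Assumption: the ack confirms every id up to last_confirmed_id."""
        while self.unacked and self.unacked[0] <= last_confirmed_id:
            self.unacked.popleft()


class Receiver:
    """Toy model of MachineB acknowledging at the half-way point."""

    def __init__(self, max_unacked=MAX_UNACKED_PACKET):
        self.ack_every = max_unacked // 2  # 32 with the default of 64
        self.since_last_ack = 0

    def receive(self, message_id):
        """Return the id to acknowledge when an ack is due, else None."""
        self.since_last_ack += 1
        if self.since_last_ack >= self.ack_every:
            self.since_last_ack = 0
            return message_id
        return None
```

With the defaults, the receiver in this model acknowledges on the 32nd message, so the sender's list should be emptying out well before it reaches 64 entries.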
Q. But what happens if MachineA doesn't send many messages? When will MachineB acknowledge them?
By default, MachineB will send a Session Acknowledgement after 500ms of idle time, configured by:
"Sets the maximum idle delay time between receiving a message and sending an acknowledgment to the source.
Idle time is the time during which no messages are sent on the session. It ensures that computers creating a session for a limited time only will get all acknowledgments possible during the session."
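Putting the two triggers together, MachineB's decision could be sketched roughly like this. The ack_due function and its parameter names are made up for illustration, and the real logic inside MSMQ may well differ; it only encodes the half-way point and the 500ms idle delay mentioned above.

```python
def ack_due(received_since_last_ack, idle_ms,
            max_unacked=64, idle_delay_ms=500):
    """Decide whether the receiver should send a Session Acknowledgement.

    received_since_last_ack: messages received but not yet acknowledged
    idle_ms: milliseconds since the last message was seen on the session
    """
    if received_since_last_ack == 0:
        return False  # nothing to acknowledge
    half_way_reached = received_since_last_ack >= max_unacked // 2
    idle_long_enough = idle_ms >= idle_delay_ms
    return half_way_reached or idle_long_enough
```

So a trickle of three messages followed by half a second of silence still gets acknowledged, even though the half-way point was never reached.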
Q. And what happens if an acknowledgement from MachineB is lost? How will MachineA know what messages have arrived successfully?
MachineA has a timer running that lets it know when acknowledgements are late:
"Specifies how long Message Queuing waits for internal message acknowledgment of a network session before deciding that the session should be closed.
By default, Message Queuing calculates an optimal acknowledgment timeout based on the quality of the communication line. However, you can override the result of the calculation by adding this entry to the registry and setting it to a different value."
Once a session is closed, all messages in the unacknowledged list are resent and the process starts all over again.
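As a rough sketch of that behaviour on MachineA (the function name and parameters are hypothetical, and the timeout is passed in rather than calculated, since MSMQ works out its own value as quoted above):

```python
def check_ack_timeout(unacked_ids, ms_since_last_ack, ack_timeout_ms):
    """Return (session_open, ids_to_resend) for the sending machine.

    If acknowledgements are late, the session is treated as closed and
    every MessageId still in the unacknowledged list is queued for resend.
    """
    if unacked_ids and ms_since_last_ack >= ack_timeout_ms:
        return False, list(unacked_ids)
    return True, []
```

Note that the whole unacknowledged list is resent, not just the messages that went missing, because MachineA cannot tell whether it was a message or an acknowledgement that was lost.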
Q. Are there other acknowledgements from MachineB I should worry about?
In addition to Session Acknowledgements, MachineB will also send Order and Store Acknowledgments. The former is to keep transactional messages in sequence and the latter is to confirm when persistent (i.e. recoverable and transactional) messages have been written to disk. The registry values that affect these are respectively:
SeqMaxAckDelay - Defines the length of delay in sending acknowledgements for transacted messages.
SeqResend13Time - Specifies how often outgoing, transacted messages are resent because they are unacknowledged. The value of this entry specifies the interval between resends for the first three times the message is resent.
SeqResend46Time - As above but the value of this entry specifies the interval between resends for the fourth, fifth, and sixth times the message is resent.
SeqResend79Time - As above but the value of this entry specifies the interval between resends for the seventh, eighth, and ninth times the message is resent.
SeqResend10Time - As above but the value of this entry specifies the interval between resends for the tenth and subsequent resend attempts.
StoreAckTimeout - Specifies how long Message Queuing waits for internal message acknowledgement of persistent messages on a network session before deciding that the session should be closed.
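The way the four SeqResend entries divide up the resend attempts can be summarised in a few lines of Python. This is purely an illustration of the descriptions above (the function name is made up), and it deliberately returns the registry value name and not an interval, since the actual intervals depend on how each machine is configured.

```python
def resend_interval_entry(resend_attempt):
    """Return the registry value that governs the given resend attempt.

    resend_attempt is 1 for the first resend, 2 for the second, and so on.
    """
    if resend_attempt < 1:
        raise ValueError("resend_attempt must be 1 or greater")
    if resend_attempt <= 3:
        return "SeqResend13Time"   # first to third resend
    if resend_attempt <= 6:
        return "SeqResend46Time"   # fourth to sixth resend
    if resend_attempt <= 9:
        return "SeqResend79Time"   # seventh to ninth resend
    return "SeqResend10Time"       # tenth and all later resends
```

For example, resend_interval_entry(4) gives "SeqResend46Time", and every attempt from the tenth onwards falls under "SeqResend10Time".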
Notes
- a broken session is not reconnected immediately. MSMQ will wait a number of seconds, depending on the operating system and the WaitTime registry value.
- the Nagle algorithm may add short delays to the arrival of acknowledgements (241777 FIX: MSMQ Delays in Sending Messages)
Comments
Anonymous
February 18, 2008
Hi John, Excellent article! Much better than the "official" documentation... BTW a "packet" and a "message" are not synonyms. Each MSMQ message consists of several packets (typically three for a user message, two for an MSMQ internal message like an acknowledgement). You can see that by using a network sniffer, like Microsoft Network Monitor. I think the transactional resend times discussion is very important. In most cases, the defaults selected for the longer times (especially the 6-hour SeqResend10Time) are way too long, and may cause the outgoing queues to halt for a very long time following a network failure. Besides changing the registry values, these problems can be avoided by abandoning transactional messages and using recoverable messages (most implementations don't really need transactional messages anyway). Cheers, Yoel
Anonymous
February 19, 2008
You are, of course, correct - my use of "packet" and "message" interchangeably was sloppy.