

Windows Azure Guidance – The “Get”, “Delete” pattern for reading messages from queues

Fabio asked me on Twitter: “why there’re no dequeue, peek and enqueue on Windows Azure Queues?”

One of the most common patterns for interactions with queues is this:

 


  1. You get the message from the queue. This is not a “dequeue”, even though it looks like one. It is more of a “peek & hide”: the message is retrieved and made invisible to other consumers.
  2. The worker (or whatever got the message from the queue) does something useful with it.
  3. When work is complete, the message is explicitly deleted.

If something goes wrong with the worker, then after some (configurable) time the message becomes visible again and another worker can pick it up. Remember: anything can fail anytime!
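The three steps and the failure case above can be sketched with a toy, in-memory queue. This is only an illustration of the pattern, not the Windows Azure Queues API: the class `InMemoryQueue`, its method names, and the explicit `now` clock argument are all assumptions made to keep the example small and deterministic.

```python
import itertools


class InMemoryQueue:
    """Toy queue with a 'get' (peek & hide) and an explicit 'delete'.

    Hypothetical illustration only; not the Windows Azure Queues API.
    """

    def __init__(self):
        self._messages = {}            # id -> message record
        self._ids = itertools.count(1)

    def put(self, body):
        msg_id = next(self._ids)
        self._messages[msg_id] = {"body": body, "visible_at": 0, "dequeue_count": 0}
        return msg_id

    def get(self, now, visibility_timeout):
        """Return (id, body) of the first visible message and hide it, else None."""
        for msg_id, msg in self._messages.items():
            if msg["visible_at"] <= now:
                msg["visible_at"] = now + visibility_timeout  # hide, do not remove
                msg["dequeue_count"] += 1
                return msg_id, msg["body"]
        return None

    def delete(self, msg_id):
        """Remove the message for good once the work is complete."""
        self._messages.pop(msg_id, None)


queue = InMemoryQueue()
queue.put("resize image 42")

# Worker A gets the message at t=0, then crashes before deleting it.
first = queue.get(now=0, visibility_timeout=30)

# While the message is hidden, nobody else can see it.
hidden = queue.get(now=10, visibility_timeout=30)

# After the timeout it is visible again; worker B gets it, finishes, deletes it.
second = queue.get(now=31, visibility_timeout=30)
queue.delete(second[0])

remaining = queue.get(now=100, visibility_timeout=30)
```

In this sketch the message survives worker A's crash because `get` only hides it; it disappears only when worker B calls `delete`.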

 


 

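If you had a “dequeue” method instead (dequeue = peek + delete), there would be a non-zero chance of losing your message: if the worker crashed after the dequeue but before finishing its work, the message would already be gone from the queue.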
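A minimal sketch of that failure, using Python's `collections.deque` as a stand-in for a queue with a destructive dequeue. The worker function `flaky_worker` and the message body are made up for the example.

```python
from collections import deque


def flaky_worker(body):
    # Hypothetical worker that always crashes mid-processing.
    raise RuntimeError("worker crashed while processing " + body)


queue = deque(["order-1"])

message = queue.popleft()      # destructive dequeue: the message leaves the queue now
try:
    flaky_worker(message)
except RuntimeError:
    pass                       # the worker is gone, and so is the message

message_lost = len(queue) == 0
```

Nothing is left in the queue for another worker to retry, which is exactly what the separate get and delete calls avoid.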

 

Things to consider:

1- Your message could be processed more than once:

a- If the “DoSomething” method takes longer than the time the message is invisible.

b- If your worker crashes just before you delete the message.

2- You must develop your system to handle duplicates.

3- There’s a chance that the processing failure is actually caused by a problem with the message itself. This is called a poison message. There’s a special property you can use (dequeuecount) to do something about it. For example, you can set aside messages that have been dequeued beyond a certain threshold:

if( dequeuecount > MAX_DEQUEUES )
      MoveMessageToDeadLetterQueue( message );
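Points 2 and 3 can be combined in one small decision function. The sketch below is an assumption-laden illustration, not a real queue client: the message is a plain dict with hypothetical `id`, `body` and `dequeue_count` keys, the threshold of 3 is arbitrary, and the in-memory `processed_ids` set stands in for whatever durable store a real system would need.

```python
MAX_DEQUEUES = 3  # hypothetical threshold; tune for your workload


def handle_delivery(message, handler, processed_ids, dead_letters):
    """Decide what to do with one delivered message.

    Returns True when the caller should delete the message from the queue,
    False when it should be left alone to reappear after the visibility timeout.
    """
    if message["dequeue_count"] > MAX_DEQUEUES:
        dead_letters.append(message)   # poison message: park it for inspection
        return True
    if message["id"] in processed_ids:
        return True                    # duplicate: the work was already done
    try:
        handler(message["body"])
    except Exception:
        return False                   # failed: let it become visible again
    processed_ids.add(message["id"])
    return True


def handler(body):
    # Hypothetical handler that cannot cope with one particular message.
    if body == "bad":
        raise ValueError("cannot parse message")


processed, dead = set(), []
ok = handle_delivery({"id": "m1", "body": "good", "dequeue_count": 1}, handler, processed, dead)
dup = handle_delivery({"id": "m1", "body": "good", "dequeue_count": 2}, handler, processed, dead)
retry = handle_delivery({"id": "m2", "body": "bad", "dequeue_count": 1}, handler, processed, dead)
poison = handle_delivery({"id": "m2", "body": "bad", "dequeue_count": 4}, handler, processed, dead)
```

Note that recording the id after the handler runs still leaves a small window: a crash between the two repeats the work on the next delivery, so the handler itself should be safe to run twice where that matters.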

 

Fabio, is the floor steady again? :-)

Comments

  • Anonymous
    May 11, 2010
    I did a full write-up of the various techniques that facilitate de-duplication of messages.  It's found on my blog for anyone that needs some help determining how to handle this.

  • Anonymous
    May 11, 2010
    Cool. Thanks for sharing Jonathan!

  • Anonymous
    May 12, 2010
    Perhaps the point "2- You must develop your system to handle duplicates." is where P&P may publish some proposals. What I prefer is: dequeue and wait for a response. The "delete" message can be substituted with an "I'm done", leaving the responsibility... well, it's a bit too large to discuss here... but with a different behavior we can:

  • avoid worrying about 1.a
  • if the message processor crashes, it will never send the "I'm done" and the message can be re-queued in the correct queue
  • duplicates: if a message is in the "waiting for process response or processed" queue, then the message was sent to somebody. ... maybe ...
  • Anonymous
    May 12, 2010
    "is where P&P may publish some proposals." Yes, that's exactly what our intention is. I see what you are saying, but I don't think it solves the issue of processing duplicates. It does solve the "duplicate processing due to timeout". Let me rephrase: 1- Get message: the message is retrieved from the queue and moved to a special state ("waiting for DONE"). 2- When done, you call "DONE". 3- If the system crashes in the meantime, you can go and inspect messages in the "waiting for DONE" state and re-process them, etc. The problem is that you still have to guard against duplicate processing. The worker might have crashed just before calling "DONE". So, you still have to code your system for dealing with dups. Makes sense?

  • Anonymous
    May 12, 2010
    aaaaaaaaaaahhh that is another matter. For me there is a difference between "two processings of the same message in the original queue" and "the processor has processed the message but never sent 'DONE'". We should talk... it is too long to explain here.

  • Anonymous
    May 13, 2010
    ...and the "dequeuecount" property is there precisely to know if someone has ever dequeued that particular message at least once. Hence my "dequeuecount is your friend"