Handling Email Storms in Exchange 2007

I was recently helping a customer remove millions of messages from their Exchange 2007 infrastructure after an email storm was relayed into it. The storm messages were relayed into Exchange by a set of trusted systems that misbehaved, and before the customer could react (this was a weekend) the Exchange 2007 transport infrastructure was flooded with them. Because of their transport design, multiple regional HUB Transport servers were accepting messages from these trusted systems, which spread the impact to different regions. The recipient mailboxes went over their limits in no time, so the queues also filled up with NDRs, and to make things worse message journaling was enabled. All I can say is ouch, ouch, ouch… This is one of those scenarios you wish no Exchange administrator ever has to go through.

Today's post goes through the steps to remove these storm messages from the Exchange 2007 transport queues. I assume you have already identified the message causing the storm; this is easy to do by viewing the messages in the queues of an affected Exchange 2007 transport server with Queue Viewer.

 

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

Prevent Storm Emails from getting into Exchange

Once you have identified a detail of the storm message, such as its subject line, the first thing to do is stop these messages from getting into Exchange. This is done by creating a new transport rule. Refer to the TechNet article on how to create a new transport rule: https://technet.microsoft.com/en-us/library/bb125138(EXCHG.80).aspx
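As a rough illustration, a rule of this kind could be created from the Exchange Management Shell along the following lines. This is only a sketch: the rule name and the subject text are placeholders, and silently deleting matching messages is an assumption about what you want to happen to the storm traffic. Check the TechNet article above for the exact syntax on your build.

```powershell
# Sketch only: block the storm message by subject on an Exchange 2007 HUB Transport server.
# The subject text and the rule name are placeholders, not values from the incident.
$condition = Get-TransportRulePredicate SubjectContains
$condition.Words = @("Storm subject text")

# Assumption: silently dropping matching messages is acceptable in your environment.
$action = Get-TransportRuleAction DeleteMessage

New-TransportRule -Name "Block mail storm" -Conditions @($condition) -Actions @($action)

# Review the rule after creating it.
Get-TransportRule "Block mail storm" | Format-List
```

Remember to remove or disable the rule once the misbehaving source systems are fixed.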

Identify the affected Exchange 2007 HUB Transport servers that have large Queues.

I will assume the Exchange 2007 transport servers have resource monitoring enabled; we recommend that you keep resource monitoring enabled. The most common event is 15004 in the Application log on the HUB Transport server, which in this scenario relates to version buckets (the list of changes made to the message queue database that is kept in memory until those changes can be committed to a transaction log). Once the server starts logging these events, certain SMTP functions are temporarily disabled, depending on the resource utilization level.

Event log entry for an increase in any resource utilization level
Event Type: Error
Event Source: MSExchangeTransport
Event Category: Resource Manager
Event ID: 15004
Description: Resource pressure increased from Previous Utilization Level to Current Utilization Level.

Refer to TechNet on BackPressure Details : https://technet.microsoft.com/en-us/library/bb201658(EXCHG.80).aspx
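To check a server from the shell instead of Event Viewer and Queue Viewer, something like the following can be used. It is a sketch under assumptions: "HUB01" is a placeholder server name, and the number of log entries scanned is arbitrary.

```powershell
# List recent back pressure events (Event ID 15004) from the local Application log.
Get-EventLog -LogName Application -Newest 2000 |
    Where-Object { $_.Source -eq "MSExchangeTransport" -and $_.EventID -eq 15004 } |
    Format-Table TimeGenerated, EntryType, Message -Wrap

# Show the queues on a HUB Transport server, largest first.
# "HUB01" is a placeholder; replace it with one of your servers.
Get-Queue -Server HUB01 |
    Sort-Object MessageCount -Descending |
    Format-Table Identity, DeliveryType, Status, MessageCount -AutoSize
```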

 

Pause ‘Microsoft Exchange Transport’ Service

Once you have identified all the affected Exchange 2007 transport servers, log on to them and quickly PAUSE the ‘MSExchangeTransport’ service.
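If you prefer the shell to the Services console, pausing can be done with the standard PowerShell service cmdlets. This is a minimal sketch, to be run locally on each affected server.

```powershell
# Pause (not stop) the transport service on this server.
Suspend-Service -Name MSExchangeTransport

# Confirm the result; the Status column should read "Paused".
Get-Service -Name MSExchangeTransport
```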

 

Disable Resource Monitoring in EdgeTransport Config file

Next, disable resource monitoring on the affected HUB Transport servers. The reason is to make sure that, after the cleanup, the servers can process all the remaining mail and relay it out without being held back by back pressure. Identify the Exchange 2007 install directory, browse to its bin folder (e.g. d:\Exchsrvr\bin), and open the EdgeTransport.exe.Config file.

Identify the line with the following text: <add key="EnableResourceMonitoring" value="true" />

Change the value from true to false, so that the line reads <add key="EnableResourceMonitoring" value="false" />, and save the file.
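Before and after editing the file, a quick safety check from the shell can help. The sketch below assumes the example install path used above, and the backup file name is a hypothetical choice.

```powershell
# Assumed path, taken from the example in the text; adjust to your install directory.
$config = "d:\Exchsrvr\bin\EdgeTransport.exe.Config"

# Before editing: keep a backup copy (hypothetical file name).
Copy-Item -Path $config -Destination "$config.bak"

# After editing: show the setting as it is now stored in the file.
Select-String -Path $config -Pattern "EnableResourceMonitoring"
```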

Restart the MSExchangeTransport service and pause it again immediately.

Confirm that the Submission queue and the active remote delivery queues are processing messages by monitoring their counters in Perfmon.
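The restart-then-pause step can also be done in one go from the shell, with a queue snapshot as a rough complement to the Perfmon counters. This is a sketch, not a replacement for watching the counters over time.

```powershell
# Restart so the changed config is read, then pause again straight away.
Restart-Service -Name MSExchangeTransport
Suspend-Service -Name MSExchangeTransport

# Snapshot of the queues on this server; run it a few times to see whether counts move.
Get-Queue | Format-Table Identity, Status, MessageCount -AutoSize
```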

 

Procedure to cleanup Queues on Transport server

These are the steps to follow to clean all the queues (Submission and the local/remote delivery queues) that are filled with messages. I take 80,000 messages as a batch when performing the queue cleanup.

  • Confirm the MSExchangeTransport service is in the PAUSED state and the EdgeTransport.exe.Config file is updated.
  • Open EMS and execute
    • 'Get-Queue | Suspend-Queue' #(THIS WILL SUSPEND ALL QUEUES ON THIS SERVER)
  • In EMC, go to the transport server and resume the 'Submission' queue, and allow about 80,000 messages to flow and build up in the delivery queues (for example, queue ID "Server1\123456"). Once "Server1\123456" reaches 80,000 messages, suspend the Submission queue again.
  • COUNT THE GENUINE MESSAGES IN THE QUEUE
    • Open EMS and execute
    • 'Get-Message -Queue Server1\123456 -ResultSize Unlimited | where {$_.Subject -notlike "*undeliverable*"} | Measure-Object'
  • SUSPEND ALL GENUINE MESSAGES IN THIS PARTICULAR QUEUE (the filter is -notlike, so this acts on the genuine messages; a message must be suspended before Export-Message can export it in the next step)
    • Open EMS and execute
    • 'Get-Message -Queue Server1\123456 -ResultSize Unlimited | where {$_.Subject -notlike "*undeliverable*"} | Suspend-Message'
  • EXPORT ALL GENUINE MESSAGES (creates EML-format messages in case you need to replay them)
    • Open EMS and execute
    • 'Get-Message -Queue Server1\123456 -ResultSize Unlimited | where {$_.Subject -notlike "*undeliverable*"} | Export-Message -Path e:\Export1'
  • DELETE ALL UNDELIVERABLE MESSAGES FROM THE QUEUE
    • Open EMS and execute
    • 'Get-Message -Queue Server1\123456 -ResultSize Unlimited | where {$_.Subject -like "*undeliverable*"} | Remove-Message -WithNDR $false'
  • Open EMS and execute
    • 'Get-Queue' to ensure the undeliverable messages are deleted.

 

{Repeat the above steps until all messages are drained from the Submission queue. Every time you export messages, specify a new export folder, such as Export1, Export2, Export3, Export4.}

Perform the above cleanup steps on all affected servers.
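One pass of the cleanup above can be collected into a short script so that the queue name and export folder only need to be changed in one place per batch. This is a sketch under assumptions: the variable names are illustrative, "Server1\123456" and e:\Export1 are the example values from the steps, and the final Resume-Message line is not part of the original procedure (it is left commented out on the assumption that the suspended genuine messages have to be resumed before they can be delivered).

```powershell
# One cleanup pass for a single delivery queue. Change these per batch.
$queue     = "Server1\123456"    # example queue identity from the steps above
$exportDir = "e:\Export1"        # use a new folder for every batch
$pattern   = "*undeliverable*"   # subject text that marks the NDR messages

# 1. Count the genuine (non-NDR) messages in the queue.
Get-Message -Queue $queue -ResultSize Unlimited |
    Where-Object { $_.Subject -notlike $pattern } | Measure-Object

# 2. Suspend the genuine messages so that they can be exported.
Get-Message -Queue $queue -ResultSize Unlimited |
    Where-Object { $_.Subject -notlike $pattern } | Suspend-Message

# 3. Export the genuine messages in case they need to be replayed.
Get-Message -Queue $queue -ResultSize Unlimited |
    Where-Object { $_.Subject -notlike $pattern } | Export-Message -Path $exportDir

# 4. Delete the undeliverable messages without generating further NDRs.
Get-Message -Queue $queue -ResultSize Unlimited |
    Where-Object { $_.Subject -like $pattern } | Remove-Message -WithNDR $false

# 5. Check the remaining message counts.
Get-Queue

# Assumption, not in the original steps: resume the genuine messages afterwards.
# Get-Message -Queue $queue -ResultSize Unlimited |
#     Where-Object { $_.Subject -notlike $pattern } | Resume-Message
```

Suspend-Message and Remove-Message prompt for confirmation by default; the sketch leaves those prompts in place as a safety net.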

 

Enable Resource Monitoring in EdgeTransport Config file

Now that you have cleaned up the queues on all the affected servers, it is time to re-enable resource monitoring on all the transport servers where you had disabled it. Browse to the bin folder of the Exchange install directory (e.g. d:\Exchsrvr\bin) and open the EdgeTransport.exe.Config file.

Identify the line with the following text: <add key="EnableResourceMonitoring" value="false" />

Change the value from false to true.

Save the file, restart the MSExchangeTransport service, and confirm the service is running.
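A short check from the shell can confirm that each server is back to normal. It is a sketch that again assumes the example install path used earlier.

```powershell
# Assumed path, taken from the example in the text; adjust to your install directory.
$config = "d:\Exchsrvr\bin\EdgeTransport.exe.Config"

# The matching line should now show value="true".
Select-String -Path $config -Pattern "EnableResourceMonitoring"

# Restart the service so the setting takes effect, then confirm it is "Running".
Restart-Service -Name MSExchangeTransport
Get-Service -Name MSExchangeTransport
```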

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

 

I hope this helps. I will follow up with an Exchange 2010 procedure.

Comments

  • Anonymous
    November 14, 2011
    Must have. This is a great set of instruction on how to handle large mail storm

  • Anonymous
    November 14, 2011
    Thanks for this very clear step-by-step process for dealing with email storms. Definitely very informative and a must read for any exchange administrator who does day-to-day support who will at some point in his career will have to deal with email storms. I should bookmark this page for quick access.

  • Anonymous
    November 14, 2011
    This is very good detail set of instruction to handle the mail storm issue.

  • Anonymous
    November 13, 2013
    Thanks for sharing the steps in dealing with this situation should we encounter it. Do you have similar set of steps for Exchange 2010 as well? Can you please share them?