Too Many Address Types Can Cause Cached Mode to Not Work

My largest customer ran across this one. Here is the scenario:

  • Exchange 2003 in mixed mode, with mostly Exchange 5.5 servers.
  • New profiles created against a mailbox on an Exchange 2003 server can't send mail in cached mode.
  • New profiles created against a mailbox on an Exchange 5.5 server do work in cached mode.

It turns out that the clients were generating an "instant NDR" because a registry value was missing. The customer compared the two profiles stored in the registry at HKEY_CURRENT_USER\Software\Microsoft\Windows Messaging Subsystem\Profiles\<Profile name>\13dbb0c8aa05101a9bb000aa002fc45a, and we found that the main value missing from the broken profile was "01026687". This value holds the routing information. It is built from the GWART in Exchange 5.5 and pulled from the AD in Exchange 2003. For some reason, when you have a TON of address types, either the server doesn't send the data or the client doesn't accept it. To be honest, I really don't know which it is. I was told that it had to do with a 32K limit, but that was never proven to me. What I do know is that this value only seems to get created at profile generation time and is not modified later. So if you don't get it the first time you create the profile, you are pretty much in bad shape. We determined this with Regmon.
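For anyone who wants to repeat the registry comparison described above without eyeballing two profiles side by side, here is a minimal sketch in Python. It is not what we ran at the customer (they compared the profiles by hand). The function names `missing_values` and `read_value_names` are made up for illustration, and the Windows-only part assumes the profile path and section name quoted in this post.

```python
# Sketch only: compare the value names of a working and a broken MAPI profile.
# The section name and the routing value name come from the post; the
# function names below are hypothetical helpers, not part of any product.

SECTION = "13dbb0c8aa05101a9bb000aa002fc45a"
ROUTING_VALUE = "01026687"
PROFILES_KEY = r"Software\Microsoft\Windows Messaging Subsystem\Profiles"


def missing_values(working, broken):
    """Return value names found in the working profile but not in the broken one."""
    return sorted(set(working) - set(broken))


def read_value_names(profile_name):
    """List the value names under a profile's section (Windows only)."""
    import winreg  # imported lazily so the comparison above runs on any OS

    path = "\\".join([PROFILES_KEY, profile_name, SECTION])
    names = []
    with winreg.OpenKey(winreg.HKEY_CURRENT_USER, path) as key:
        value_count = winreg.QueryInfoKey(key)[1]  # second item is the value count
        for index in range(value_count):
            names.append(winreg.EnumValue(key, index)[0])  # first item is the name
    return names
```

On a Windows client you would call `read_value_names` once for the profile that works and once for the one that doesn't, pass both lists to `missing_values`, and check whether `ROUTING_VALUE` shows up in the result. The profile names themselves are whatever you called them when you created the profiles.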

How many address types were in this customer's scenario? Using WinRoute, we determined that there were about 2400. Five different connectors each had 400 X400 address types on them, and this was really not necessary. All they needed was one per connector. The rest were left over from their old X400 cloud days. We got rid of 2000 of these addresses and it all started working. Before we did this, the GWART0.mta was almost 1MB!

Now the question of the hour... Should this be fixed? Well, if you were the software company that wrote this and had a bunch of other issues that needed to be prioritized, where would this fall? In my scenario, the customer was more than happy with the workaround. It is sort of like creating over 1000 empty Routing Groups for no reason and then complaining to Microsoft that you hit the limit. Should they fix it? No. How many people are going to hit that limit? Probably not that many. How many people will be affected if you make the fix and it messes up something else? Probably everyone. So the decision is not to make the change, because the workaround is sufficient for the majority of users. If our largest customers hit this limit all the time and there was no workaround, then we would definitely make the change.

Sound fair?