Document Normalization

Okay, I have seen this copied on two different blogs, so it is apparently useful enough that others might be interested. I originally posted this on a public discussion alias in response to a question about where mappings should be done.

“While there are actually some performance-related reasons to put your maps in the receive and send ports, there are much better business reasons for doing it outside of your schedule. We tend to refer to mapping in receive and send ports as document normalization. In the case of receive ports, you are normalizing the documents from the format of your customers into an internal standard format. On the outbound side, you are converting out of your normalized format and into the specific format of your trading partner or internal application. If you embed the map in the schedule and the partner changes the format, not only do you have to rebuild the map, you also have to rebuild the schedule to use the new version of the map. Also, what happens when you add a new partner with a new format? That is a new map, and if you have embedded the map in a schedule, it means a new schedule. This is exactly why we added support for multiple maps (one per source message type) on the receive port, so that you could create a single location for all of your partners and easily normalize their documents into your internal standard formats. Putting these types of maps in schedules would be a bad idea. There are times when it makes sense to use a map in a schedule: when you need to generate a new message in the schedule and use the modified (mapped) contents of an existing message as the base, or when you want to map multiple parts of a message into one outbound message (this type of mapping cannot be done in a receive / send port). There are sometimes performance gains from doing mappings in receive ports, but they are mostly about how many persisted messages your scenario generates, and that is a bit complicated to explain. The actual mapping technology is the same. To keep your internal business logic from getting tightly coupled with the document formats of your trading partners, you should do your document normalization (mapping) in the send and receive ports.”
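To make the pattern a little more concrete, here is a minimal sketch in Python rather than in BizTalk itself. Everything in it is hypothetical and only for illustration: the `InternalOrder` type, the two partner formats, the `RECEIVE_MAPS` table, and the `receive` and `approve_order` functions are assumptions, not BizTalk APIs. It only shows the shape of the idea: one map per source message type at the edge, and business logic that sees nothing but the internal format.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Callable, Dict


@dataclass
class InternalOrder:
    """Hypothetical internal standard format used by all business logic."""
    order_id: str
    customer: str
    total_cents: int


def map_partner_a(doc: dict) -> InternalOrder:
    # Assumed Partner A format: flat fields, total as a decimal string.
    return InternalOrder(
        order_id=doc["OrderNo"],
        customer=doc["Cust"],
        total_cents=int(Decimal(doc["Total"]) * 100),
    )


def map_partner_b(doc: dict) -> InternalOrder:
    # Assumed Partner B format: nested buyer, total already in cents.
    return InternalOrder(
        order_id=str(doc["id"]),
        customer=doc["buyer"]["name"],
        total_cents=doc["amount_cents"],
    )


# One map per source message type, all registered in a single place.
# Adding a partner means adding an entry here, not touching the logic below.
RECEIVE_MAPS: Dict[str, Callable[[dict], InternalOrder]] = {
    "PartnerA.Order": map_partner_a,
    "PartnerB.Order": map_partner_b,
}


def receive(message_type: str, doc: dict) -> InternalOrder:
    """Normalize an inbound document before any business logic runs."""
    try:
        mapper = RECEIVE_MAPS[message_type]
    except KeyError:
        raise ValueError(f"no map registered for {message_type!r}") from None
    return mapper(doc)


def approve_order(order: InternalOrder) -> bool:
    """Business logic: knows only InternalOrder, never a partner format."""
    return order.total_cents <= 100_000
```

If Partner A changes its format, only `map_partner_a` changes; `approve_order` is untouched. That is the decoupling the quote argues for.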

The key takeaway from this post is that it is important not to tie your business logic to the format of one trading partner. Performance aside (for those of you who attended my perf talk at Tech-Ed, yes, there are some perf benefits to doing this in the ports), the goal of this design is to make your system more robust and able to change as your business grows and adds new partners, and also to let you react more easily to changes in your partners' data formats. Brandon Gross comments (in Jeff Lynch's blog) that there are times when the "normalization" is quite complex and it is easier to model this in an orchestration than with our support in the mapper. It is true that there are cases when you simply have no choice but to do the mapping / data conversions in an orchestration, and in those cases, that is what you do. But in general, the best practice I am pushing here is to decouple your business logic from your partners' data formats, which gives you a more robust system.

Hope this helps

Lee
