BizTalk Server will split up your documents for you.

Lots of people have asked me how to split up a document. You don't have to write code for this; it is a built-in feature of the parsers in BizTalk Server. For example, check out this sample .txt document. It has a header (Northwind Shipping), multiple purchase orders (PO1999-10-20 and PO1999-10-21), and a trailer (1234567).

“Northwind Shipping
PO1999-10-20
US        Alice Smith         123 Maple Street    Mill Valley    CA 90952
US        Robert Smith        8 Oak Avenue        Old Town       PA 95819
Hurry, my lawn is going wild!
ITEMS,ITEM872-AA|Lawnmower|1|148.95|Confirm this is electric,ITEM926-AA|Baby Monitor|1|39.98|Confirm this is electric|1999-10-21
PO1999-10-21
US        John Dow            123 Elm Street      Mill Valley    CA 90952
US        July Dow            8 Pine Avenue       Old Town       PA 95819
Please ship it urgent!
ITEMS,ITEM398-BB|Tire|4|324.99|Wrap them up nicely,ITEM201-BB|Engine Oil|1|12.99|SAE10W30|1999-05-22

1234567”

In the XML disassembler you can specify a schema for the header (Northwind Shipping), the envelope (the repeating unit), the document (specification), and the trailer (1234567).
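To make the resulting split concrete, here is a minimal Python sketch of what the header/document/trailer breakdown of the sample looks like. It is only an illustration, not how BizTalk Server does it (the disassembler is driven by schemas, not code). The function name `split_interchange` is hypothetical, and the rules it uses are assumptions read off the sample: the first non-blank line is the header, the last non-blank line is the trailer, and every line starting with "PO" begins a new purchase order.

```python
SAMPLE = """Northwind Shipping
PO1999-10-20
US        Alice Smith         123 Maple Street    Mill Valley    CA 90952
US        Robert Smith        8 Oak Avenue        Old Town       PA 95819
Hurry, my lawn is going wild!
ITEMS,ITEM872-AA|Lawnmower|1|148.95|Confirm this is electric,ITEM926-AA|Baby Monitor|1|39.98|Confirm this is electric|1999-10-21
PO1999-10-21
US        John Dow            123 Elm Street      Mill Valley    CA 90952
US        July Dow            8 Pine Avenue       Old Town       PA 95819
Please ship it urgent!
ITEMS,ITEM398-BB|Tire|4|324.99|Wrap them up nicely,ITEM201-BB|Engine Oil|1|12.99|SAE10W30|1999-05-22

1234567
"""


def split_interchange(text, doc_prefix="PO"):
    """Split a flat-file interchange into (header, documents, trailer).

    Assumptions (taken from the sample, not from BizTalk itself):
    the first non-blank line is the header, the last non-blank line is
    the trailer, and a line starting with doc_prefix opens a new document.
    """
    lines = [line for line in text.splitlines() if line.strip()]
    if len(lines) < 2:
        raise ValueError("expected at least a header and a trailer line")

    header, trailer = lines[0], lines[-1]
    documents = []
    for line in lines[1:-1]:
        if line.startswith(doc_prefix):
            documents.append([line])      # start a new purchase order
        elif documents:
            documents[-1].append(line)    # continue the current one
        else:
            raise ValueError("body line found before the first document")
    return header, documents, trailer


header, documents, trailer = split_interchange(SAMPLE)
print(header)
for doc in documents:
    print(doc[0], "with", len(doc), "lines")
print(trailer)
```

Each entry in `documents` corresponds to one message that would be published separately after disassembly.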

For more details check out the SDK sample in C:\Program Files\Microsoft BizTalk Server 2004\SDK\Samples\Pipelines\AssemblerDisassembler\EnvelopeProcessing.

Comments

  • Anonymous
    February 06, 2004
    Let's see:

    Write my own code, pay several thousand dollars for biztalk...

    hmmm...

    That's a tough call, especially for something as difficult as splitting up documents.

    Well, I never was that good at sarcasm in writing, but you get my point =)
  • Anonymous
    February 07, 2004
    Scott, technically your example is using the Flat File disassembler, not the XML disassembler.

    I still have not found a way to split XML records in a larger XML file without a custom disassembler. Any examples?
  • Anonymous
    February 07, 2004
    Unfortunately sometimes you need the details in the header in each split message. This is a particular challenge when splitting documents. Say for example:

    <envelope><receiveddate>1/1/2004</receiveddate><message><....></....></message><message><....></....></message></envelope>

    So, with the built-in splitter I believe you can automatically have each message wind up in BizTalk without a bunch of work, but if you need the received date within each message, I have yet to figure that one out without first mapping to a new structure and then splitting. This is a current challenge we face.

    Scott, is there a pattern for this?

    Shawn

  • Anonymous
    February 20, 2004
    Sorry Scott, let me answer on how to split XML interchanges;-)

    Yes, XML disassembler is able to disassemble interchanges of XML messages, and unwrap one or more envelopes, e.g. if you have an XML interchange:

    <ns0:envelope xmlns:ns0="ns0">
    <header>
    <firstName>Andrei</firstName>
    <lastName>Maksimenka</lastName>
    </header>
    <documents>
    <ns1:document xmlns:ns1="ns1">
    <!-- some stuff-->
    </ns1:document>
    <ns1:document xmlns:ns1="ns1">
    <!-- some stuff-->
    </ns1:document>
    <ns1:document xmlns:ns1="ns1">
    <!-- some stuff-->
    </ns1:document>
    </documents>
    </ns0:envelope>

    You need to define two schemas, one for envelope and one for document. In schema for envelope which schematically looks like:

    <schema>
    envelope
    header
    firstName
    lastName
    documents
    <any>

    you need to set property Envelope to 'yes' and specify Body XPath pointing to the XML element "documents".

    Schema for document should schematically look like:

    <schema>
    document
    ...

    You can use the default XML receive pipeline to disassemble that interchange, or a custom receive pipeline where Envelope Schema Names and Document Schema Names have the envelope and document schemas specified. The latter can be helpful if you want to work around schema ambiguity problems (when more than one schema with the same message type or targetNamespace#rootRecordName is deployed), to enforce document structure (all incoming documents must contain the envelope and document as specified), or to use XML validation during disassembling (set Validate Document Structure to 'yes').
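As a rough illustration of the effect described above, the following Python sketch unwraps an envelope and emits each child of the body element as its own message. It only mimics the outcome. In BizTalk the split is configured through the Envelope property and Body XPath on the schema, not written as code. The function name `split_envelope` and its `body_path` parameter are hypothetical, and the sample interchange is the one from this comment.

```python
import xml.etree.ElementTree as ET

INTERCHANGE = """<ns0:envelope xmlns:ns0="ns0">
  <header>
    <firstName>Andrei</firstName>
    <lastName>Maksimenka</lastName>
  </header>
  <documents>
    <ns1:document xmlns:ns1="ns1"><!-- some stuff--></ns1:document>
    <ns1:document xmlns:ns1="ns1"><!-- some stuff--></ns1:document>
    <ns1:document xmlns:ns1="ns1"><!-- some stuff--></ns1:document>
  </documents>
</ns0:envelope>"""


def split_envelope(xml_text, body_path="documents"):
    """Return each child of the body element as a standalone XML string.

    body_path plays the role of the Body XPath: it points at the element
    whose children are the individual documents (an assumption made for
    this sketch; it is an ElementTree path, not a BizTalk setting).
    """
    root = ET.fromstring(xml_text)
    body = root.find(body_path)
    if body is None:
        raise ValueError("body element %r not found in envelope" % body_path)
    # tostring() also emits trailing whitespace after the element, so strip it.
    return [ET.tostring(child, encoding="unicode").strip() for child in body]


messages = split_envelope(INTERCHANGE)
print(len(messages), "messages extracted")
for message in messages:
    print(message)
```

The header stays behind in the envelope here. Carrying header values into each message would need an extra step that this sketch does not attempt.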
  • Anonymous
    February 20, 2004
    The comment has been removed
  • Anonymous
    February 20, 2004
    Nice. In case everyone wondered Andrei Maksimenka is one of the key developers on BizTalk Server 2004 and owns this functionality. Thanks for jumping into the discussion Andrei :)
  • Anonymous
    February 22, 2004
    Udi, in response to your comment, see the comment thread at http://weblogs.asp.net/cameronreilly/archive/2004/02/13/72410.aspx
    it's quite similar.
  • Anonymous
    June 22, 2004
    Hi Scott,
    I also need to do this but under BTS 2002 and I can't install SDK 2004 unless I have BTS 2004. Is there another place I can find the source for this?
    Thanks, Dave
  • Anonymous
    July 18, 2004
    I have a similar but I think slightly more complex problem and am hoping for a recommended approach. I need to take an incoming flat file, say:

    SomeHeaderStuff RECORDCOUNT
    KeyValue SOMESTUFF
    KeyValue SOMESTUFF
    KeyValue SOMESTUFF
    SomeTrailerStuff RECORDCOUNT

    For each record in the incoming file I need to read a database table using the value of the KeyValue Field. Each record of the incoming file needs to end up in one of two message, say A and B.

    Message A will be used to update the database. Message B needs to be reassembled into a new flat file with recalculated record counts. This file will be passed along to another system. The record layout of the new flat file will be identical to that of the received file. Of course it will contain fewer records than the original file and will also have the recalculated record counts, as I mentioned above. Any suggestions on where to start with this would be appreciated.