Processing PDFs (or anything else!) in BizTalk Orchestrations

So you're probably aware that you can pass virtually any file (*.exe, *.dll, *.xls) through BizTalk Server's messaging components, but did you also know that you can pass just about anything through an Orchestration process as well?

If you want to pass, say, a Microsoft Word document through the BizTalk Messaging Engine, you'd simply set up a receive location that grabs the file, make sure you use the pass-through pipeline, then create a send port that drops the file back out, also using the pass-through pipeline and a subscription to the receive port. This works because everything passes through BizTalk Server as a byte stream, therefore allowing any binary object to move through unmolested by parsers. This scenario works great if you are moving docs between SharePoint libraries, or yanking off inbound email attachments (BizTalk Server 2006 only) and throwing them to a file share.
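To make the "byte stream" idea concrete, here is a minimal sketch in Python rather than BizTalk itself. It assumes nothing about BizTalk's internals: the function name `pass_through` and the folder arguments are made up for illustration. The point is only that a pass-through hop copies bytes and never parses them, so the file format doesn't matter.

```python
import shutil
from pathlib import Path


def pass_through(source: Path, target_dir: Path) -> Path:
    """Copy a file byte-for-byte without ever inspecting its content.

    Hypothetical stand-in for a receive location plus a send port that
    both use a pass-through pipeline; not an actual BizTalk API.
    """
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / source.name
    # Open both ends in binary mode so no decoding or parsing happens.
    with source.open("rb") as inbound, target.open("wb") as outbound:
        shutil.copyfileobj(inbound, outbound)
    return target
```

Because nothing in the copy looks at the payload, a Word document, a PDF, or an executable all travel through the same way.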

The case I'm demonstrating here is the scenario when you don't want to just pass the non-XML file through the Message Bus, but also apply some business process via Orchestration. The first step was to draw out the Orchestration process and create the necessary Messages and Variables. The goal of my orchestration is to take in a PDF file, look at the context information, and based on the inbound file name, route it to one of three locations. If it's any sort of training manual I move it to one folder, if it's a company report I drop it to another location, and if it's anything else, it rests in one final spot.

So my process looks like this:

You'll notice a few things. First of all, the inbound message is of type System.Xml.XmlDocument. This message type doesn't actually require XML content. Rather, BizTalk Server treats it as a grab bag for any file format. Because a message traveling through an orchestration isn't automatically loaded into the DOM but remains a stream (unless you have Distinguished Fields, which cause selected parts of the data to be loaded into memory), there's no problem with accepting anything into that Message. Try it out, it's neat stuff. Of course, remember that I haven't shown anything that lets you get at the CONTENT of that message: you only have access to the context properties unless you have helper components that can rip open the message.
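As a rough mental model (a toy in Python, not BizTalk's actual types), you can think of the message as an unread body stream plus a dictionary of context properties. The `Message` class and `received_file_name` helper below are hypothetical names; only the property name FILE.ReceivedFileName comes from BizTalk.

```python
import io
from dataclasses import dataclass, field
from typing import BinaryIO, Dict


@dataclass
class Message:
    """Toy model: an opaque body stream plus context properties."""
    body: BinaryIO
    context: Dict[str, str] = field(default_factory=dict)


def received_file_name(message: Message) -> str:
    # Only the context is consulted; the body stream is never read,
    # so it does not matter whether the payload is XML, PDF, or anything else.
    return message.context.get("FILE.ReceivedFileName", "<unknown>")


pdf_message = Message(
    body=io.BytesIO(b"%PDF-1.4 definitely not XML"),
    context={"FILE.ReceivedFileName": r"C:\In\Safety Training Manual.pdf"},
)
name = received_file_name(pdf_message)
```

After the call, `pdf_message.body` is still positioned at byte 0, which is the whole trick: the process makes its decisions from the context alone.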

You may assume that I'm using regular expressions to parse the received file name. You are correct. In each Decide shape, I use the static IsMatch method of the Regex class (System.Text.RegularExpressions) to look for a key phrase in the file name. See below:
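The routing decision itself can be sketched outside BizTalk. The Python below is an illustration under assumptions, not the expressions from my orchestration: the patterns and the folder names OutReports and OutOther are made up, and `re.search` plays the role of Regex.IsMatch (both succeed when the pattern matches anywhere in the input).

```python
import re
from pathlib import PureWindowsPath

# Hypothetical patterns and folder names, chosen only for illustration.
ROUTES = [
    (re.compile(r"training|manual", re.IGNORECASE), "OutTraining"),
    (re.compile(r"report", re.IGNORECASE), "OutReports"),
]
DEFAULT_FOLDER = "OutOther"


def choose_folder(received_file_name: str) -> str:
    """Pick a destination folder from the inbound file name alone."""
    # Match against the file name only, so a folder called "Training"
    # somewhere in the path cannot trigger a false match.
    file_name = PureWindowsPath(received_file_name).name
    for pattern, folder in ROUTES:
        if pattern.search(file_name):
            return folder
    return DEFAULT_FOLDER
```

The order of the list mirrors the order of the branches in the decision: the first pattern that matches wins, and anything that matches nothing falls through to the final location.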

After building all this out, we deploy it. When creating the ports, remember to keep the pipelines as pass-through (unless you write a custom pipeline component that is used to add key data to the message context on the way in). You see my active configuration below:

So there you go. While BizTalk Server is keenly optimized to take advantage of XML formats, we've also enabled you to pass everything but the kitchen sink through the messaging and process engines. All the more reason you can use BizTalk Server as a hub for all the message-based traffic hurtling through your enterprise.

Comments

  • Anonymous
    November 12, 2005
    Is there any size limit when we pass files like .pdf or .gif through BizTalk?

    Can you send your answer to krithiga_srinivasan@yahoo.com

  • Anonymous
    November 14, 2005
    There's a physical size limit imposed by the server. But theoretically, you can pass 1GB+ files through BizTalk Server. However, there's almost no case where that is a good idea. A good rule of thumb (again depending on the server hardware) is to keep messages under 5MB.

  • Anonymous
    January 04, 2006
    I would like to know whether you can create a PDF file from an XML file with BizTalk, for instance, and how to do it. Thanks.

  • Anonymous
    February 19, 2006
    Hi, I tried to implement the idea without success. Is it possible to send me the project so I can compare it with what I have done?
    My email is salam@altern.org
    Thanks in advance

  • Anonymous
    February 21, 2006
    I sorted it out. I was using backslashes instead of forward slashes in the Port_2(Microsoft.XLANGs.BaseTypes.Address) expression. So first, I used

    @"file://C:\Tutorial\ProcessingAnyThingInBTS\OutTraining\training.pdf"

    then I switched to

    @"file://C:/Tutorial/ProcessingAnyThingInBTS/OutTraining/training.pdf"

    It works like a charm
    Again nice idea and well done

  • Anonymous
    February 21, 2006
    Excellent.  Glad you got it. That syntax is so picky!

  • Anonymous
    March 03, 2006
    Great blog!   I got everything working except I'm trying to use an expression to change the FILE.ReceivedFileName property.  

    When I try to say:

      FILE.ReceivedFileName = "test.txt";

    I get the error "Cannot implicitly convert type 'System.String' to 'FILE.ReceivedFileName'". I'm wondering if this is because the message is type XmlDocument and not XLANGMessage.

    Any suggestions?

  • Anonymous
    March 03, 2006
    You actually can't change that property.  You'd have to use a dynamic port to set the outbound file name.

  • Anonymous
    March 20, 2006
    Any chance I can get the project file so I can learn from it. I am a beginner BizTalk developer. Thanks

  • Anonymous
    March 21, 2006
    Hey Edgar,
    I accidentally paved over the project when rebuilding a virtual machine, but the steps I outlined here are fairly easy to reproduce.  Try setting it up yourself, and post any questions you have.

  • Anonymous
    April 19, 2006
    Hello Salam Y, ELIAS,
    I am new to BizTalk Server, and in the org where I am working, no one has worked with BizTalk Server before. I would be obliged if you could send me the sample code for the scenario discussed here. Thanks.

  • Anonymous
    April 24, 2006
    Hey there, as posted one comment above, I actually don't have the physical bits anymore, but I fairly accurately showed all the parts you need to build it in this post.  If you have any questions while setting it up, feel free to post.

  • Anonymous
    April 25, 2006
    Great article.  How do I keep the same filename when processing the file?

  • Anonymous
    April 25, 2006
    What I would like to do is use this process as a file copy to deploy to a web site.  I need the files names to stay the same as they were when they come in.  I don't see how to keep the file names the same.  Am I missing something in the Set Port Address Expression?

  • Anonymous
    April 25, 2006
    The comment has been removed

  • Anonymous
    April 25, 2006
    Thanks for your help.  The macro works great.

  • Anonymous
    May 22, 2006
    I am very new to BTS and need to 'get' all xml attachments from any given email for processing. For now I am just trying to get them and drop them to a designated folder. I see various examples but nothing that gives me enough information or direction.

    Can someone help explain the steps? I can get an email and get an attachment at a particular index, but I need to get all XML attachments, and the number of attachments will be unknown.

    Thanks,
    Phil

  • Anonymous
    August 15, 2006
    good stuff !

  • Anonymous
    December 11, 2006
    I have an HTTP receive port that receives the name of an image that needs to be moved from an in folder to an out folder. The in folder holds any number of images, but the HTTP request says which one needs to be moved. How can I achieve this? Please help. Also, please reply to rabi.sahu@gmail.com
