Java and Office documents via XML, XSL-FO
A couple of interesting items related to interop between Java and Office docs.
1st, an MSDN article on saving MS-Word documents as XSL-FO. Oleg commented on it very favorably. In essence, you can use MS-Word as an XSL-FO designer. That's a huge step forward.
2nd, an old, but new to me, article on using Java to generate WordProcessingML. I picked this up from John R. Durant, whose blog I found while surfing my new favorite XML-related page... which is...
3rd, TopXml's blog aggregator. High signal-to-noise ratio there.
So, what does it all mean? Well, item 2 stands on its own. That's a nice capability. In any Java app, you can generate a document that conforms to the published XML schema for MS-Office docs, produce Office docs (reports, memos, whatever), and then ship them via a webservice to a client, where they can be consumed - printed, viewed, whatever. In this scenario, there is no use of Office on the server side. It's Just XML, so it could be done on any modern platform. [ Do mainframes speak XML? Can I write a CICS TP that generates an XML document? Hmm, I don't think I would want to do that. . . ]
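To make item 2 concrete, here is a minimal sketch of generating a Word 2003 XML document from plain Java, with no Office anywhere in the picture. It is not the code from the article mentioned above. The class name WordMlSketch and the build method are made up for illustration, and the output is about the smallest document Word should accept: a body with one plain paragraph per string, no styles, tables, or graphics.

```java
import java.io.StringWriter;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamWriter;

// Hypothetical helper, not taken from the article: emits a bare-bones
// Word 2003 XML (WordML) document using only the JDK's StAX writer.
public class WordMlSketch {
    // Namespace of the Word 2003 XML format.
    static final String W_NS = "http://schemas.microsoft.com/office/word/2003/wordml";

    /** Returns a WordML document containing one plain paragraph per argument. */
    public static String build(String... paragraphs) throws Exception {
        StringWriter out = new StringWriter();
        XMLStreamWriter w = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
        w.writeStartDocument();
        // Processing instruction that lets Windows route the .xml file to Word.
        w.writeProcessingInstruction("mso-application", "progid=\"Word.Document\"");
        w.writeStartElement("w", "wordDocument", W_NS);
        w.writeNamespace("w", W_NS);
        w.writeStartElement("w", "body", W_NS);
        for (String text : paragraphs) {
            w.writeStartElement("w", "p", W_NS); // paragraph
            w.writeStartElement("w", "r", W_NS); // run
            w.writeStartElement("w", "t", W_NS); // text
            w.writeCharacters(text);             // escapes & and < for us
            w.writeEndElement();
            w.writeEndElement();
            w.writeEndElement();
        }
        w.writeEndElement(); // w:body
        w.writeEndElement(); // w:wordDocument
        w.writeEndDocument();
        w.close();
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(build("Quarterly report", "Generated without Office on the server."));
    }
}
```

Save the output with an .xml extension and it should open in Word 2003 or later. A real report would add styles, tables and so on from the published schema, which this sketch leaves out.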
Beyond that, combining items 1 and 2 means that, if I for some reason don't want to use WordML, I could run the output through the RenderX XSL sheet mentioned in item 1, and generate an XSL-FO doc.
There is a license for the WordProcessingML stuff, but it is available free of charge. I don't know the license for the RenderX stylesheet but it is available for download at the MSDN article in question. Cool possibilities. . .
Ok, sure, you could have been using Apache FOP as well, but ... it is really a pain to design XSL-FO docs manually, or programmatically starting from nothing. This combo allows you to use Word as the visual forms designer during development, then at runtime, use any XML-aware platform (like Java) to fill in blanks in the XML template doc, and transform to XSL-FO. This is a big step forward.
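The runtime half of that combo could look roughly like the sketch below: load a WordML template, fill in a blank, and push the result through an XSLT to get XSL-FO. Everything named here is an assumption for illustration. The {{name}} placeholder convention is invented, the template is a hand-written stand-in for something designed in Word, and TOY_XSLT is a tiny stylesheet that only maps paragraphs to fo:block elements. It is not the RenderX stylesheet; in practice you would load that one from a file instead.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical sketch: fill a WordML template, then transform it to XSL-FO.
public class WordMlToFoSketch {
    static final String W_NS = "http://schemas.microsoft.com/office/word/2003/wordml";

    // Stand-in for a template designed in Word; {{name}} is an invented placeholder.
    static final String TEMPLATE =
        "<w:wordDocument xmlns:w='" + W_NS + "'><w:body>"
        + "<w:p><w:r><w:t>Dear {{name}}</w:t></w:r></w:p>"
        + "</w:body></w:wordDocument>";

    // Toy stylesheet (NOT the RenderX one): each w:p becomes an fo:block.
    static final String TOY_XSLT =
        "<xsl:stylesheet version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
        + " xmlns:w='" + W_NS + "'"
        + " xmlns:fo='http://www.w3.org/1999/XSL/Format'"
        + " exclude-result-prefixes='w'>"
        + "<xsl:template match='/w:wordDocument'><fo:root>"
        + "<fo:layout-master-set><fo:simple-page-master master-name='page'>"
        + "<fo:region-body/></fo:simple-page-master></fo:layout-master-set>"
        + "<fo:page-sequence master-reference='page'>"
        + "<fo:flow flow-name='xsl-region-body'>"
        + "<xsl:apply-templates select='w:body/w:p'/>"
        + "</fo:flow></fo:page-sequence></fo:root></xsl:template>"
        + "<xsl:template match='w:p'>"
        + "<fo:block><xsl:value-of select='.'/></fo:block></xsl:template>"
        + "</xsl:stylesheet>";

    public static String fillAndTransform(String templateXml, String placeholder,
                                          String value) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true); // required for the w: namespace lookups
        Document doc = dbf.newDocumentBuilder()
            .parse(new InputSource(new StringReader(templateXml)));

        // Fill in the blanks: replace the placeholder inside every w:t element.
        NodeList texts = doc.getElementsByTagNameNS(W_NS, "t");
        for (int i = 0; i < texts.getLength(); i++) {
            Node t = texts.item(i);
            t.setTextContent(t.getTextContent().replace(placeholder, value));
        }

        // Transform the filled-in WordML to XSL-FO.
        Transformer tr = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(TOY_XSLT)));
        StringWriter out = new StringWriter();
        tr.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(fillAndTransform(TEMPLATE, "{{name}}", "Ada"));
    }
}
```

One caveat: Word is free to split a piece of text across several runs, so a placeholder typed in Word may not end up inside a single w:t element. This sketch assumes it does. The resulting XSL-FO can then be handed to a formatter such as Apache FOP.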
Comments
Anonymous
March 29, 2005
Is it really possible to use any Java application to dynamically generate MS-Word files, complete with graphics, tables, text styles, fonts, and more? Yes, quite possible. And Would you believe? - it's easy too!
Anonymous
January 17, 2007
In the past I've posted some articles [ 1 , 2 ] about generating Office 2003 documents from a server-side
Anonymous
July 17, 2007
Is it really possible to use any Java application to dynamically generate MS-Word files, complete with graphics, tables, text styles, fonts, and more? Yes, quite possible. And Would you believe? - it's easy too!
Anonymous
August 28, 2008
Firstly, great article. Is there any way I can verify that a file is an MS Word document (.docx)? What would be the name of the .xml file I should look into to get this information? (I assume it would be a schema element defined in one of the .xml files.) I need to verify that it is in the correct format. Thanks, Tapan
Anonymous
September 02, 2008
Tapan - there are two different formats being discussed here. First, the .docx file format is a zip file, with a particular, well-defined internal structure. The .xml file I spoke of in this posting is different - it is the WordML format, which pre-dated .docx by at least 2 years. This particular post talks about how to produce an .xml WordML document. It does not talk about how to produce a .docx from Java. That is also possible, and is something I considered writing some example code for, but it is not something covered here. Now, your question has to do with querying and validating a .docx file, which is another thing entirely. I'd suggest you look elsewhere for that. There is a System.IO.Packaging namespace in the .NET base class library as of .NET 3.0 - it will help you if you are using .NET. If you are using Java, you will have to roll your own, I think.
Anonymous
January 26, 2010
Hi, I want to know how I can convert WordXML to XSL-FO with my own XSLT script. I mean, is there any parser which can convert WordXML to XSL-FO using a custom XSLT?
- Pavan
- Anonymous
March 24, 2010
I would like to know if one can import existing XSL-FO stylesheets into Microsoft Word and make design changes without affecting the data set binding?