Converting WordprocessingML into HTML (for easy viewing)
Many people have asked me if there is an easy way to go from Word XML into XHTML.
I've already mentioned that we have a tool available that transforms from Word 2003 XML into HTML. You can download it here: https://www.microsoft.com/downloads/details.aspx?FamilyID=19676b18-1bcd-4852-93ba-0b5a203ea731&displaylang=en
This is actually a pretty cool tool, and there are a number of ways you can extend it. By default it adds a behavior to IE so that any XML file containing the Word PI (processing instruction) automatically has an XSLT applied that converts it into HTML the browser can render. It also saves the embedded images to a temp location so the resulting HTML can reference them.
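To make those two steps concrete (turning the Word XML into HTML and writing embedded images somewhere the HTML can point at), here is a rough sketch in Python. It is not the viewer's XSLT: it walks the document with the standard library's ElementTree instead, handles only plain paragraph text and `w:binData` images, and the function name `wordml_to_html` is made up for illustration.

```python
import base64
import html
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path

# Namespace used by Word 2003 WordprocessingML documents.
W = "http://schemas.microsoft.com/office/word/2003/wordml"


def wordml_to_html(xml_text, image_dir=None):
    """Hypothetical helper: convert paragraph text and embedded images only."""
    root = ET.fromstring(xml_text)
    # Embedded images are written to a temp folder unless one is supplied.
    image_dir = Path(image_dir or tempfile.mkdtemp(prefix="wordml_")).resolve()
    parts = []
    for para in root.iter(f"{{{W}}}p"):
        # A paragraph's text is spread across w:t elements inside its runs.
        text = "".join(t.text or "" for t in para.iter(f"{{{W}}}t"))
        parts.append(f"<p>{html.escape(text)}</p>")
        # Pictures are stored inline as base64 in w:binData elements.
        for bin_data in para.iter(f"{{{W}}}binData"):
            name = bin_data.get(f"{{{W}}}name", "image.bin").rsplit("/", 1)[-1]
            target = image_dir / name
            target.write_bytes(base64.b64decode(bin_data.text or ""))
            parts.append(f'<img src="{html.escape(target.as_uri())}">')
    return "<html><body>" + "".join(parts) + "</body></html>"
```

Formatting, tables, lists and everything else the real XSLT deals with are ignored here; the point is only the shape of the work.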
You can also write your own XSLTs and register them for the viewer to use. Then when you open a Word XML file, you will have the choice of XSLTs to apply. This is just handled with the schema library (the same way you can register XSLTs for Word to apply when it opens your XML).
Additionally, if you want your own XML rendered with custom views in the browser, you can put a PI in your files that identifies them as yours, then register with the tool so that it renders those files with your XSLT instead of the default one that IE would apply.
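The idea of picking a stylesheet based on a PI can be sketched in a few lines of Python. This only illustrates the lookup, not how the viewer or the schema library actually stores registrations. The registry dictionary, the `my-app` PI, and the stylesheet file names are all hypothetical; the `mso-application` PI with `progid="Word.Document"` is the one Word 2003 writes at the top of its XML files.

```python
from xml.dom import minidom

# Hypothetical name standing in for whatever the browser does by default.
DEFAULT_STYLESHEET = "browser-default"


def find_processing_instructions(xml_text):
    """Return (target, data) pairs for PIs that appear at document level."""
    doc = minidom.parseString(xml_text)
    return [
        (node.target, node.data)
        for node in doc.childNodes
        if node.nodeType == node.PROCESSING_INSTRUCTION_NODE
    ]


def pick_stylesheet(xml_text, registry):
    """Return the stylesheet registered for the first matching PI, if any."""
    for pi in find_processing_instructions(xml_text):
        if pi in registry:
            return registry[pi]
    return DEFAULT_STYLESHEET


# Hypothetical registrations: one for Word XML, one for a custom format.
registry = {
    ("mso-application", 'progid="Word.Document"'): "word2html.xsl",
    ("my-app", 'kind="report"'): "report.xsl",
}
```

Calling `pick_stylesheet(xml_text, registry)` on a file with neither PI falls through to the default, which mirrors the "instead of the default XSLT" behavior described above.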
There are other folks out there who are also building tools on top of Word XML. Here's a blog I was just pointed at this morning where Oleg has worked on modifying an earlier version of the XSLT that we had released: https://www.tkachenko.com/blog/archives/000195.html
If anyone else has a tool they've built on top of WordprocessingML or SpreadsheetML, please send me the links. I'd love to take a look.
-Brian
Comments
Anonymous
September 30, 2005
Sorry for sidetracking the thread. I found this ^^^ article ^^^ linked from SlashDot. What do you think?
Anonymous
September 30, 2005
No problem, although I would like to keep this thread more focused on questions around the XSLTs...
I've seen a number of discussions over the past year or two around the need for formula support in OpenDocument, but I've stayed away from commenting directly on the OpenDocument schema. I think the use of XML for a document format is great, and I don't want to take anything away from what they've done.
I did recently start questioning the licenses a bit, but that was because I was curious to compare their license with ours. I've had some people comment on the two being nearly identical and others have said they are dramatically different; so I just wanted to take a look.
From reading the article, it sounds like the thought was that they would standardize around the presentation aspect of the formats only. It's a bit unfortunate since the result of a formula does affect the ultimate display. In fact, formula results are often the most important part of the spreadsheet.
Did the original StarOffice format have formulas defined in its schema? Did they decide to push only some of the schema through OASIS?
If this is an area folks are interested in, let me know. I can post some examples of Excel's schema for formulas...
-Brian
Anonymous
September 30, 2005
Using the SharePoint XML web part to render WordML
http://www.wssdemo.com/Pages/WordContent.aspx?menu=Web%20Parts%20-%20WSS
Using the Data View Web Part to render WordML and other information (document library version info etc) similar to a wiki
http://www.wssdemo.com/wiki
Anonymous
September 30, 2005
I meant this >>> http://blogs.msdn.com/utility/Redirect.aspx?U=http%3a%2f%2fsoftware.newsforge.com%2farticle.pl%3fsid%3d05%2f09%2f09%2f192250%26from%3drss <<< article
Anonymous
September 30, 2005
SlashDotJunkie - Same comments apply. I was able to get to the article because you had mapped the URL to your username.
Ian - Thanks for the links! I'm going to start to pull together a collection of links. I used to have a bunch of them but can't find them now, so I have to start over. :-)
Anonymous
October 03, 2005
Sadly, my developer moves are not agile enough to be timely with this post: I am still working on my utility that converts WordProcessingML into XHTML using VSTO 1.x. Please see "Dr. Peter Sefton of The University of Southern Queensland calls Brian Jones of Microsoft “Glib”" here:
http://www.kintespace.com/rasxlog/?p=198
Please excuse the title of the Blog post. I have been told that, instead of humour, insult comes out...
Anonymous
February 28, 2006
Excuse me, I am an Italian student in the Faculty of Engineering at the University of Calabria. I read your post at: http://blogs.msdn.com/brian_jones/archive/2005/09/30/475794.aspx
I would like to know whether an XSLT exists that translates a WordML document into HTML and, if so, whether you could provide it to me.
Thanks
giugrillo@libero.it
Anonymous
March 03, 2006
Giuseppe, if you download this viewer http://www.microsoft.com/downloads/details.aspx?FamilyID=19676b18-1bcd-4852-93ba-0b5a203ea731&displaylang=en
you'll see that an XSLT also comes along that goes from WordML to HTML.
Anonymous
April 06, 2006
Hi, I need code that I can use to extract data from a doc into an XML file. Can anyone give me a solution?
Thanks
Please mail me at:
meetesh.mishra@gmail.com
Anonymous
November 16, 2006
The SharePoint team recently posted an article on OpenXMLdeveloper.org on how they allow you to convert