Creating Math Web Documents using Word 2007

If you use Word 2007 to create a document containing mathematical equations and expressions and save it as a web page, it looks just as good in Internet Explorer as it does in Word 2007! The equations look as though they had been typeset by TeX, or in some ways even better. How did this come to pass? Well, Word 2007 saves mathematical formulas in an HTML document in two ways: 1) the original OMML (Office Math Markup Language) contained inside comments, and 2) PNGs for programs that don’t know how to distill the OMML out of the comments, or that couldn’t display OMML even if they could retrieve it.
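
To make the "OMML inside comments" idea concrete, here is a minimal Python sketch of how a program might distill the OMML out of such a page. It rests on assumptions not stated in this post: that the OMML sits in a conditional comment of roughly the form `<!--[if gte msEquation 12]> ... <![endif]-->`, and that the equation element is named `m:oMath`. The `SAMPLE` string and the function name `extract_omml` are made up for illustration, so check them against real Word output before relying on this.

```python
import re

# Assumption: the OMML is wrapped in an HTML (conditional) comment, and the
# equation root element is <m:oMath>. The exact wrapper Word emits may differ.
COMMENT_RE = re.compile(r"<!--(.*?)-->", re.DOTALL)
OMATH_RE = re.compile(r"<m:oMath\b.*?</m:oMath>", re.DOTALL)


def extract_omml(html: str) -> list[str]:
    """Return every <m:oMath> fragment found inside HTML comments."""
    fragments = []
    for comment in COMMENT_RE.findall(html):
        # Only look inside comments; the PNG fallback lives outside them.
        fragments.extend(OMATH_RE.findall(comment))
    return fragments


# Hypothetical snippet shaped like a saved web page: OMML in a comment,
# followed by the PNG fallback that browsers actually render.
SAMPLE = (
    "<p>Area: <!--[if gte msEquation 12]><m:oMath><m:r><m:t>x</m:t></m:r>"
    '</m:oMath><![endif]--><![if !msEquation]><img src="image001.png">'
    "<![endif]></p>"
)

if __name__ == "__main__":
    for fragment in extract_omml(SAMPLE):
        print(fragment)
```

A regular expression is enough for a sketch like this. A real tool would more likely parse the recovered fragment as XML, with the `m:` namespace declared, before transforming it any further.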

Currently, Internet Explorer at least knows nothing about OMML, and it only knows how to display MathML if a math behavior facility such as Design Science’s MathPlayer is installed. So Internet Explorer bypasses the comments in favor of the PNGs, and they look really fine unless you zoom in on them (which makes them fuzzy) or change the background color (which may make them unreadable).

Needless to say, David Carlisle, web/XML/TeX/LaTeX/MathML/etc. whiz, was intrigued by this situation. He figured that with a little help Word 2007 could do “the right thing” and create HTML documents with embedded MathML instead of OMML. He next concluded that the right thing for him to do was to prove it. In the process, he discovered and fixed some bugs in Microsoft’s omml2mml.xsl file, which Word uses to export MathML. You can download David’s handiwork from his post on the subject. Qualifies as exceedingly cool!

Comments

  • Anonymous
    April 15, 2007
    It would actually be an excellent idea if Word 2007 saved a MathML representation of the equations it contains in docx files. This would just be in addition to the OMML representation; nothing else would need to change (i.e. all loading etc. would still go via OMML). Why? Because that might actually allow us to use Word's equation editor for scientific work. A huge number of major publishing houses with the leading journals do not accept documents that contain equations created with Word 2007 (have a look here at Science Magazine for example: http://www.sciencemag.org/about/authors/prep/docx.dtl). Quite frankly, Word 2007 just misses that market right now; it can't be used. The situation is worse than it was with Word 2003. I believe you guys need to accept that they use MathML for their workflow. That does not mean you have to use it fully as well; I can very much understand your reasoning on that point. But just include a MathML representation of every equation in the docx file, so that we can use Word 2007 for scientific work.

  • Anonymous
    April 17, 2007
    David Carlisle shows how to get HTML with embedded MathML from Word 2007. From a typographical point of view, the embedded OMML in Word's docx or HTML formats is actually better than either MathML or TeX. Since it is an XML format, it can be converted to MathML using the shipped omml2mml.xsl, as David shows (he fixes a couple of bugs in that xsl). So with aftermarket tools, anyhow, publishers can use Word 2007's math. Remember also that this is only version 1. As I mentioned in a comment replying to your comment on my Find/Replace post, Rome wasn't built in a day :-)

  • Anonymous
    April 17, 2007
    Yes, but I cannot tell publishers to do that. They simply won't accept a paper if I write it with Word 2007's equation editor. Quite a number of them do not accept PDFs; they want the original format that was used to write the paper (like TeX or doc).

  • Anonymous
    May 19, 2007
    Reverting to what should go into a Web page to 'represent some non-textual character-based notation': we are coming to the conclusion that what is needed is a universally recognised tag called something general such as techiestuff, which is a container for as many different representations as an application may want to put there. Browsers can then ignore it all (useful if they will do strange things when exposed to non-standard HTML) or look inside for the stuff they can use (including the primitive, but unfortunately necessary, bitmap representation). Note that using something from an XML vocabulary, such as the MathML namespace, for this container tag does not work.

  • Anonymous
    July 02, 2007
    Patricia, I don't know of any problems with Greek letters in math zones. Standard Unicode code points are used. It's true that Word 2007 doesn't handle the recently added bold digammas (U+1D7CA and U+1D7CB), but the other Greek letters should be fine. Could you give me an example of a problem?

  • Anonymous
    October 17, 2007
    When I open documents in Word 2007 that were written using earlier versions of Word, and also when I save them as docx files, the symbols are not correctly displayed in many cases. Sometimes they do appear correctly, however. The documents I have produced include contributions from other people, and whenever their equations are embedded they are not correctly displayed in Word 2007, whereas they were in the earlier versions of Word. Beta appears as a bicycle and an integral sign as a cocktail glass complete with stick. When I double-click on the embedded object it displays correctly, but it doesn't stay that way. If I save as PDF using the Word 2007 PDF export facility I get the bicycles and cocktail glasses, but if I print to PDF using CutePDF Writer the equations are written to the PDF correctly. Is the problem a consequence of a font being missing in the main text part of Word?

  • Anonymous
    October 17, 2007
    I can add something to what is written above. Highlighting the embedded object and right-clicking gives Equation Object - Convert (Convert to Microsoft Equation 3.0) - OK. This works in many cases, but the equation is about a mile wide. It can, however, be condensed sideways as one would an inserted object. Some characters just disappear completely, though, including the minus sign, which one would have thought would be present in all fonts.

  • Anonymous
    October 19, 2007
    Further to the above two messages, the problem is now solved. The problem was that the Symbol font was missing from Word 2007, and I don't know why. Once I opened the Fonts folder in Control Panel, the Symbol font simply became available in Word 2007. However, to get the converted equations it was also necessary to open the original doc format documents and allow the equations in them to be converted before saving as docx.

  • Anonymous
    December 02, 2008
    Recently, the minus sign can't be displayed using Equation 3.0 in Word 2007. Why?