
Working with ODF in Word 2007 SP2

For those of us on the Office Interoperability team, as well as our colleagues throughout Office, today is a big day.  We’ve released SP2 (Service Pack 2 for Office 2007), which includes a bunch of updated features.  Gray Knowlton has a roundup of what’s new in SP2, but I think the feature of most interest to readers here is probably the built-in support for ODF 1.1.

I first mentioned our plans for ODF support in a blog post last year, and I’ve also blogged in the past about the guiding principles that we followed in our ODF implementation.  Our decision to support ODF is just one aspect of Office's broad commitment to choice and interoperability, as covered by Tom Robertson today on the Microsoft on the Issues blog.

For today’s post, I thought I’d put together a hands-on example of a typical user experience when working with ODF and Office 2007 SP2.  I’m going to focus on a typical document creation and editing scenario in Word.  Specifically, I’ll go through these steps:

  • Create a typical document in Word 2007 SP2, and save it as ODF.
  • Open that document in OpenOffice 3.0.
  • Back in Word, add some fancy styling and other typical enhancements to the document, then save the fancier version in ODF.
  • Open that fancier version in OpenOffice.

The starting point.   As a first step, I’ll create a document we can use as a starting point to try out some things.  So I select File/New in Word, add some text, insert a few of the things we all use regularly in documents (a title, headings of various levels, a numbered list, and a table), and do some simple formatting.  Here's how it looks:

image

The next step is to save this as an ODT document.  That’s pretty simple: just click the Office Button, move your mouse to “Save As”, and then select “OpenDocument Text” from the menu.  Before I go any further, it’s worth noting a couple of things about this step:

  • You can make ODF the default document format if you’d like, and then you won’t need to select it from the dropdown list each time
  • I’ll get a message warning me that my document may contain features that aren’t compatible with this format, because ODF can’t represent 100% of the things we can do in Word
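
If you’re curious what the saved file actually is: an ODF document is a ZIP package, and its `mimetype` entry identifies the document type.  Here is a minimal Python sketch for peeking at that entry.  It is not part of Office; the function names `odf_mimetype` and `make_sample_package` are made up for illustration, and the in-memory sample is only a stand-in for a real .odt file saved from Word.

```python
import io
import zipfile

# Media type that the ODF specification assigns to text documents (.odt).
ODT_MIMETYPE = "application/vnd.oasis.opendocument.text"


def odf_mimetype(source):
    """Return the media type stored in an ODF package's 'mimetype' entry.

    `source` may be a file path or a file-like object.
    """
    with zipfile.ZipFile(source) as package:
        return package.read("mimetype").decode("ascii").strip()


def make_sample_package():
    """Build a tiny stand-in package in memory (NOT a complete ODT file)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        # ODF packaging expects 'mimetype' to come first and be uncompressed;
        # a ZipInfo entry defaults to stored (uncompressed) data.
        package.writestr(zipfile.ZipInfo("mimetype"), ODT_MIMETYPE)
        package.writestr("content.xml", "<placeholder/>")
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    print(odf_mimetype(make_sample_package()))
```

To try it on a real document, pass the path of the .odt file you saved from Word to `odf_mimetype` instead of the sample package.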

Now I’ll open this document in OpenOffice version 3.0.1.  In a future post I’ll look at differences between various existing ODF implementations, but for today’s post I’m just going to stick to OpenOffice 3.0.1 and Office 2007 SP2.

When I open my ODT document in OpenOffice Writer, here’s what it looks like:

image

As you can see, the document looks essentially the same in both applications.  The page break is the only obvious difference – it occurs at a different point in the document due to differences between the default line-spacing values used in Word and OpenOffice.  Other than that detail, the document looks the same in both applications, with the same fonts, formatting, headings and content.

The line-spacing variation is something you can see in other ODT documents and other ODF implementations as well.  For example, if you open the latest draft of the ODF 1.2 specification (OpenDocument-v1.2-cd01-rev06.odt) in IBM Lotus Symphony 1.2.0, it is 931 pages long, but if you open the same document in OpenOffice Writer 3.0.1, it’s 875 pages long.  These types of variations demonstrate a fundamental difference between a fixed-layout format (such as PDF or XPS) and a flow-oriented format like ODF or Open XML.  Flow-oriented formats work well for dynamic editing activities, whereas fixed-layout formats rigidly pin down the layout of a document so that it will be rendered exactly the same on different devices.  For these reasons, most people prefer to use a flow-oriented format during document authoring and editing, and a fixed-layout format for published documents that are no longer being edited.
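
To make the arithmetic concrete, here is a toy Python sketch of why a flow-oriented document changes length when the line height changes.  It is a deliberately simplified model, not how Word, OpenOffice or Symphony actually lay out pages, and the line heights, line count and page size below are hypothetical values chosen only for illustration.  The last line computes the relative gap between the two page counts quoted above.

```python
import math


def page_count(line_count, line_height_pt, page_body_height_pt):
    """Pages needed when equal-height lines flow from page to page."""
    lines_per_page = math.floor(page_body_height_pt / line_height_pt)
    return math.ceil(line_count / lines_per_page)


def relative_difference(longer, shorter):
    """How much longer `longer` is than `shorter`, as a fraction."""
    return (longer - shorter) / shorter


if __name__ == "__main__":
    body_height = 648.0  # hypothetical text area: 9 inches at 72 points/inch
    lines = 10_000       # hypothetical number of text lines in the document

    # Same content, two hypothetical line heights.
    print(page_count(lines, 12.0, body_height))  # 54 lines per page
    print(page_count(lines, 13.8, body_height))  # 46 lines per page

    # Page counts quoted in the post: 931 (Symphony) vs. 875 (OpenOffice).
    print(f"{relative_difference(931, 875):.1%}")
```

In this toy model a 15% taller line turns 186 pages into 218, and the two real page counts quoted above differ by about 6.4%.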

Getting Fancier.   Now let’s move on to some fancier formatting and see how that works.  I’m going to open this document in Word and make a variety of changes:

  • I’ll switch to a different styleset, which will alter all of the styles in the document; I’ll choose the “Modern” styleset from Word’s built-in options
  • I’ll insert an image into the body of the document, with square text-wrapping around it
  • I’ll apply a table style to the table; I’ll use one with header-row and first-column formatting turned on, as well as row and column banding
  • I’ll insert a header and a footer, using Word’s “Annual” style for headers and footers
  • I’ll insert a table of contents, using the default settings

As a result of these changes, my document now looks like this in Word:

image

And if I save that version as an ODT file and open it in OpenOffice, I see this:

image

You’ll notice that many things are identical in both Word and OpenOffice, and a few things look a little different in each application.  Here are some things that are the same in both applications:

  • All of the content is the same – nothing is missing in either application
  • All of the title/header/text styling is the same
  • The table styling is the same
  • The header and footer look the same
  • If you were to try clicking on the links in the table of contents, you’d find that these work the same in both applications (i.e., clicking on an entry takes you to that part of the document)

And here are some things that appear differently in the two applications:

  • The formatting of the hyperlinks in the Table of Contents is different, due to differences in Word and OpenOffice’s default styling for hyperlinks
  • The document is a little longer in OpenOffice than in Word, due to differences such as the default line spacing mentioned above
  • The text-wrap margins around the inserted image also differ slightly, again due to differences in application defaults

If you’d like to test these sample documents yourself, they’re in a ZIP file attached to this blog post (below).
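
If you want to go beyond eyeballing the documents, one option is to count the structural elements stored in the ODT file itself.  The Python sketch below reads `content.xml` from the package and counts headings, paragraphs, tables and tables of contents.  The helper names (`summarize_odt`, `make_sample_odt`) and the tiny in-memory sample are assumptions made for illustration; the element names and namespace URIs are the ones defined by the ODF specification.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace URIs defined by the OpenDocument specification.
NS = {
    "text": "urn:oasis:names:tc:opendocument:xmlns:text:1.0",
    "table": "urn:oasis:names:tc:opendocument:xmlns:table:1.0",
}

# Hypothetical, heavily trimmed content.xml used only to exercise the code.
SAMPLE_CONTENT = (
    '<office:document-content'
    ' xmlns:office="urn:oasis:names:tc:opendocument:xmlns:office:1.0"'
    ' xmlns:text="urn:oasis:names:tc:opendocument:xmlns:text:1.0"'
    ' xmlns:table="urn:oasis:names:tc:opendocument:xmlns:table:1.0">'
    '<office:body><office:text>'
    '<text:h text:outline-level="1">A heading</text:h>'
    '<text:p>Some body text.</text:p>'
    '<table:table><table:table-row><table:table-cell>'
    '<text:p>A cell</text:p>'
    '</table:table-cell></table:table-row></table:table>'
    '</office:text></office:body>'
    '</office:document-content>'
)


def summarize_odt(source):
    """Count a few structural elements in an ODT package's content.xml."""
    with zipfile.ZipFile(source) as package:
        root = ET.fromstring(package.read("content.xml"))
    return {
        "headings": len(root.findall(".//text:h", NS)),
        "paragraphs": len(root.findall(".//text:p", NS)),
        "tables": len(root.findall(".//table:table", NS)),
        "tables_of_contents": len(root.findall(".//text:table-of-content", NS)),
    }


def make_sample_odt():
    """Build a stand-in package in memory (NOT a complete ODT file)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("content.xml", SAMPLE_CONTENT)
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    print(summarize_odt(make_sample_odt()))
```

Running `summarize_odt` on the plain and the fancier sample documents should show, for example, whether the table of contents made it into the file.  It says nothing about how each application renders those elements, which is where the differences listed above come from.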

Getting more information. This demonstration was just a simple example, for those who are curious about how the new built-in ODF support works in Office.  You can find more detailed information about SP2’s support for ODF 1.1, including which features are supported by Word, Excel and PowerPoint, at these links:

Going forward, I’ll be doing some blog posts that get down into more of the technical details, to help explain some of the engineering decisions that we made in our implementation.  For example, tracked changes functionality is of interest to many users, so I’m working on a post to cover why we decided to not implement tracked changes in ODF.

What else would you like to understand about our implementation of ODF?  Share your questions and thoughts in the comment thread, or email me (dmahugh at microsoft dot com) if you have suggestions for topics you’d like to see covered here.  I’m very proud of the work my colleagues on the Word, Excel and PowerPoint teams have done to add ODF support, and I’m looking forward to discussing the details now that SP2 has been released.

SampleDocs.zip

Comments

  • Anonymous
    April 28, 2009
    PingBack from http://microsoft-sharepoint.simplynetdev.com/working-with-odf-in-word-2007-sp2/

  • Anonymous
    April 28, 2009
    From Microsoft: Today Microsoft is releasing Service Pack 2 for the 2007 Microsoft Office system. This

  • Anonymous
    April 28, 2009
    Wow, great news.  Now we can talk about this as a released implementation of ODF.  Congratulations.

  • Anonymous
    April 28, 2009
    Great news. I'm quite interested in how form fields are handled (haven't checked that one out, but it's tax season so all I can think of right now is filling out those forms ;-) Keep up the good work.

  • Anonymous
    April 28, 2009
    Congratulations to this great step towards interoperability.

  • Anonymous
    April 28, 2009
    Office 2007 SP2 includes major performance enhancements for Office applications and servers, most notably

  • Anonymous
    April 28, 2009
    It can be downloaded from Microsoft Update. The most important goodies in this SP, in my opinion

  • Anonymous
    April 29, 2009
    The Office 2007 SP2 is available now for download: http://www.microsoft.com/downloads/details.aspx?familyid=B444BF18-79EA-46C6-8A81-9DB49B4AB6E5&displaylang=en

  • Anonymous
    April 29, 2009
    The comment has been removed

  • Anonymous
    April 29, 2009
    Can you test using the test documents that are available from Oasis?  This would show up any holes in the produced XML. Maybe then we'll see it as a complete implementation. Also, what can Office do that ODF cannot store? Thanks.

  • Anonymous
    April 30, 2009
    The comment has been removed

  • Anonymous
    May 01, 2009
    The comment has been removed

  • Anonymous
    May 03, 2009
    Thank you for this great feature! I hope Office will stay compatible with future versions of ODF too!

  • Anonymous
    May 04, 2009
    Congrats on the filter - for Word. However, since SP2 implements only ODF 1.1 (since ODF 1.2 is still only an advanced draft format), how are formulas stored in spreadsheets? I hear there's also a problem with tables in slideshows (which is strange, since obviously Word can do ODF tables; why can't Powerpoint?) I also wonder about page styles: I'd like to see how a document that alternates page formats, filler blank pages and such work in both. How are master documents handled?

    @Ian: .docx is a proprietary XML-based format that has a single implementation. OOo developers are having trouble developing an import filter for the following reasons:

    • actual file format doesn't always conform to the published specification (encryption had to be reverse engineered, for example).
    • there are several redundant features: tables in Word, Excel and Powerpoint are different objects that share 95% of their properties (tables in OOo/ODF are the same, as no difference is made between one document and another) which all require a different import filter method to create a single object: a table
    • some features don't match with OOo's internal structure (geometrical shapes and text: OOo has 2 renderers, a simple one and Writer. The simple renderer is used for these shapes, but OOXML requires a richer one)
    • Office 2007/2008 is the only Office generation using .docx; Office 14 should use OXML, and OOo developers think that their time would be better spent developing an import filter that can manage most XML formats at once (better for support, reduces code redundancy)

    Upcoming version 3.1 will solve several problems here, and there are already further improvements planned/started for 3.2.

  • Anonymous
    May 04, 2009
    Doug, I really wish you could issue a response to the slashdot article. it brings up very interesting points worth answering.

  • Anonymous
    May 04, 2009
    PHPPowerPoint 0.1.0 was released last week, as an open-source PHP API for generating PPTX files, much

  • Anonymous
    May 04, 2009
    Mitch, there is no “filter” involved – it’s built-in support.  What made you think that it’s a filter?  (I can’t find any place I’ve ever used that word regarding our ODF support, but would be glad to correct it if I have.)

    Also, your claim that docx is a proprietary format is hard for me to understand.  There are many implementers who have written code to generate DOCX files by working directly with the ECMA376 spec – in what sense are the resulting DOCX files proprietary?

    Regarding your questions:

    • as we read the specification, tables in presentations are not allowed in ODF 1.1 – that was added in ODF 1.2, which is not yet an approved or published standard
    • the tables issue was pretty thoroughly debated a couple years ago during the DIS29500 process; Open XML has three table models, each optimized for a particular document type, and ODF uses a single table model across all document types
    • we store formulas in our own namespace; this is the only option available in any of the published versions of ODF.  I will be writing about this in more detail in another post later this week.
    • the encryption approach used by our implementation of Open XML is documented at http://msdn.microsoft.com/en-us/library/cc313071.aspx, and code samples are available at http://offcrypto.codeplex.com/

    Your other comments seem to be more about OO’s plans than Office’s implementation, so I can’t add to those.

  • Anonymous
    May 05, 2009
    Rob Weir posted on his blog a couple of days ago an Update on ODF Spreadsheet Interoperability.

  • Anonymous
    May 07, 2009
    @dmahugh: reference to Office 97 installer: "additional file format filters" if you think of a better shortcut term, please tell me :) ECMA376 relies upon but doesn't describe formats such as VML although they are declared 'deprecated', relies upon but doesn't describe paper sizes internal to MS Office (non-compliant with ISO paper sizes), relies upon non-standard leap years and date formats. Were it really open, it would have been accepted as-is by ISO, instead of being strongly edited (1,000 modifications required for 6,000 pages, published 8 months after it 'became a standard' instead of 6 weeks). And stop me if I'm wrong, but currently no Office version complies with ISO 29500:2008. About the rest of my comment, it was directed at Ian. About tables: yes, it was debated for DIS29500. However, I fail to see how ODF 1.1 can't accept tables in a presentation, as there are no differences between ODT, ODS, and ODP apart from the last letter: their XML manifests and contents are identical, so like a text document can include a table, so does a presentation - Impress couldn't add a table to an ODF 1.1 presentation but Kpresenter could, and so can OO.o 3, even when setting the ODF compliance to 1.0/1.1. ODF 1.1 thus supports tables in presentation documents. There is NO reason Powerpoint would scrap a table added to an ODF presentation, since the currently standardized format accepts it, except to artificially limit the export filter. It's not because an (now outdated) application didn't support that particular feature that it can't be done.

  • Anonymous
    May 11, 2009
    The comment has been removed

  • Anonymous
    May 13, 2009
    When I blogged about the release of SP2 with ODF support two weeks ago, I mentioned that I was planning