ODF Implementation Notes for Office 2007 SP2

Microsoft has today published our first set of document-format implementation notes, for the ODF implementation in Office 2007 SP2. These notes, which are available on the DII web site, provide detailed information about the design decisions that went into our implementation of ODF 1.1. See the press release for more information.

The implementation notes can be found under the "OASIS ODF 1.1" option on the Reference dropdown on the DII home page. The site is structured to allow for multiple standards and implementations to be covered, and in the coming weeks you'll find an option on that dropdown for ECMA-376 as well, which will cover how that standard is implemented in Office 2007. We'll also release implementation notes for IS29500 when we get closer to the release of the next version of Office, code named Office "14." Other implementers of these standards are welcome and encouraged to post their own implementation notes to help achieve a level of interoperability that will benefit users around the world. Assistance is available to those who are interested.

Each Implementation is Unique

Every implementer of a large standard such as ODF or ECMA-376 needs to make decisions about how to approach implementation of the standard. Application limitations and application design come into play, as well as more subtle factors such as support for optional constructs, default values for missing attributes, and bugs. The cumulative effect of all of these factors can cause behavior that was never intended, or behavior that is difficult to understand in the abstract without specifics about the myriad details that make up each implementation.

A simple and instructive example can be seen by opening a document in multiple implementations of the same standard, and then checking the total page count of the document. If you download the OASIS ODF 1.1 specification, which is published as an ODT file, and open that file in each of the leading implementations of the ODF standard, you'll find different page counts in each of those applications.

Is this a problem? Is one implementation "right" about the number of pages in the ODF 1.1 spec, and the other implementations "wrong" about the page count? Not at all. Rather, the page count is a symptom of underlying differences in the architecture and approach of those implementations.
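
For readers who want to try the page-count comparison themselves, here is a minimal Python sketch. It assumes the application that last saved the file wrote a meta:page-count attribute into the package's meta.xml, which is optional, so the function may return None. The function names are hypothetical and chosen for illustration only. The stored value is simply what that one application computed when it saved; another application that opens the same file will paginate it again and may arrive at a different number.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace of the ODF metadata vocabulary used in meta.xml.
META_NS = "urn:oasis:names:tc:opendocument:xmlns:meta:1.0"


def page_count_from_meta(meta_xml):
    """Return the meta:page-count value in a meta.xml payload, or None."""
    root = ET.fromstring(meta_xml)
    stat = root.find(f".//{{{META_NS}}}document-statistic")
    if stat is None:
        return None
    value = stat.get(f"{{{META_NS}}}page-count")
    return int(value) if value is not None else None


def recorded_page_count(odt_path):
    """Read meta.xml from an ODF package (a ZIP archive) and return its page count."""
    with zipfile.ZipFile(odt_path) as package:
        return page_count_from_meta(package.read("meta.xml"))
```

Saving the same document from two different applications and calling recorded_page_count on each copy is one way to observe the difference described above.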

Open Standards + Transparency + Collaboration

So how can vendors work towards maximum interoperability between their implementations? The answer rests on three guiding principles: adherence to open standards, transparency of implementation, and open dialog between all interested parties, including implementers, users, and the standards community.

In the case of ODF 1.1, we've been working on this three-pronged approach to enabling interoperability for some time now. We've been participating in the standards maintenance process, and we're contributing what we've learned through our own implementation of ODF, while listening and learning from the perspectives of other implementers. We've engaged with other implementers through DII workshops, 1-on-1 discussions, the standards maintenance process, and other activities. And today we're beginning the rollout of ultra-transparent documentation about exactly how we've implemented key document format standards in Office.

Implementation Notes: Examples

Let's take a look at a few examples of the information that is provided in the ODF implementation notes. Before we get started, it's worth noting a few basic concepts:

  • The implementation notes for ODF, as well as the upcoming implementation notes for ECMA-376 and IS29500, will all be hosted on the DII web site.
  • The structure of the notes, as shown in the example below, matches the structure of the ODF spec itself. If you know your way around the ODF spec, you'll know your way around the implementation notes.
  • Each note includes the relevant section of the ODF spec, with a button for Implementation Notes related to that section.
  • There is a Forums button at the top right of each note, and this leads to the MSDN interop forum dedicated to support of these implementation notes. There you can get answers to any further questions you have about Office's ODF implementation. (In the near future the Forums will move to a blog-style threaded conversation for each implementation note.)

As an example of how the implementation notes work, here's what you'd see if you were to drill down into section 8.3.4 (Shapes) and then click on the Implementation Notes link:

[Image: sample implementation note]

Note the small capital N next to some of the sections in the treeview control on the left. Those indicate that you'll find information about Office's implementation under that section. Here are a couple of other things to note about the example above:

  • The notes refer to "core Excel 2007," meaning Excel itself, as opposed to Excel's support of graphics, rich text, or other constructs that may be shared across more than one application.
  • There is a reference to additional information in section 9.2. This is common: many of the notes refer to other notes for more information or clarification, since the implementation of multiple sections may be closely related in various ways.
  • Note in the taskpane to the left that there is no note for the "1 Introduction" section, because that section is empty. We've made the treeview exactly match the structure of the ODF specification, and in some cases there is no content between a higher-level heading and a lower-level heading; this is one of those cases.

Here's another example of an implementation note, from section 8.2.1 (Column Description):

As you can see, this tells you that Word is limited to 63 columns in a table. This is an application limitation of Word, and by documenting this limitation we are allowing other implementers to accurately predict and understand Word's behavior.
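
As an illustration of how another implementer might use this note, the following Python sketch scans an ODF content.xml payload and reports tables whose column count exceeds 63. This is a hypothetical pre-flight check, not a description of what Word does with such tables; the function names are invented for this example. It counts table:table-column elements, honoring table:number-columns-repeated, and descends only into column containers so that tables nested inside cells are counted separately.

```python
import xml.etree.ElementTree as ET

TABLE_NS = "urn:oasis:names:tc:opendocument:xmlns:table:1.0"
WORD_MAX_COLUMNS = 63  # the table column limit stated in the note discussed above

COLUMN_TAG = f"{{{TABLE_NS}}}table-column"
# Elements that may wrap table:table-column inside a table.
COLUMN_CONTAINERS = {
    f"{{{TABLE_NS}}}{name}"
    for name in ("table-columns", "table-header-columns", "table-column-group")
}


def column_count(element):
    """Count the columns declared directly by a table (or a column container)."""
    total = 0
    for child in element:
        if child.tag == COLUMN_TAG:
            repeated = child.get(f"{{{TABLE_NS}}}number-columns-repeated", "1")
            total += int(repeated)
        elif child.tag in COLUMN_CONTAINERS:
            total += column_count(child)
    return total


def tables_exceeding_limit(content_xml, limit=WORD_MAX_COLUMNS):
    """Return (table name, column count) for every table wider than the limit."""
    root = ET.fromstring(content_xml)
    wide = []
    for table in root.iter(f"{{{TABLE_NS}}}table"):
        count = column_count(table)
        if count > limit:
            wide.append((table.get(f"{{{TABLE_NS}}}name"), count))
    return wide
```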

Similarly, the implementation note for section 15.2.1 (Page Size) explains that Word has an inherent application limitation regarding page size:

The standard allows for much larger page sizes, which is a good thing. Future improvements in display or printer technology may make much larger page sizes feasible, and the standard should not limit that type of innovation. But Word has a specific constraint for page size, and that constraint is documented here so that other implementers can interoperate with it as they see fit.
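
A similar check can be sketched for page size. ODF expresses page dimensions as lengths with units (for example, the fo:page-width and fo:page-height attributes), so the sketch below converts a length string to inches and compares it against a caller-supplied maximum. The specific limit is deliberately left as a parameter: the real figure is in the implementation note, and any value passed in here is the caller's assumption. The function names are hypothetical.

```python
import re

# How many of each unit make up one inch.
_UNITS_PER_INCH = {"in": 1.0, "cm": 2.54, "mm": 25.4, "pt": 72.0, "pc": 6.0}
_LENGTH = re.compile(r"^\s*([0-9]*\.?[0-9]+)\s*(in|cm|mm|pt|pc)\s*$")


def length_in_inches(value):
    """Convert an ODF-style length such as '21.59cm' or '8.5in' to inches."""
    match = _LENGTH.match(value)
    if match is None:
        raise ValueError(f"unsupported length: {value!r}")
    number, unit = match.groups()
    return float(number) / _UNITS_PER_INCH[unit]


def page_fits(width, height, max_inches):
    """True if both page dimensions are within max_inches (caller-supplied limit)."""
    return (
        length_in_inches(width) <= max_inches
        and length_in_inches(height) <= max_inches
    )
```

For example, page_fits("8.5in", "11in", max_inches) tests a US Letter page against whatever limit the target application documents.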

Here's an example of a more complicated implementation note, from section 15.4.23 (Text Formatting Properties - Language):

As you can see, this topic is more complex: how to map Word's two language IDs for a text run (primary and secondary) to the three language IDs supported by ODF (for Latin, Asian, and complex text). We're documenting exactly what Word does in this case, so that other implementers can make informed decisions about how to interoperate with Word's implementation.
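
To make the ODF side of that mapping concrete, here is a small Python sketch that reads the three per-script language settings from a style:text-properties element. It covers only the ODF attributes (fo:language, style:language-asian, and style:language-complex, each with a matching country attribute) and makes no attempt to reproduce Word's mapping, which is what the implementation note documents. The function name and the returned dictionary shape are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

FO_NS = "urn:oasis:names:tc:opendocument:xmlns:xsl-fo-compatible:1.0"
STYLE_NS = "urn:oasis:names:tc:opendocument:xmlns:style:1.0"

# (language attribute, country attribute) for each script class in ODF.
_SCRIPT_ATTRS = {
    "latin": (f"{{{FO_NS}}}language", f"{{{FO_NS}}}country"),
    "asian": (f"{{{STYLE_NS}}}language-asian", f"{{{STYLE_NS}}}country-asian"),
    "complex": (f"{{{STYLE_NS}}}language-complex", f"{{{STYLE_NS}}}country-complex"),
}


def script_languages(text_properties):
    """Return {'latin': ..., 'asian': ..., 'complex': ...} for one element.

    Each value is 'lang-COUNTRY', just 'lang', or None when not specified.
    """
    result = {}
    for script, (lang_attr, country_attr) in _SCRIPT_ATTRS.items():
        lang = text_properties.get(lang_attr)
        country = text_properties.get(country_attr)
        if lang is None:
            result[script] = None
        elif country is None:
            result[script] = lang
        else:
            result[script] = f"{lang}-{country}"
    return result
```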

Conclusions

This level of transparency is something that all implementers will benefit from, and we're hopeful that other implementers will also share these kinds of details about their design decisions. It's useful to understand the guiding principles that each implementer uses to make such decisions, but implementers also need to understand the specific details. With this information in hand, an implementer can make informed decisions about how to provide users with the best possible interoperability experience.

Thanks to everyone who worked so hard in recent months to write, edit, review, organize and publish these implementer notes. The Office PM team has once again done a huge piece of work to help improve interoperability, and people from many other teams within Microsoft have also contributed to this effort. Great work, everyone, and I hope you all enjoy a much-deserved holiday break soon!

Comments

  • Anonymous
    December 16, 2008
    have just been published on the Document Interop Initiative (DII) site :- Welcome to the Microsoft Office

  • Anonymous
    December 16, 2008
    Do you have some numbers at hand?

      • How many implementation notes in total?
      • How many of them are descriptive (document how MS Office creates compliance)?
      • How many of them contain notes on non-compliance?

    Can we download the implementation notes in some form? For offline use mainly, to allow for reference when one is not in the Matrix ;-)

    Jan

  • Anonymous
    December 16, 2008
    Today, Microsoft published our first set of document-format implementation notes for the ODF implementation

  • Anonymous
    December 16, 2008
    This is really impressive, and a good answer to all the naysayers! Looking forward to that SP2 now.

  • Anonymous
    December 16, 2008
    Jan, we've not tried to characterize the quantity or type of notes because that can be debatable in many cases.  As for an offline copy, we don't currently have plans for that.  The web-site structure allows for threaded conversations on each post (which will go live soon), as well as additional notes for other standards/implementations, and we feel that will be most useful to implementers going forward.

  • Anonymous
    December 16, 2008
    "Is one implementation "right" about the number of pages in the ODF 1.1 spec, and the other implementations "wrong" about the page count? Not at all." Well, I think ANY Document File Format correctly implemented not giving the same page count with different computers, software products or printer drivers has a deficit in the specification! Those deficits are the main reason why PDF got so successful! Just because naturally everybody wants his document at the recipient to look the same way he sees it...

  • Anonymous
    December 16, 2008
    dmahugh: Thanks for the info. When will Office 2007 SP2  hit the market, BTW? I know many people are eager to try the ODF implementation ASAP ... ;-)

  • Anonymous
    December 17, 2008
    The comment has been removed

  • Anonymous
    December 17, 2008
    Stefan, I agree that precise rendering is a key benefit of fixed layout formats such as PDF and XPS.  Flow-oriented formats like ODF, OOXML and HTML usually don't specify rendering/layout details, however, so that's up to each implementation to decide.  (This is why specs are typically published in a fixed layout format, such as the PDFs used by ISO.)

  • Anonymous
    December 17, 2008
    Jan, Office 2007 SP2 will be out in the first half of 2009. Carlos, I don't think there's any particular reason, it just hasn't been a top priority.  We're working on getting the ECMA-376 notes out next.  If there's specific info you're trying to find, let me know.

  • Anonymous
    December 17, 2008
    Doug, The list of elements and attributes "not supported in core Word/Excel/PowerPoint 2007" is quite long. Can you tell us what will happen when Office 2007 encounters an unsupported element? Will it simply be ignored? When roundtripping, will it be deleted or preserved?

  • Anonymous
    December 17, 2008
    Jesper, On load, Office 2007 SP2 will simply ignore the unsupported elements and attributes in ODF files. We do not attempt to round trip unsupported elements and attributes; they will be removed from the ODF file if you resave it using Office 2007 SP2. This is consistent with our implementation principles and our desire to provide predictable behavior. We considered trying to roundtrip elements and attributes that we do not understand or support, but we found that if we did so, we could not be sure the resulting files were internally consistent and conformant ODF files. As an aside, there are some cases where we write elements or attributes on save that we do not support on load, for the sake of better interoperability with other applications that use ODF. Those cases are described in the implementer notes as well.

  • Anonymous
    December 19, 2008
    Last Open XML post on my blog before the year-end holidays, with the links that marked this

  • Anonymous
    January 14, 2009
    Interoperability Challenges I've started testing interoperability between various document-format implementations,

  • Anonymous
    January 16, 2009
    Today we've published another set of document-format implementation notes, this time for the ECMA-376

  • Anonymous
    January 16, 2009
    Very early today I woke up and read about this post that my good friend Doug had just written

  • Anonymous
    January 21, 2009
    Hi, my name is Jas Sandhu and I am an evangelist on the Microsoft Interoperability Strategy Team. I manage

  • Anonymous
    May 05, 2009
    Rob Weir posted on his blog a couple of days ago an Update on ODF Spreadsheet Interoperability . 

  • Anonymous
    June 05, 2009
    There has been quite a bit of discussion lately in the blogosphere about various approaches to document