ECMA-376 Implementation Notes for Office 2007 SP2

Today we've published another set of document-format implementation notes, this time for the ECMA-376 1st Edition implementation in Office 2007 SP2. As with the ODF 1.1 implementation notes we published in December, the goal of publishing these notes is to help other implementers improve interoperability with Office, by transparently documenting the details of our implementation.

To get to the ECMA-376 implementer notes, go to the DII home page and click on Reference and then select ECMA-376 1st Edition from the dropdown list. You'll then see a treeview control in the panel on the left, which contains the entire structure of the ECMA-376 spec.

You can drill down into any node to see that part of the spec. For example, here's a screenshot of the treeview expanded to see the nodes under Part 4, section 2.4, Tables:

[Screenshot: ECMA-376 specification structure on the DII web site]

Note the small red "N" next to most of the sub-sections under 2.4. Those markers indicate which sections have implementer notes, and in this example all of the sections have notes except for 2.4.12.

After you navigate down to a specific section, you'll see the full text of that part of the spec. This is handy for browsing the spec itself, and in the top right corner you'll also see these three buttons:

  • View Notes — This takes you to the implementer notes for this section.
  • Forum — This takes you to the forum for this section, where you can post questions and discuss details with other implementers.
  • Subscribe — This is an RSS feed where you can subscribe to updates to the implementer notes for this section.

Let's look at an example of the types of information you'll find in the implementer notes. As readers of this blog know, I'm a big fan of Open XML's support for custom XML markup, so I've chosen Part 4, section 2.13.5.11, which covers the customXmlMoveToRangeStart element in WordprocessingML. This element is at the heart of some fairly complicated interaction between two different concepts: custom XML markup and tracked changes. As it says in the first two paragraphs of section 2.13.5.11:

This element specifies the start of a region within which all custom XML markup was moved to this location in the document and this move was tracked as a revision. The id attribute on this element shall be used to link this element with the corresponding custom XML move destination end marker in the document.

Providing a physical representation of the start and end tags of custom XML markup results in regions which can be inserted and deleted independently, but cannot be encapsulated by a single revision element, since their representation in WordprocessingML is the start or end XML tag for the custom XML markup which it represents. Therefore, the start/end "cross structure" annotation format surrounds the WordprocessingML region to which this move destination applies.
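To make the id-based linking in that passage concrete, here is a minimal Python sketch that pairs start and end markers by their id attribute. The sample fragment, the author and date values, and the helper name pair_move_to_ranges are made up for illustration and are not taken from the spec or from a real Word file; only the element names and the WordprocessingML namespace come from ECMA-376.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace from ECMA-376.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Hypothetical fragment for illustration only; not produced by Word.
SAMPLE = f"""
<w:body xmlns:w="{W_NS}">
  <w:customXmlMoveToRangeStart w:id="0" w:author="Example" w:date="2009-01-16T00:00:00Z"/>
  <w:p><w:r><w:t>moved text</w:t></w:r></w:p>
  <w:customXmlMoveToRangeEnd w:id="0"/>
  <w:customXmlMoveToRangeStart w:id="1" w:author="Example" w:date="2009-01-16T00:00:00Z"/>
</w:body>
"""


def pair_move_to_ranges(xml_text):
    """Return (paired_ids, unmatched_ids) for move-destination range markers."""
    root = ET.fromstring(xml_text)
    id_attr = f"{{{W_NS}}}id"
    starts = {el.get(id_attr) for el in root.iter(f"{{{W_NS}}}customXmlMoveToRangeStart")}
    ends = {el.get(id_attr) for el in root.iter(f"{{{W_NS}}}customXmlMoveToRangeEnd")}
    # Ids present on both sides are linked; ids on only one side are dangling.
    return sorted(starts & ends), sorted(starts ^ ends)


paired, unmatched = pair_move_to_ranges(SAMPLE)
print(paired, unmatched)  # ['0'] ['1']
```

A consumer could use a check like this to spot a start marker with no corresponding end marker before trying to interpret the tracked move.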

Under the implementer notes for this section, you'll first find some references to other sections of the spec. For example, the first implementer note says "Click here to view additional notes in 2.13.5.20 ins (Inserted Run Content)" and if you follow the link you'll see this note:

[Screenshot: implementer note regarding Inserted Run Content]

So we're saying that the spec is slightly ambiguous in this area (sort of like the chart-series issue I talked about in my last post), and we're clarifying exactly what we've done in Word 2007's implementation. This helps other implementers understand Word's behavior, and they can use that information to improve interoperability.

There are many cross-links like this between various sections of the spec, because of the many relationships between different elements and attributes in the spec. There are also some full-text notes under the customXmlMoveToRangeStart element, such as these:

[Screenshot: customXmlMoveToRangeStart implementer notes]

I picked these particular notes because they're a good example of the variety found in the implementer notes. The first note above tells you that Word also applies customXmlMoveToRangeStart to structured document tags (or "content controls"); the second tells you that although the spec allows for overlapping ranges, Word doesn't support that; and the third acknowledges that Word doesn't predictably handle customXML move tracking in some situations involving equations in oMathPara elements.
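Since Word doesn't support overlapping ranges, a producer targeting Word might want to check for them before writing a file. The sketch below is one possible way to do that, under a stated assumption: it treats any range that starts while another is still open as overlapping, which also catches nesting. Whether Word treats nesting differently from partial overlap is not covered here. The helper names and sample fragments are hypothetical, and attributes other than w:id are omitted for brevity.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
START = f"{{{W_NS}}}customXmlMoveToRangeStart"
END = f"{{{W_NS}}}customXmlMoveToRangeEnd"
ID = f"{{{W_NS}}}id"


def find_overlaps(xml_text):
    """Return (open_id, new_id) pairs where new_id starts before open_id has ended."""
    root = ET.fromstring(xml_text)
    open_ids = []
    overlaps = []
    # Element.iter() walks the tree in document order.
    for el in root.iter():
        if el.tag == START:
            new_id = el.get(ID)
            overlaps.extend((open_id, new_id) for open_id in open_ids)
            open_ids.append(new_id)
        elif el.tag == END:
            end_id = el.get(ID)
            if end_id in open_ids:
                open_ids.remove(end_id)
    return overlaps


def body(inner):
    """Wrap hypothetical marker elements in a w:body element."""
    return f'<w:body xmlns:w="{W_NS}">{inner}</w:body>'


OVERLAPPING = body(
    '<w:customXmlMoveToRangeStart w:id="0"/>'
    '<w:customXmlMoveToRangeStart w:id="1"/>'
    '<w:customXmlMoveToRangeEnd w:id="0"/>'
    '<w:customXmlMoveToRangeEnd w:id="1"/>'
)
SEQUENTIAL = body(
    '<w:customXmlMoveToRangeStart w:id="0"/>'
    '<w:customXmlMoveToRangeEnd w:id="0"/>'
    '<w:customXmlMoveToRangeStart w:id="1"/>'
    '<w:customXmlMoveToRangeEnd w:id="1"/>'
)

print(find_overlaps(OVERLAPPING))  # [('0', '1')]
print(find_overlaps(SEQUENTIAL))   # []
```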

To document our implementation to this level of detail, we had to carefully read every section of the ECMA-376 specification. In doing so, we found some errors. For example, in Part 4, section 12.3.20, we found that the root namespace for the styles part includes a typo: an extra "s" at the end. In that case, we searched the published ISO/IEC IS29500 spec for the same error and found that it's still there, so we've submitted that to Ecma TC45 as a defect report, which TC45 will submit to SC 34 WG4 to be corrected in the maintenance of IS29500.

Any complex technical specification, Open XML included, is going to have some typographical errors like that one, as well as substantive errors. Similarly, any application implementing these specifications will have implementation-related issues that it needs to identify and work through over time.

We understand that users want to see interoperability between document format implementations in the marketplace and are taking the steps we believe any responsible vendor should take. This includes actively participating in the maintenance of the standards we support, identifying and addressing implementation issues going forward, and working collaboratively with other vendors to improve interop between products over time. People are welcome to point out where we may have issues as part of this overall effort between vendors (and customers). I think in the end our customers — and the broader interoperability community of implementers, users and standards participants — will be better off as a result of the things we're doing.

For a closer look at a specific implementer note that is useful to developers, check out the first post on Stephen Peront's blog: Implementer Notes Just Make Good Sense. Stephen's name may be familiar to those who followed the DIS29500 process closely, because he was a member of INCITS V1, the US technical committee that reviewed the specification. Stephen joined Microsoft just last week, and he's working with me on the Office Interoperability team. He's a great asset to our team, and you're going to see a lot more developer-oriented interoperability content on his blog going forward.

I'd like to thank all of the people here at Microsoft who have worked so hard to roll out these implementer notes. The coolest part of my job is working with so many talented and energetic people, and there are too many who played key roles in this project for me to dare to try to name them all. I'm looking forward to seeing the creative things developers will do with this information.

Comments

  • Anonymous
    January 16, 2009
    I am pretty excited about our release of the ECMA-376 Implementer Notes. These notes provide a wealth

  • Anonymous
    January 16, 2009
    Today, Microsoft released ECMA-376 implementation notes for Office 2007 SP2. These notes are an invaluable

  • Anonymous
    January 16, 2009
    We've published today a new set of Open XML implementation notes. The ECMA-376 Implementation Notes helps

  • Anonymous
    January 16, 2009
    Early this morning I woke up and read about this post, which had just been written by my good friend Doug

  • Anonymous
    January 19, 2009
    The news is not really new, since it was announced last Thursday (having been traveling

  • Anonymous
    January 19, 2009
    How can I distinguish incoming Office Open XML files between 1st edition and second edition/ISO versions?

  • Anonymous
    January 20, 2009
    Ambiguity and errors in the specification are essentially poison for implementors, since the Microsoft Open Specification Promise only covers a third-party implementation "... to the extent that it conforms to a Covered Specification ...".  Conformance becomes a risky and dangerous proposition for third parties, where protection against copyright infringement and/or patent claims vanishes if their interpretation of the specification is wrong.   I believe that Microsoft needs to explicitly extend its promise to clarify this case, as otherwise its promise sounds nice but is of very limited value to third parties.   --rebound

  • Anonymous
    January 20, 2009
    hAl, I understand that Switzerland has submitted an IS29500 defect report regarding that issue, and it will be a topic of discussion at the WG4 face-to-face meeting next week in Okinawa. Rebound, this is an interesting question that I assume applies to many standards implementations.  I’ll ask around and try to understand it more.  Do you happen to know how other tech companies like IBM and Sun address these sorts of issues?

  • Anonymous
    January 20, 2009
    @dmaghugh If it is not explicitly found in the spec, can you tell us how Office 14 will distinguish the versions? Office 14 is already available in an alpha version and should already be using both versions, I assume.

  • Anonymous
    January 21, 2009
    Hi, my name is Jas Sandhu and I am an evangelist on the Microsoft Interoperability Strategy Team. I manage

  • Anonymous
    January 21, 2009
    Doug, hAl, I agree with you both that this is quite essential and that we should take care of this as soon as possible. However, as I understand the ISO process, we (SC34 Working Group 4) will need to create a suggestion for a COR and have it approved by SC34. If this is correct there is sadly a time span of maybe 60 or 90 days before it can be approved. Maybe it can be approved by the SC34 plenary in Prague at the end of March? rebound: The /practical/ application of an OSP, ISP or CNS is to ensure that the legal foundation of licensing of whatnot technology has been formally written down. With this, implementers will start to dig through the covered specification and interpret it any way they see fit to be able to implement it. If there are ambiguities they will choose whatever "makes them feel good" and then get on with their work. These problems exist in all standards, and most implementations of ODF, PDF or OOXML will to some extent be "semi-nonconformant" since interpretation decisions will have to be made during implementation. The above is not to say that OSPs, ISPs or CNSs are not important and should not be required. I am just, to the best of my knowledge, not aware of a single implementer ever being sued for implementing e.g. ODF in a non-conformant way (or implementing DOC, for that matter).

  • Anonymous
    January 22, 2009
    For a commercial implementer, license uncertainty is a massive problem. For someone who implements software for a company, obviously not, as his company takes the risk. Microsoft did not show any good will to address concerns raised against their OSP. Contrary to what has been claimed, RAND-Z licenses are not offered as they were "not necessary". What matters is what the courts say. You cannot build an investment decision on shaky legal foundations. It is a simple application of Murphy's law to assume that potential legal risks are real.

  • Anonymous
    January 24, 2009
    Thanks for the responses to my query.   I'm taking the liberty of imagining myself as a manager in charge of a team that's implementing ISO29500, looking for the risks attached to that undertaking, and seeing if the available information and legal framework are sufficient to let me manage the risks.   @Doug: Sorry, I haven't researched the similarities or differences between Microsoft's position and other companies' positions regarding legal protections relating to standards conformance.   @Jesper: If Microsoft and the implementor have differences over the interpretation of a portion of the specification, Microsoft's significant legal resources, large and growing IP portfolio, and the specification ambiguity may generate a significant risk for the implementor.   (In general): While thinking about the responses to this issue, I've found it interesting to note that the GPL is a "bottom-up" approach to distributing documents with a disclaimer-based system for managing risks that could be incurred by the act of distribution, whereas the OSP is a "top-down" approach, not proscribing how the software should be written, but supplying risk mitigation through an evaluation of how the resulting object functions.   The functional approach has some interesting twists: Supposing a change by a supplier (e.g. an API change in an OS) results in a third-party's implementation becoming nonconformant: Where does any incurred liability lie?   Finally, another angle: If Microsoft's implementation doesn't conform to the specification, but varies in a way that is covered by a Microsoft-held patent, how can a software manager deal with the risks of this situation?  (This is a variant of the "bug-for-bug" compatibility issue that can sometimes occur when a dominant supplier's product becomes a standard.)     Thanks again for your consideration of these comments.   --rebound

  • Anonymous
    January 25, 2009
    Andre and rebound: I'm not a lawyer so I don't think I can add much to these sorts of theoretical discussions of IP concepts.  I will say that we believe our approach provides appropriate protections for real-world scenarios and is commensurate with what other vendors are doing in this area.  I'm curious, have you guys expressed similar concerns about the approach other vendors are taking?

  • Anonymous
    January 26, 2009
    @Doug, Thanks for the reply.  Underlying my position is basically "software consumer fatigue" -- I'm sick and tired of writing software to reinvent the wheel ever since I started programming in the late 1970s.  I want to be able to think mostly about my client's needs, rather than wrestle with changing interfaces to underlying services.  An example of this is that I've settled on using OpenGL as a portable 2-D graphics interface, as it gives me the flexibility to change hardware platforms and/or operating systems without much hassle.   On the subject of the OSP and ISO29500 conformance versus errors/ambiguity:  Rather than a vague notion of "IP concepts", I'm looking specifically at patents, and in particular wanting to clarify what the OSP means by the phrase "to the extent that it conforms".   @To all: Do you have any suggestions for an alternative place (another blog, perhaps?) where these discussions might be able to move forward?   --rebound

  • Anonymous
    June 05, 2009
    There has been quite a bit of discussion lately in the blogosphere about various approaches to document