Is it jetlag?

I know Rob Weir has been traveling a lot these days lobbying against Open XML across the world, so when I saw this post yesterday I assumed it must be jetlag. I think he completely misunderstood some of the responses from Ecma related to the issue of harmonization, and has missed some significant developments in this area over the past year. I already posted earlier about my thoughts around harmonization, and work that is already under way in the German Standards Body (DIN) to help guide the way. As I said previously, it appears OASIS is already discussing with DIN taking a more direct role in the Working Group, as indicated by this discussion between Florian from Novell and Rob Weir.

Here is what Rob had to say, though, which had me confused:

Ecma rejected every single one of these requests. Ironically, their response was that harmonization was not necessary because there exist tools that will translate between OOXML and ODF. However, since these conversion tools are restricted in their fidelity because of the lack of these very features, Microsoft's argument is rather weak.

On the question of harmonization, we are either moving toward it, or we are moving away. The Ecma response does not move us toward harmonization, but starts down the road toward further divergence.

But if you actually read the Ecma response, you'll see that TC45's position is quite the opposite. Harmonization is not as simple as just adding a few tags here and there. It's going to be a lot of hard work, and the German Standards Body (DIN) is already working on the first step, which is to identify the differences. This isn't something to take lightly.

Here is Ecma's full response to this issue (emphasis added):

There are currently several XML-based document formats in use, each designed to address a different set of goals or requirements. These include ISO/IEC IS 26300 (ODF), China's UOF, and ECMA-376 (DIS 29500 – Open XML). All these formats have numerous implementations in multiple tools and multiple platforms (Linux, Windows, Mac OS, hand-held devices).

The Ecma Response Document from the Fast Track 30-Day contradiction phase for DIS29500 addressed the question of harmonization by explaining the differences between the ODF and Open XML formats as follows:

"... one must recognize that creating a single "merged" format to address the user requirements of both ODF and OpenXML is a much more difficult goal—one that is hindered by fundamental obstacles comparable to what one might encounter while merging HTML and ODF or HTML and PDF. This is because of sheer difference of scope, feature and architecture. Ecma believes that one format cannot simultaneously meet the requirements that would come from the merge of the two formats and the stringent requirements of backward compatibility that drive the design of OpenXML.

First, while both formats share the high-level goal, to represent documents, presentations, and spreadsheets in XML, their low-level goals differ fundamentally. OpenXML is designed to represent the existing corpus of documents faithfully, even if that means preserving idiosyncrasies that one might not choose given the luxury of starting from a clean slate. In the ODF design, compatibility with and preservation of existing Office documents were not goals. Each set of goals is valuable; sacrificing either at the expense of the other may not be in the best interest of users.

Second, the resulting differences are not merely variances in scope that could be resolved by adding capabilities to one or the other. They are structural and architectural in nature. Where functionality overlaps, the corresponding elements nonetheless differ in precise meaning, usage, capabilities, options, and interaction with other elements. Even more importantly, the corresponding elements do not exist in isolation, but are components of whole document models, with different rules and constraints for such things as page/slide layout, flow, style inheritance, event processing, relative positioning, calculation order, formula dependencies, chart construction, graphic templates, animations, and so on. The resulting variations are not merely cosmetic. They compound to create qualitative disparities that, although perfectly acceptable for much of the user base, can be significant for organizations that require high fidelity in layout, content, or editability. Differences between the implicit page style model of ODF and the explicit page style model of OpenXML, differences in the models for splitting table cells, differences in the style information associated with spreadsheet cells, and differences in the full formula specification used in spreadsheets are only small examples of the hundreds of explicit design decisions that ensure the information included in the existing formats is represented faithfully in the OpenXML format."

There are many translation tools already in existence that enable interoperability between different formats by providing useful translation capabilities between ODF, Open XML and UOF.

We note that the German national standards body, DIN, has a committee, NIA-01-34 (see https://www.fokus.fraunhofer.de/fokus/fokus/presse/meldungen_fokus/2007/05/DIN-E.pdf), that is preparing a Technical Report on the translation of documents between the IS 26300 and DIS 29500 formats. The members of NIA-01-34 include format experts from a number of countries, working together to define the numerous differences between these formats.

Ecma strongly supports any harmonization effort that enables better sharing of information and allows better translation between the formats in the following way: Ecma believes that the work of the DIN (NIA-01-34) committee is essential to any harmonization effort. The work of DIN (NIA-01-34) will enable the industry at large to understand the detailed differences between the formats. Based on this detailed understanding, the ODF and Open XML formats could be extended in the future in order to enable better sharing of information and allow future translations tools to provide even better translation and interoperability between the formats.

Harmonization would require functional changes to two International Standards and would fall under the JTC 1 procedures for new work within SC 34 and could be done in the future. Such work should not be done in this Fast-Track process and should not impede the adoption of DIS 29500.

 

So, as I said, there are many approaches you could take toward harmonization. The key for any effort like this, though, is to first have a full understanding of the issues (in this case, identifying the differences); then you can start to design the solution. I hope that once Rob is done with his travels and anti-OpenXML lobbying (I hear the latest is a trip out to Asia to meet with some national bodies), he's able to get up to speed on the DIN work and, as the head of the ODF technical committee, joins in the work toward a better understanding of harmonization.

-Brian

Comments

  • Anonymous
    February 01, 2008
    Very eloquent post Brian. For those of us that enjoy reading your blog, this once again clears up the fog and confusion that is artificially created around these core issues. Miguel

  • Anonymous
    February 01, 2008
    The comment has been removed

  • Anonymous
    February 01, 2008
    Miguel, Thank you. :-)


Fiery Spirited, I don't think you get it fully. You seem to be in a camp that believes ODF is all you need and you can just add a few things on in addition. Please look back at the history here: http://blogs.msdn.com/brian_jones/archive/tags/History/default.aspx The two formats were developed in parallel (this is a fact). They had different design goals, and that's why they are fundamentally different. Harmonization is hard to even define at this point, let alone to act on. Merely joining the OASIS ODF committee wouldn't solve the issue. This is why a standards group has been hard at work the past year trying to understand the differences. Once that work is complete we can go on to the next step of defining what harmonization would mean. -Brian

  • Anonymous
    February 01, 2008
    The comment has been removed

  • Anonymous
    February 01, 2008
    The comment has been removed

  • Anonymous
    February 01, 2008
    The comment has been removed

  • Anonymous
    February 01, 2008
    What does it matter that the formats were developed in parallel? The best possible interoperability would be if the formats are merged. Why invest time in getting an ISO label for a format that your current applications do not truly support if you really mean you want interoperability? Maybe you know something about the legacy Office format that Sun and the rest of the world do not know, but until we see hard evidence there is little reason to trust your hypothetical speculation about ODF perhaps being hard to extend to suit Office. If we look at hard facts, Open Office's compatibility with Microsoft legacy formats is well documented, and Open Office uses ODF... so you are up for some pretty serious work if you want to prove that ODF is not capable enough. As for it not being enough that you join the OASIS committee, you are most certainly correct. Microsoft would of course need to put effort into their participation. Just like when you designed the OOXML standard draft, there is a choice between open and closed standards. The truth is that Open Office and Symphony have compatibility with Microsoft formats as one of their prime goals, so why would it be hard to gain support for making ODF more Microsoft friendly?

  • Anonymous
    February 01, 2008
    @Fiery Spirited, please don't confuse OpenOffice and Symphony the software products with ODF the format. The usual way OO.o maintains Microsoft Office fidelity is by saving back in the same format that is read. Going to ODF and then coming back to an Office format is not done with "pure" ODF, to the extent that it preserves round-trip fidelity with the original Microsoft Office document. On the other hand, they provide some useful ideas for implementers who want to support multiple formats. Meanwhile, I just ran into this post about the problems of harmonizing metadata systems: http://digital-scholarship.org/digitalkoans/2008/01/31/harmonization-of-metadata-standards/ The conclusion (about narrow fields of application) is also the only places where "ontologies" have been harmonized, and that is still difficult. We already know how difficult it is to obtain round-trip fidelity (a good test) between natural languages, and we are finding that digital formats, sometimes even trivial ones, are problematic. I am happy that we are in the process of developing some important practical experience in this area. It matters for the future.

  • Anonymous
    February 01, 2008
    The ODF TC hasn't even been able to finish adding some of ODF's original omissions (e.g., formulas) in the last two years. So do you really believe that it would be possible for them to add all of the functionality of Open XML to ODF quickly and easily? In other words, the people involved would move more quickly on that project than they're currently moving on their own goals, which have only produced two incomplete and non-standardized variations in the last two years? I'm having a real hard time imagining that.

  • Anonymous
    February 01, 2008
    Fiery Spirited writes: "The best possible interoperability would be if the formats are merged." Ok, so what is the first step in accomplishing that?  Well, it is to understand and document the differences in detail.  OOOPS!! That's the step that the DIN committee is doing, and that you are opposed to! Go back to square one.

  • Anonymous
    February 01, 2008
    Why settle with only these two standards when we could merge all the file formats into one, huge entity! World would be much simpler without confusing jpg's, mp3's, odt's, docx's, wmv's etc. Just one file format that all the applications support. Just add few tags and that should do it. ;)

  • Anonymous
    February 01, 2008
    The comment has been removed

  • Anonymous
    February 01, 2008
    I hate it when everybody and his dog comes along and says "Just merge it" as if these structural differences did not exist and as if this were a project that could be completed in less than 5 years! But still I dislike your position, like this: "is why a standards group has been hard at work the past year trying to understand the differences. Once that work is complete we can go on to the next step of defining what harmonization would mean." So what does this mean? What happens once these differences are known? (And I don't believe that they are not known at the moment!) Are there really ANY plans to invest billions of dollars into yet a third file format that combines ODF and OOXML? I think it is clear what the outcome of this group will be: the file formats are too different to merge, etc. And of course MS will not merge, because then performance drops and/or Office's internal data structures have to be completely rebuilt, or Office has to be rebuilt completely, which would be its doom. I think that additional features in OOXML would always mean that the Office team has to include new features in Office, e.g. if the page borders are now bitmaps, then Office has to add a dialog to select the bitmap. But Microsoft always keeps to a very tight set of allowed features, just as if a feature would cost them enormous amounts of money ;-) I really wonder if Office 2009 will contain many new features (and by features I don't mean that stuff from this century like SharePoint integration and new XML embedding functionality, but the kind of new features like in 1997: WordArt, new layout possibilities, DTP stuff like allowing text to float around graphics, initials, new page borders, nested tables in Excel and PowerPoint, more than 63 columns of text in Word, ...) that are created JUST for the purpose of changing the OOXML spec or for interop with ODF?
    I'm absolutely sure that in order to resolve these converter-problems, BOTH Open Office AND MS Office have to be extended to get new features. Does Microsoft have a budget to add features to Office that no user actually needs? Like custom page borders? (And it really stinks that Microsoft does not add stuff for fun any more. Office does not have too many features, but too few ;-) ) Sorry for rambling!

  • Anonymous
    February 02, 2008
    The comment has been removed

  • Anonymous
    February 02, 2008
    @jemm - what you are referring to is MS Office, which reads most of these formats and creates new ones to represent them internally.

  • Anonymous
    February 02, 2008
    I love the way that proposals to "harmonize" these two formats usually begin with (and I paraphrase) - “Microsoft, we would like to invite you to abandon the output of seven years of engineering effort that absolutely meets the needs of your customers and invite you to come over here to get involved with this thing that clearly doesn’t...” The conversation then goes on to dismiss a whole range of formats that have much wider adoption than ODF. The work with DIN makes a ton of sense.

  • Anonymous
    February 03, 2008
    The comment has been removed

  • Anonymous
    February 04, 2008
    Christian wrote: "I'm absolutely sure that in order to resolve these converter-problems, BOTH Open Office AND MS Office have to be extended to get new features." Yes. I saw the list of "missing" features for MS Office somewhere on the web. It was about a dozen quite small features (e.g., make the number of lines in orphans and widows user-specifiable). MS could easily slip these features into the next release of Office, but why should they? Their customers aren't asking for them! On the other hand, OpenOffice has to add features in two steps:

  • Add or change features to be compliant with ODF 1.0.  (Note: that is not a typo.)  The list is on the web.  I recall that it numbered about 300+ features.
  • Then add features to make it OOXML compliant.  Since it is about 10 years behind OOXML, I couldn't even guess the number of features required, but think in terms of $1B+ of effort. There's no way in the world that OpenOffice would even attempt it. Even if they had the money, they wouldn't do it.  Why?  Despite what OpenOffice people say, it's aimed at a different market segment than MS Office.  It competes with MS Works, not MS Office.

  • Anonymous
    February 04, 2008
    The comment has been removed

  • Anonymous
    February 04, 2008
    I noticed that Rob has yet to address or correct the issues pointed out in this post. Miguel

  • Anonymous
    February 05, 2008
    The comment has been removed

  • Anonymous
    February 05, 2008
    Frederik, It is always appreciated when someone brings a sense of reality into these discussions.  Too often, some commentators say things that show they have little experience in document formats, programming, or standards development. Thanks for the example.  Where is your blog?

  • Anonymous
    February 05, 2008
    Thank you very much, Ian. I'm just so tired of reading the same things repeated over and over again, without anyone checking the reality of all the different claims (both for and against OOXML and ODF). My blog is at http://fenilsen.wordpress.com but it's only in Norwegian for the moment. I'm considering writing in English too. You can find the presentations here: http://fenilsen.files.wordpress.com/2008/02/presentasjon-oo.odt http://fenilsen.files.wordpress.com/2008/02/presentasjon-ls.odt The first link is the one made in OpenOffice.org and the second is Lotus Symphony. You can rename them to .odp but it's not necessary.

  • Anonymous
    February 05, 2008
    "there is no way it would have happened in time for MS to release their product. " You've got to be kidding me, they rushed Office 2007 out the door just to do damage control. What they did was to INTENTIONALLY AVOID supporting ODF natively. It would however been a perfect test case, and a very valid demonstration of open engineering. What else did you expect from Microsoft? I hope you realize the Office cash cow brings more money than the Windows cash cow and that any serious dent in it could severely damage Microsoft health. Don't forget this anytime you suggest "Microsoft good intentions".

  • Anonymous
    April 09, 2008
    I'm heading home from Norway in the morning, but wanted to give a quick update on the progress made over

  • Anonymous
    April 09, 2008
    Alex Brown's post "ISO committee takes full control of OOXML" is the first report I've seen from the