More information on the Open XML translator and some questions answered

There were a lot of great comments from last week's announcement about the creation of an open source project to transform between the Ecma Office Open XML formats and the OASIS OpenDocument format. Rather than respond to all the comments and questions directly, I thought it would be better just to write up another post to address the general themes people have raised.

Here are the main questions:

  1. Will the translator only work with Office 2007?
  2. Aren't there licensing differences that make ODF and Open XML incompatible?
  3. Will the functionality be easy to find in the UI?
  4. Doesn't this move contradict what you've been saying about Office not supporting ODF?
  5. Will the Ecma Office Open XML formats still be the default in Office 2007?
  6. Why don't you join OASIS and help improve where they are lacking?

What versions of Office will this work with?

Well, first you should all remember that we are making the new Open XML formats backward compatible and providing free updates to Office 2000, XP, and 2003, which will allow all three of those versions to consume and generate files in the Open XML format. The new tool, which is now an open project up on SourceForge, will convert from the Open XML format into ODF, which means that you can use this tool in combination with the free updates to read and write ODF in all those earlier versions of Office as well.
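As a rough illustration of what a converter has to tell apart before it does anything else, here is a minimal Python sketch that guesses whether a ZIP-based document is an Open XML package or an ODF package. It is not taken from the translator project; the function names `detect_package_format` and `build_package` are hypothetical, and the check assumes the usual package markers (a `[Content_Types].xml` part at the root of an Open XML package, and a `mimetype` entry naming an `application/vnd.oasis.opendocument.*` type in an ODF package).

```python
import io
import zipfile


def detect_package_format(data: bytes) -> str:
    """Guess whether a ZIP-based document is Open XML or ODF (illustrative only)."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        names = set(zf.namelist())
        # Assumption: Open XML packages carry a content-types part at the root.
        if "[Content_Types].xml" in names:
            return "Open XML"
        # Assumption: ODF packages carry a 'mimetype' entry naming an ODF type.
        if "mimetype" in names:
            mime = zf.read("mimetype").decode("ascii", "replace").strip()
            if mime.startswith("application/vnd.oasis.opendocument."):
                return "ODF"
    return "unknown"


def build_package(entries: dict) -> bytes:
    """Hypothetical helper: build an in-memory ZIP from {name: content}."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, content in entries.items():
            zf.writestr(name, content)
    return buf.getvalue()


# Tiny stand-in packages; real documents contain many more parts.
ooxml_like = build_package({"[Content_Types].xml": "<Types/>"})
odf_like = build_package({"mimetype": "application/vnd.oasis.opendocument.text"})

print(detect_package_format(ooxml_like))
print(detect_package_format(odf_like))
```

A real converter would of course go on to map the XML inside each package; this only shows the first step of recognizing which format it has been handed.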

Aren't there licensing differences between ODF and Open XML?

Actually, this misunderstanding is the unfortunate result of a really strong push by folks who I don't believe quite understand the Open XML story. There are a handful of folks who blog a lot (primarily ODF supporters) who aren't up to speed on the latest policies around the Open XML formats.

Let's address this first misunderstanding. The formats are available without any licensing restrictions. Any IP (patents, etc.) that Microsoft may have behind the formats does not apply to folks who want to implement the formats, because Microsoft made a legal commitment to not enforce that IP. If you hear people complaining about licensing issues, they probably just aren't up to date.

Secondly, the formats are no longer owned by Microsoft; they are owned by Ecma International. They are fully documented and the spec is free to download. A large number of organizations (British Library, Apple, Novell, Microsoft, BP, Intel, etc.) have worked on ensuring that the documentation allows for cross-platform implementation.

Will the functionality be easy to find in the UI?

Look for yourself, here's a screenshot of the current prototype:

It's directly exposed in the UI. We're even going to make it really easy to initially discover the download. We already need to do this for XPS and PDF, so we'll also do it for ODF. There will be a menu item directly on the file menu that takes you to a site where you can download different interoperability formats (like PDF, XPS, and now ODF).

Heck, if you wanted to be even more hardcore, the Office object model allows you to capture the save event. So if you wanted to you could make it so that anytime you hit save you always used the ODF format, just by capturing the save event and overriding it. I'm not expecting folks to do that, but it does show just how extensible Office really is.

Doesn't this move contradict what you've been saying about Office not supporting ODF?

I've been pretty clear that I thought third parties would come along and build ODF support into Office if there was interest. That was shown to be the case pretty early on, as there have been a couple of different projects announced over the past year. Ironically, one of the most high-profile projects was announced by the OpenDocument Foundation, but it has turned out to be pretty secretive and closed, which seems to go against all the goals of "openness". I've had folks ask me how they can get hold of it, but as far as I can tell only a select group of folks have been given access. I saw a quote saying they still hadn't decided if they wanted to charge for it or not, so that may still be holding things up.

With all the mystery around projects like that, we had a number of governments ask us to get involved and actually choose a project to back, as they wanted to know that if any of their constituents used the ODF format, they would be able to view those files.

I think this project is a great example of the openness of both of these formats. We are now going to have an open source implementation that everyone can use. It will of course be freely available to anyone, and will really help show how to use the two architectures of ODF and Open XML.

Will the Ecma Open XML format still be the default for Office 2007?

Yes, this is definitely still the case. While this new translator will help people read and write the ODF format in Office, it will also help make it clear to all why the Open XML format was necessary. The Open XML formats were designed to be 100% backward compatible with the existing set of Office binary formats, and that was really a goal that we can't compromise on. If we went with an XML format that resulted in data loss or poor performance, then the only people that would use it would be folks who actually cared about that specific file format. Since most of our users don't really care about file formats, we needed to create an XML file format that we knew everyone could use, otherwise most people would have just gone back to using the old binary formats, and that doesn't help anyone.

While the ODF format is great in terms of being an open XML format, it's lacking in a number of functional areas, which makes it an unrealistic option for Office to use as a default format. For instance, the format for ODF spreadsheets is much less efficient than Open XML's spreadsheet format. I have a few posts talking about this (and I plan to cover it in greater detail as we move forward):

  1. Design Goals behind SpreadsheetML
  2. Spreadsheet performance - Shared Formulas
  3. Does tag size matter?

There are also a whole host of areas that are left unspecified in the spec (such as spreadsheet formulas), which would have meant we'd either need to extend the format, or wait for it to catch up (and it sounds like they are more than a year out for formulas in particular). There are a number of blog posts out there talking about the incompatibilities between the various applications that have implemented ODF, and a lot of that is due to the lack of clarity on some features in the spec. Look at this comment from the OpenDocument Foundation talking about KOffice's ODF support:

"Our tests show that OpenOffice and KOffice have some problems opening each other's OpenDocument files. Also, support for drawings is a bit incomplete."

The Ecma Open XML format is significantly further along in all of these areas; just look at the differences in the documentation of numbering formats, formulas, etc. The draft of the Ecma spec released back in the spring has over 160 pages on spreadsheet formulas; the ODF spec only has 1 page.

I don't want to be critical of ODF because I think it's great to see applications use open XML formats for their storage. I'm calling attention to these points because I think a lot of folks have mistakenly assumed that once there is a standardized office format, everything is set and you don't need another one. Unfortunately that's not the case, and I want everyone to understand why we couldn't use ODF as our default format. I have no problem with multiple XML standards for documents and I think this is definitely a case where an alternative is necessary. If a single XML file format were the way to go, then we would have just stopped with XHTML (or maybe DocBook).

Most of our customers actually do understand this, and contrary to the news being spread (primarily by people excited about the ODF format), most governments have not adopted policies around ODF exclusively but instead around open formats in general. Most of those governments have also expressed that once the Office Open XML format is approved by Ecma, it would also be viewed as an open format.

For example, the Belgian government is currently being described as "mandating ODF", but that's actually not the case. They even made a public statement last week, after we made the translator announcement, that clarified this. Here's a small blurb from that:

"The government’s choice for ODF is clear, but not exclusive." ..."If the OpenXML file format (Microsoft’s own contribution in the domain of open standard file formats) receives ISO approval as a standard, then this format will also be eligible for use in the administration of the Belgian government."

Why don't we join the OASIS technical committee to help them along?

I had a few folks asking this question (and saw it on a few other blogs as well). The standardization of the Ecma Office Open XML formats is really moving along well, but there is still a bit more work to do here to nail things down. If you've read through the latest draft (all 4000 pages), you've probably noticed how comprehensive a spec it really is. For example, there are over 160 pages on how spreadsheet formulas work, as compared to 1 page in the ODF spec. The ODF spec still has a lot of catching up to do, and according to this post they are still more than a full year from just getting in line on some of the basics (like formulas) that have existed in office documents for decades.

The Ecma Office Open XML spec on the other hand serves as a great base in terms of fully standardizing an XML format that is capable of representing the billions of Office documents that exist today. Once that's done, we (as a community) can then move forward and start to enhance it with new innovations. It's maintained by Ecma, and anyone can join and participate in the standard.

I think that anyone interested in helping to drive the future of office file formats should join us in Ecma and take advantage of the powerful framework for document formats that is being delivered. As I already pointed out, spreadsheet formulas, for example, are already close to being fully documented. The same is the case for all the international features and functionality (like the various numbering styles I'd mentioned before). If you don't have the time to participate directly in the working group, you can instead send direct feedback here: mailto:ecmatc45feedback@ecma-international.org

-Brian

Comments

  • Anonymous
    July 10, 2006
    Why did you guys put the prototype up on SourceForge instead of CodePlex?

  • Anonymous
    July 10, 2006
    "...a lot of that is due to the lack of clarity on some features in the spec. Look at this comment from the OpenDocument Foundation talking about KOffice's ODF support:
    'Our tests show that OpenOffice and KOffice have some problems opening each other's OpenDocument files. Also, support for drawings is a bit incomplete.'"

    You make it sound like this is unusual or even unwanted.

    IE still has some problems displaying HTML4 and CSS2, despite the fact that those specs are over 8 years old.

    With all specs, be it document formats, image formats (e.g. PNG partial transparency support took a while to appear on a number of platforms IIRC), programming languages (C99 now 7 years old, still not fully supported by GCC or MS CC last I looked), it does take some time for interoperability issues to be worked through, for all implementations to agree on what is the correct interpretation on unclear parts of the spec, and for corrections, addendums, defect reports, etc... to be made to the spec itself.

    A /necessary part/ of making sure interoperability /works/ is actually having multiple independent implementations attempt to interoperate, so that these issues can be found, raised and worked through. (Specs are like programs - they're never entirely bug-free.)

    Tell us again, how /are/ the independent implementations of the complete MSOOX spec coming along? How are these issues being found in /those/ 4000+ pages?

  • Anonymous
    July 10, 2006
    You are not serious about that UI, right?!? The right way to do it is to just put it into the "normal" open dialog, add it as a format selection in the "save as" dialog, and allow us to pick the default format for saves in the options.

  • Anonymous
    July 10, 2006
    As I blog on extending the Office 2007 UI (http://pschmid.net), let me address the UI issues....
    davidacoder: Office 2007 gives an add-in (whether from MS or not) either the option of taking over the "Save As" dialog completely (which means that the built-in Save As dialog never shows, only the one you provide) or adding an item somewhere in the Office button menu (the approach chosen by the add-in). As this functionality is not built into Office, what you are asking for is not possible.
    If someone wants to use ODF only, it is fairly straightforward to replace the existing New document, Open document and Save As dialogs.

  • Anonymous
    July 11, 2006
    Pet peeve. I run an older laptop on Win98SE. Now the XML adapter program can run on Office XP, which is the last version of Office compatible with 98SE. However, when I downloaded it, it wouldn't install on Win98. I can understand why we shouldn't expect 98-specific upgrades in this regard. But why wouldn't the program work with any valid installation of Office XP, including those on Win98, since I suspect not a few of the Office XP users are running that OS? With older laptops like this, upgrading is not an option, and not all of us can just run out and buy a new laptop on Microsoft's command.

  • Anonymous
    July 11, 2006
    Well, then MS should enable plug-ins to do that. If they are going to ship two themselves, it would really make a lot of sense. Also, could the ODF item at least appear on the sub-menu of the save as thing? Making it a top-level thing in the Office menu is just terrible and will be an incredibly bad example for other add-ins. From that point on every add-in will add a major entry to the main Office menu and will always be able to point to MS, since they did that with their add-in as well...

  • Anonymous
    July 11, 2006
    Patrick: if what you say is true, then how is it that:
    1. installing the Office 2007 Beta suddenly gives me a raft of new open and save options in Office 2003?
    2. adding/removing import/export converters in Office 2003 also changes the file type list in these dialogs?

  • Anonymous
    July 11, 2006
    davidacoder: You could put it into the Save As list. But where would you put the open ODF file then?

    Francis: Regarding 1. Did you install the Awareness Update? That should modify the list to include the new file formats (OpenXML). Keep in mind though that the Awareness Update is a patch to Office 2003 and not an add-in.
    2. Good question. Maybe if you write it as a file converter instead of as an add-in, you can do this. But written as an add-in, you'll have to go the RibbonX route, which doesn't give you access to this list. Let me see if I can't find some documentation on this issue.

  • Anonymous
    July 11, 2006
    You haven't really answered the question about joining the TC, using it instead to digress into promoting OXML.

    In the previous post, you asked a series of questions about interoperability to the blogosphere. My point was that it makes much more sense to address these concerns and questions directly with the ODF TC.

    On another point, I really think you ought to avoid the constant bigger-is-better comments and references to incompatibilities between ODF applications. On the first point, one reason ODF is significantly smaller is because it heavily reuses existing standards.

    On the latter, OXML has yet to have a single shipping application, much less multiple. When you ship and another vendor implements full OXML support and you have seamless interoperability, then you can make a reasonable argument. But until then it's FUD.

    Finally, I'm sure you realize that the standards process is slow, so I'm not really sure it makes much sense to spread disinformation here. The three current subcommittees whose work will go into ODF 1.1 and 1.2 will finish their work in less than a year, and it'll then take a few months to go through the approval process. So what? How long will it take TC45 to finish the spec, get it approved, and then go through ISO?

  • Anonymous
    July 11, 2006
    Patrick, I understand that you don't have much choice here. But at the same time, I believe MS should enable an add-in way to do this properly, in particular when they intend to give official blessing to this project. Do we agree that from a UI and consistency perspective the right thing to do would be to integrate into the existing open and save dialogs, and have no extra top-level menu item? Then MS should enable that for your project.

  • Anonymous
    July 12, 2006
    Bruce, I don't think I'm spreading FUD at all. Michael Brauer says in that link that the committee draft is still 1 year away and that OASIS will not vote on it until at least October 2007. Then it will most likely take another 6 months to go through ISO.

    I made that statement because from what I can see, both OpenDocument and OpenXML are incomplete specs. You can't possibly think that OpenDocument is a complete spec when it doesn't even cover the basics like formulas and more advanced international features. I haven't claimed that Open XML is complete yet either, but it looks like it's much further along. The current goal for Open XML is to have it an Ecma standard by the end of this year (as opposed to the end of next year for OASIS to vote on ODF).

    And in terms of shipping implementations of either spec... as I've said, I'm not claiming that Open XML is complete yet, so I'm not expecting there to be many shipping implementations. We've seen a number of prototypes built by people in the working group (Novell, Toshiba, Essilor).
    The shipping "implementations" of OpenDocument aren't really much better though. Every document I've played with opens differently in the different applications. I even had a document that when saved from KOffice as ODF caused OpenOffice to crash. So I would hardly say those are good examples of shipping products supporting the standard (outside of OpenOffice and those applications that took OpenOffice's file I/O code, the rest are more like prototypes).

    -Brian

  • Anonymous
    July 12, 2006
    Brian,
    You said that several people have asked the question "Why don't we join the OASIS technical committee to help them along?" but you don't really give an answer. The nearest you come is "I think that anyone interested in helping to drive the future of office file formats should join us in Ecma...".

    Can you answer the question?

    Simon Jones
    Contributing Editor
    PC Pro Magazine

  • Anonymous
    July 13, 2006
    The answers to that question (why doesn't MS join OASIS and influence the standard so they can use it more effectively within Office?) seem painfully obvious to me.

    1) Tying themselves to an "open" standard with different objectives for a file format than Microsoft has is foolish.  ODF is designed to be very human-readable, and is designed specifically around OpenOffice.org's formats (not Microsoft's).  The design goals of ODF do not apparently always match those of Microsoft.  Should Microsoft try to subvert ODF so that it fits their goals more closely, or should they develop their own standard and allow users to choose which one is more effective for their use?

    2) Some of the more powerful members of OASIS vis-a-vis ODF are not exactly friendly with Microsoft, and develop competing products.  Why would Microsoft want to subject themselves to the huge contention that would almost doubtless arise in such a situation as vendors struggle within the committee to control the format to their own benefit (or the detriment of their competitors)?

    As for why they didn't join OASIS and help out, even if they had no plans to depend on the standard, what makes you think Microsoft is particularly interested in helping their competitors?  ODF is, in reality, to the benefit of Microsoft's competition.

    My personal feeling is that competing document format standards, if they are open, are a Good Thing.  That way, everyone has the opportunity to choose the format (or formats) that best suit their needs.

    But hey, I'm just a guy who uses Office.  I'm no expert.

  • Anonymous
    July 13, 2006
    Thanks Andrew, I think you really nailed a couple of points there.

    Bruce, Wouter, Simon,
    I'm sorry if it appeared like I didn't answer the question. I thought I had actually made it clear why we weren't participating directly in the OASIS committee, but let me try to clear it up.

    We ultimately need to prioritize our standardization efforts, and as the Ecma Office Open XML spec is clearly further along in meeting the goal of full interoperability with the existing set of billions of Office documents, that is where our focus is. The Ecma spec is only a few months away from completion, while the OASIS committee has stated they believe they have at least another year before they are even able to define spreadsheet formulas. If the OASIS Open Document committee is having trouble meeting the goal of compatibility with the existing set of Office documents, then they should be able to leverage the work done by Ecma as the draft released back in the spring is already very detailed and the final draft should be published later this year.

    To be clear, we have taken a ‘hands off’ approach to the OASIS technical committees because:  a) we have our hands full finishing a great product (Office 2007) and contributing to Ecma TC45, and b) we do not want in any way to be perceived as slowing down or working against ODF.  We have made this clear during the ISO consideration process as well.  The ODF and Open XML projects have legitimate differences of architecture, customer requirements and purpose.  This Translator project and others will prove that the formats can coexist with a certain tolerance, despite the differences and gaps.

    No matter how well-intentioned our involvement might be with ODF, it would be perceived to be self-serving or detrimental to ODF and might come from a different perception of requirements.   We have nothing against the different ODF committees’ work, but just recognize that our presence and input would tend to be misinterpreted and an inefficient use of valuable resources.  The Translator project we feel is a good productive ‘middle ground’ for practical interoperability concerns to be worked out in a transparent way for everyone, rather than attempting to swing one technical approach and set of customer requirements over to the other.

    -Brian

  • Anonymous
    July 13, 2006
    Brian,

    Thanks for the clarification. I think you addressed the question squarely there.

    Simon Jones
    Contributing Editor
    PC Pro Magazine

  • Anonymous
    July 13, 2006
    Thanks! This really helps to see it from your point of view.

    OK, so the short answer would be: the effort to bring Office 2007 and ODF together would have been too great, and it probably would have resulted in a PR nightmare.

    It seems your answer is more mature and objective than before. I can accept this answer better than something like "ODF is bad", because I really don't think ODF is bad. It's not perfect and it doesn't have perfect implementations, but I believe the direction in which ODF is headed is better.

    ODF is a more procedural, generic, simple and vendor-neutral format than Open XML. Open XML represents the structure of Office 2007 too much, which from a development point of view is of course very logical and very efficient. You have a consistent object model, graphical model and document model which all fit together nicely. And that is pretty handy if you want Office 2007 and not Office Forever ;).

    Some questions:
    - How much did the object model for Office 2007 change if you disregard all the new features?
    - Is the object model for Office 2007 similar to the structure of Open XML, and if not, what are the differences?
    - Will Microsoft try to bring ODF and Open XML together in the future, or will this never be a goal?
    - Is it possible to incorporate a Microsoft converter which converts Office binaries to Open XML into an open source application?
    - How does ODF affect your work? Are you happy with its existence?
    - Can we expect more Microsoft standards to open up and XMLize?
    - Do you understand the scepticism against Open XML?

    And keep up the good work!

  • Anonymous
    July 14, 2006
    "OK, so the short answer would be: the effort to bring Office 2007 and ODF together would have been too great, and it probably would have resulted in a PR nightmare."

    Not to mention the extra contention would likely slow down the ODF process tremendously.

  • Anonymous
    July 14, 2006
    Stephen McGibbon has an interesting blog post based on his observations of the politics behind standardization....

  • Anonymous
    July 16, 2006
    As is well known, XML is a format independent of platform and language, and the adoption that has been emerging since...

  • Anonymous
    July 19, 2006
    I looked into the question of whether it would be possible to show the ODF formats in the Save As dialog file format list. The information is straight from the Office 2007 beta team, so I am quite confident that there are no other options.
    1. The SaveAs dialog cannot be customized
    2. The only option to provide a SaveAs dialog with the ODF formats in it is to replace the built-in dialog with a custom created one (using RibbonX e.g.). This is not an option for this add-in. Why? Read my style guide on RibbonX customizations: http://pschmid.net/blog/2006/06/09/20
    3. Writing this as an import/export converter is not an option either. The only program for which a custom filter can be written is MS Word. However, such a filter would convert between the external format and RTF, which definitely misses the point of the ODF translator project. For more info on this, see http://support.microsoft.com/kb/111716/en-us
    4. The only viable solution is therefore to provide the formats as items in the Office button menu. The current implementation as its own menu item is in my opinion the best approach. Adding ODF to the Save As flyout would bring up the issue of where to place the Open ODF command. By having open and save combined under one menu item, users can find and use both commands easily.
    5. If anyone wanted to use ODF as the default format, it is fairly straightforward to change the RibbonX code to have the ODF open & save features be the ones called when using the built-in open & save. The built-in ones could then be moved to a menu item similar to how ODF is currently implemented. This is probably a change of a max of 10 lines of code.
    6. As I have stated before, the ODF translator is an Office 2007 add-in and therefore limited to whatever is available to all other Office 2007 add-ins. Microsoft isn't going to provide any special MS-only hooks into Office 2007 just to integrate the ODF, PDF & XPS things into it. MS-only hooks would just be a waste of development resources and non-MS add-in developers would want to use them as well (which would be simple due to the open source nature of the ODF translator project). Hence, this is either going to happen through a normal Office 2007 add-in mechanism, or not at all.
    7. I personally highly doubt that customization of the SaveAs dialog will appear in Office 2007 and I am not in favor of implementing it either. There are many other aspects of UI customization that should be addressed long before doing any work in the SaveAs dialog area.
    I hope this addresses all UI related issues of the ODF translator add-in.

    Patrick

  • Anonymous
    July 20, 2006
    Hi Steve, that analysis really seems to be a bit premature (to say the least).

    The project has just started, and I'm not surprised that it's easy to identify problem areas. I suppose it could have been worked on more in private before it was posted up on SourceForge, but the whole point was to give great transparency to the project at an early stage. This way people can directly participate in the development.

    Let's actually hold off on judgement as to the quality of the output until it's further along. But if you see any areas that need work, feel free to post them to the site (or you could even work on it directly)!

    -Brian

  • Anonymous
    July 25, 2006
    I think what everybody needs to understand is that Microsoft has an extremely bad (and deservedly bad at that) reputation for interoperability. Their attention to vendor lock-in and FUD is legendary. Only the IBM of the past even comes close. If you can't admit that, then you're drinking too much of the Kool-Aid.

    Until such time that they actually prove repeatedly that they are serious about interoperability, nobody is going to give them the benefit of the doubt. End of story. This obvious UI decision seems to signify that they aren't going to do that soon.

    And for those of you saying that this is alpha software, and that it will change due to customer feedback, what's the obvious choice from a UI designer's standpoint? To a) have every plugin that saves another format have its own flyout menu in the main file menu, or b) just add it to the save as list? If you seriously answered a, then I think you need some lessons in UI design!

  • Anonymous
    August 28, 2006
    The new ODF to Open XML Translator Project has been getting a lot of attention lately. This is a collaborative...
