Update on Open XML’s ISO progress

I wanted to provide a bit of an update on how things are going in TC45 as we look through the various National Body comments that came in with the Open XML ballot. The ballot resolution meeting (BRM) is going to be the last week of February, and on January 14th the editor of the spec is tasked with pulling together responses to all the issues that were raised. There are 3,522 comments in total, but when you group them into similar buckets it narrows down pretty quickly into a more manageable list… but still pretty impressive!

In TC45, we've been hard at work helping to sort through all the issues and come up with good resolutions based on all the feedback. It's a lot of work, but it's really progressing well. There were some really good suggestions, and I think we'll see that this round of review will result in an even better spec than we had at the end of last year.

Just this weekend, we posted the first collection of responses to the 3,522 issues. https://www.ecma-international.org/news/TC45_current_work/First%20group%20of%20662%20proposed%20dispositions%20of%20comments%20posted.htm

There are currently 662 responses, and the plan is to provide updates to this list every few weeks. We still have almost 2 months until the deadline, but given that we have a lot of issues to work through, we thought it would be best to provide the responses earlier than the January deadline to allow more time to discuss the issues.

One thing I was really hoping we'd get to do was provide a public view of the progress, but ISO rules are that the national bodies' comments and responses should be kept private, to be viewed only by the other national bodies. I know there have already been some public postings of the comments, but since we want to follow the ISO rules, access to the actual responses and the list of original comments will have to be restricted to just the National Bodies. I would think that at some point the access information will get out though, and maybe at that time folks will decide to just open it up and allow everyone to view the progress (I hope).

So far I think we're doing a pretty good job of doing what the national bodies are asking for. Most of the comments were accompanied by a proposed resolution, and most of them are great suggestions, so our response back is often that we'll do exactly what they are asking for.

There is still a long way to go leading up to the meeting in Geneva. It's been fun to get back into the swing of things with the other TC45 members, though. We had a bit of a break last spring, but have been hard at work since the comments started pouring in over the summer.

-Brian

OpenXMLCommunity.org Quote of the Day:

Alcuadrado S.A. – Colombia

"In today's world, there is a variety of standards for each technology. And in the document, spreadsheet and presentation physical storage formats it's the same with ODF, HTML, PDF and Open XML. We consider that's convenient that we could choose which one depending on the task at hand. OPEN XML should become and ISO standard as a very complete, open and documented standard."

- Andres Fontan – Chief Architect

Comments

  • Anonymous
    November 19, 2007
    I guess IBM will have set up its team of scrutinizers to dissect the responses, and then I would expect Rob to launch a series of critical articles, probably starting close to 14 January and running up to the BRM. That is a timeframe in which the issues he mentions cannot be dealt with in Ecma responses before the BRM. Too bad that the responses are not publicly available, because as a very interested observer I would have liked to see them. But Brian, even though you might not be able to share the Ecma responses, could you comment on my guesses at some of them?

      • Borderart => moved to annex and in a vector graphics format?

      • VML => moved to annex?

      • Legacy compatibility items => either more info on them, moved to annex, or substituted by a generic solution

      • Spreadsheet dates => added support for ISO (subset) dates, made date_1900 format deprecated?

      • Bitmask items => identified relation to ISO standard 14496-22 and/or Panose, changed values to decimal or other appropriate values.

      • All examples made to validate using XML schemas?

      • Spreadsheet bugs as mentioned in http://blogs.msdn.com/brian_jones/archive/2007/07/12/spreadsheet-formula-bugs.aspx all corrected? ...

  • Anonymous
    November 20, 2007
    Hi Brian, are ECMA responses really already published? I see only the following files on ECMA site:

      000 txt No Title 0
      004 zip No Title 1385
      003 xls Combined Comments on ISO/IEC DIS 29500 from all MBs 1719
      002 pdf ISO commenting template - Electronic balloting application 225
      001 pdf DIS 29500 Project Editor's Report: 2007-10-01 488

    None of them contains responses to NB's comments.
    Jirka

  • Anonymous
    November 20, 2007
    >> I wish ODF had undergone such scrutiny. They didn't even have a BRM to address the comments that folks had raised.

    Come off it, Brian. You know full well why ODF had no BRM: because it didn't need one! BRMs are not required for standards that pass the ISO vote unanimously; 23-0 in ODF's case. The comments that were generated for ODF - a fraction of your "impressive" count for OOXML - all came from "folks" that had voted Approve With Comments.

    You can pretend that OOXML's rockier passage through ISO is down to IBM's blocking tactics all you want, but you can't hide the truth. OOXML is simply too broken to become an ISO standard. The kind of fixing that you're attempting now should have been done by Ecma before submission for ISO Fastrack.

    And if you seriously do manage to fix all the comments that were raised by the ISO process, then you'll end up with a spec that will be very different from the one you first submitted. And to where can you point for reference implementations for this new beast?

    Cheers,
    Mike

  • Anonymous
    November 20, 2007
    Mike, that's not a row of a table in the Open XML example you gave.  I'm not sure what went wrong here, but something that starts with a <w:t> tag isn't a row of a table, as anyone working with Open XML knows. If you'd like to post the actual markup of what you're talking about (or better yet, links to two documents), I'd be glad to discuss the details, but in this case you've posted two different portions of these documents.  They don't even have the same text in them, so it's pretty hard to say anything meaningful about how they compare.

  • Anonymous
    November 20, 2007
    Mike, a quick look at the spec will tell you that rsidR is an optional property applications may place onto a run of text that allows them to label unique points in time when an edit was made. This allows for documents that get forked to be easily merged together in the future (you can tell if something was added to one as opposed to deleted from the other one). This functionality does not exist in ODF, so that's why you don't see it in the files.

    The value w:eastAsia="ar-SA" is using the ISO definition for languages to specify what the language of that text is. Your document has the language directly applied to that text, so it gets saved out into the file that way.

    Also, if the user decides to use character and paragraph styles, rather than direct formatting, you'll have the styling information separated out from the content. In your example you had some direct formatting, so that shows up directly on the run. In order to parse the ODF file, you'll also need to look at the character properties defined towards the top of the file, so you should take that into account.

    If you look at part 4 of the Ecma spec, it gives you a very detailed reference of every element and attribute. You can quickly find the tags you aren't sure about and you'll see a description.
    -Brian

  • Anonymous
    November 20, 2007
    Mike, where did you save the files from, i.e. MS Office 2007 for the OOXML file and OpenOffice for the ODF file? Or did you use OpenOffice with a plugin to save the OOXML file? What I'm basically asking is whether the output is from an Office-generated source or not.

  • Anonymous
    November 20, 2007
    @Ricus It was either OpenOffice 2.3 or Lotus Symphony Beta 2 to save the ODF file. For the OOXML file, it was MS Word 2003, with the OOXML add-on pack that I downloaded (from Microsoft, I think).
    Cheers,
    Mike

  • Anonymous
    November 20, 2007
    @Mike: Uh, why are you looking at the XML of the document, anyway?  Don't you have a good editor for it, like Office, or iWork, or Abiword?  OOXML comes with a lot of legacy, a lot of features, and a design that's geared towards expressiveness and performance.  It is clearly not designed to be edited by hand in any major way.

  • Anonymous
    November 21, 2007
    Mike, OpenXML allows you to put character formatting directly on a run, or you can define a style and reference it that way. In ODF, you must declare a style for the character formatting (even if the style may be somewhat "fake"). So the file you are looking at has some direct formatting, and the application you chose (Word 2003) decided to write that direct formatting onto the run rather than using a style. That's an application decision though, and it could have written the content out in a very similar way to the ODF file. With ODF, if you have a run of text that you want to apply formatting to, you have to first create a style in a separate location of the XML markup, and then reference that style from the run of text. So you should also take a look at the properties for style "P96" and you'll probably see similar values for kerning and language (unless ODF doesn't support that). -Brian

  • Anonymous
    November 21, 2007
    Forget I said that. It is the extraction tool we use that tries to reformat but actually ruins the XML layout. The original Office 2007 files just seem to be long data strings of XML.

  • Anonymous
    November 22, 2007
    Brian Jones, Office Program Manager, describes the progress in his blog and why it is currently so

  • Anonymous
    November 24, 2007
    "I would think that at some point the access information will get out though..." I guarantee that some of the info will get leaked somehow, and it will be OOXML opponents that will do the leaking. Only, they'll leak half-truths accompanied by plenty of spin and FUD. By keeping the site private, you are merely playing right into IBM's hands. Open up the site, so that when Rob Weir spouts his FUD, people can check the real info by going to the official site, and see that Rob's words are indeed nothing but FUD.

  • Anonymous
    November 25, 2007
    Bruno, I agree that is a risk, and will most likely play out as you suggest. Ultimately though, we need to stick to the rules. It's up to the ISO folks to decide if an open site is ok. -Brian

  • Anonymous
    November 25, 2007
    Rob is unlikely to comment on items not yet publicised. He is a member of the US standards committees and as such should be acting responsibly with confidential ISO information. Also, the period to Jan 14 is very short, so there is not necessarily a need for information to be leaked, as it provides little advantage to do so. I would expect opponents with access to the spec just to use this period to analyse the responses and try to find items in them that can influence the votes negatively, especially if those items are blown up to huge proportions. But we'll see. I would expect the OOXML front to be fairly quiet for the next 7 weeks or so...

  • Anonymous
    November 26, 2007
    My guess is that Mike created this document in OOo and then copy-pasted it to Office 2003. If the document were created from scratch in Office 2003, it would (a) have styles, and (b) have global language settings. Besides, OOXML is not the native Office 2003 file format. Creating and saving the document in Office 2007 would make for a better comparison.

  • Anonymous
    November 26, 2007
    Now that you have to modify the Office XML spec, the documents created by so many users of Office 2007 will by definition be of a different standard. Why not work towards merging the best of ODF and Office XML into a unified standard instead of more sabre-rattling games?

  • Anonymous
    November 26, 2007
    Luke, what sabre-rattling are you talking about? The underlying goals of the two formats have not changed since their creation. The core goal of compatibility with the existing base of binary documents still exists for Open XML, and was not part of ODF. There is work though in the standards world to understand the differences between the two formats. If folks want to work on merging the two formats, the results of these comparison efforts would obviously be very useful. -Brian

  • Anonymous
    November 26, 2007
    BRM convenor Alex Brown stated on his blog that Ecma identified 1030 distinct comments. How would the number of current responses stack up when measured against those 1030 distinct comments?

  • Anonymous
    November 27, 2007
    hAl, I don't remember exactly. I think we were around 250 or so. -Brian

  • Anonymous
    November 28, 2007
    Brian, as the outcries from the anti-OOXML lobby about the secrecy of the disposition of comments are getting louder by the minute, don't you think it would create a bit of breathing room if ISO/IEC made a statement that they have asked ECMA to keep the dispositions available to NBs only? Maybe a statement from Rex could do it, since he is the ISO/IEC-appointed editor of the dispositions. As you might have noticed in my discussion with Rob, it very quickly gets very "JTC1-directive-technical", so a statement would allow us (Rob and I) to focus on other, more important stuff than this ... i.e. the comments themselves. :o)

  • Anonymous
    November 30, 2007
    Brian, I second Jesper's request. We have been discussing on Charles-H. Schulz's blog (http://standardsandfreedom.net/) the confidentiality (or not) of working documents as stated in the ISO/IEC rules, and my understanding is that confidentiality is not imposed by the rules (but Jesper disagrees with me :-). To better understand the situation, could you let us know who exactly from ISO/IEC made the request, and the reason given for it? Thanks.

  • Anonymous
    November 30, 2007
    Brian, how does "The key design goal of the Open XML format was that it coudl (sic) faithfully represent the existing binary documents from Microsoft Office. There are billions of those documents out there, and now they can be moved into a format that is an open standard without any negative impact on the owners of those documents. Once those documents are moved into this open format..." answer the question "How does (MS)Open-XML becoming an ISO standard offer choices for accessing billions of legacy documents?" Are those billions of legacy documents inaccessible now? Is Office 2007 incapable of faithful representation of the legacy documents until MSO-XML is ratified by ISO? Having a single source for the MS-legacy-to-MSO-XML converter -is- a negative impact, especially when the original application is not available to check the faithfulness, or when the desire is to move from MS-legacy to ODF. It certainly doesn't seem as if it offers more than one choice; at least it doesn't offer any more than are currently available. "There are already a large number of applications that support the open xml format..." There are a few applications that support fragments of the MSO-XML format and one that might support most.

  • Anonymous
    December 02, 2007
    When you say "Open XML", what do you mean? Open Office XML? Office Open XML? XML that is open? (Is there any closed XML?) This is so confusing.

  • Anonymous
    December 02, 2007
    Brian, Why always an answer to a question not asked? What would make me happy is to see answers that are on-topic, complete, and accurate. I ask how ISO approval leads to accessing billions of documents and you answer that accessing documents was a goal of the MSO-XML effort. The answer is orthogonal to the question.

  • Anonymous
    December 02, 2007
    @Dave S You are actually asking a different question now. First you asked about choice, and now you have limited your question to accessibility. To answer your current question: the binary formats will slowly disappear in the future. New applications will be created that use the new XML functionality, for instance to retrieve data from archives. These won't necessarily work on old files. Depending on your use, a need to convert your files to a new format can arise. With Office Open XML, companies get the ability to faithfully convert their current binary formats (now or even in a distant future) to an XML-based format in such a way that no information is lost. So it allows access to ALL the information in the binary files even when converted to the new format. This is not 100% possible with, for instance, ODF. The lack of ability to faithfully convert the current Office document base to ODF was a main reason why the OpenDocument Foundation abandoned the ODF format development. So maybe you should direct your earlier question on choice in accessibility to OASIS and ask them why they never chose compatibility with current Office documents as one of their goals.

  • Anonymous
    December 03, 2007
    Having looked at both specs, and the XML of both -because XPath and XSL is why you'd bother with XML in the first place- I'm somewhat underwhelmed by ODF and appalled by how awful OOXML content is. It has a look that radiates 'work in progress transition from OLE based content to XML'. At the same time, the two products are fairly close to convergence in feature sets. Why isn't ISO pushing back on both groups to say: come up with a format that is a proper superset of both, a decent format with the stricter format model of ODF, and normative test documents?