Back from Sapporo - tons of progress in Ecma

Well, it has definitely been a pretty hectic couple of weeks, and it's going to take me a while to get caught up. I was in Boston two weeks ago for TechEd, and in Sapporo last week for Ecma meetings. Both were great trips, but it's nice to be home. The meetings in Sapporo were extremely productive, and you can read all about them in the status report filed by Tom Ngo from NextPage (https://www.ecma-international.org/news/TC45_current_work/TC45-2006-50.htm).

Some of the key things I wanted to call attention to:

  1. U.S. Library of Congress joins Ecma TC-45 - This was really great news. We've already benefited significantly with the participation of Adam Farquhar from the British Library, and I'm really excited to have the Library of Congress on board too. Like the British Library, the Library of Congress cares deeply about archival and is particularly interested in the long term accessibility of the formats.
  2. Progress on conformance definition - We've spent a lot of time debating how to best define conformance to allow for good interoperability while at the same time making it super easy to use just portions of the specification. We resolved a number of issues and I think we're really in a good spot here.
  3. Progress on WordprocessingML issues - We've made a lot of progress on the initial WordprocessingML documentation and are now able to drill into the issues logged by members of the technical committee. I think everyone was excited that we were able to start closing down some of the older issues.
  4. Java WordprocessingML to HTML converter - Toshiba gave us a demo of a WordprocessingML to HTML converter they've written in Java. I always get excited when I see tools built on top of the new formats. It's really one of the biggest differences between the old formats and the new. We'll see a lot more third-party solutions that were either impossible or incredibly difficult with the old binary formats.
  5. Schema visualization - Representatives from BP, StatOil, and Essilor went over some ideas for making the documentation and schemas easier to visualize. There are about 4000 pages of documentation right now, and we really want to figure out ways to make it easier to consume.

It really was a great few days, but I wish I'd had more time to explore the area. I lived in Okinawa, Japan throughout most of Junior High and High School, and this was my first time back since then. I really enjoyed Sapporo. The food was great, and of course you can't beat being that close to the Sapporo brewery. Toshiba was an outstanding host.

-Brian

Comments

  • Anonymous
    June 26, 2006
    The comment has been removed

  • Anonymous
    June 26, 2006
    The comment has been removed

  • Anonymous
    June 26, 2006
    Here's June's update (see also May's and April's) from the Ecma International Technical Committee...

  • Anonymous
    June 27, 2006
    The comment has been removed

  • Anonymous
    June 27, 2006
    Hyperion-

    You've lost the plot. Non-portable features make a format a non-standard. That is, not standardizable.

    It is the wrong strategy when there is an ISO standard file format in existence which will never have non-portable features.

    On this basis, MSECMAXML can't compete.

  • Anonymous
    June 27, 2006
    Sam, with OpenOffice you can insert a spreadsheet formula into your document yet the format for that formula isn't specified in the ODF spec. If it is not documented, does that make it non portable?

    OpenOffice also allows you to embed OLE objects. So it appears to be the case that OpenOffice allows you to insert undocumented stuff into ODF. I can't understand (after reading your comments) why you would consider these to be portable features.

    I think it's absolutely the right thing to do. It would be ridiculous for OpenOffice to prevent their users from inserting formulas just because ODF is lacking. But I think this also shows that if you really believe that Open XML is not portable then you must apply that same logic to ODF.

    Now, if you look at the Ecma Open XML spec, you'll see that rather than taking the ODF approach of just not documenting certain things, we fully document how everything works. This way the documents are completely interoperable. If the end user decides to insert a foreign object into the document, there is not much we can do there. We can document how we store that object, but beyond that it's up to the object. OpenOffice does the same thing, but ODF provides even less information on how it's done.

    -Brian

  • Anonymous
    June 27, 2006
    The comment has been removed

  • Anonymous
    June 27, 2006
    There is a solution to the portable document controversy. Simply have the Office applications inform the user when she uses proprietary features, just as they warn her when content will be lost (due to saving in older formats, e.g. in the compatibility checker).

    Then the responsibility rests squarely with the user.

  • Anonymous
    June 27, 2006
    Francis, I think that's a good concept but I think it would be too heavy handed. What do you think the average user reaction would be to such a message? I'm pretty sure that it wouldn't be positive. :-)
    We try to avoid throwing warnings to users (and I think we already do it more often than we should).

    I think the concept is good though. I think it would be pretty easy to have a tool that quickly notifies you if there are objects embedded in a document that may not be portable. That way people interested in portability could use the tool to easily verify it.
    That would also have the added benefit of being customizable, and you could even update it as there are changes in the market conditions and what's considered portable.

    -Brian
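
To make the idea of a customizable portability-checking tool concrete, here is a minimal sketch in Python. It is not an existing Microsoft or Ecma tool. It assumes the document is a ZIP package whose `[Content_Types].xml` part maps parts to content types (the package layout of the published Open XML standard; the 2006 drafts may differ), and the function name `find_suspect_parts` and the default list of "suspect" content types are hypothetical choices made for illustration.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace of [Content_Types].xml in the published packaging spec (assumption:
# the drafts discussed in this thread may have used a different URI).
CT_NS = "{http://schemas.openxmlformats.org/package/2006/content-types}"

# Hypothetical, user-editable policy: content types treated as
# "may not be portable". Adjust this set as market conditions change.
SUSPECT_CONTENT_TYPES = {
    "application/vnd.openxmlformats-officedocument.oleObject",
    "application/vnd.ms-office.activeX",
    "application/vnd.ms-office.activeX+xml",
}


def find_suspect_parts(package, suspect_types=SUSPECT_CONTENT_TYPES):
    """Return sorted (part_name, content_type) pairs flagged by the policy.

    `package` is a file path or a seekable binary file object.
    """
    with zipfile.ZipFile(package) as zf:
        root = ET.fromstring(zf.read("[Content_Types].xml"))
        # Default entries map a file extension to a content type.
        defaults = {
            el.get("Extension").lower(): el.get("ContentType")
            for el in root.findall(CT_NS + "Default")
        }
        # Override entries map one specific part name to a content type.
        overrides = {
            el.get("PartName"): el.get("ContentType")
            for el in root.findall(CT_NS + "Override")
        }
        flagged = []
        for name in zf.namelist():
            if name == "[Content_Types].xml" or name.endswith("/"):
                continue
            part_name = "/" + name
            ext = name.rsplit(".", 1)[-1].lower() if "." in name else ""
            # An Override wins over the extension-based Default.
            content_type = overrides.get(part_name, defaults.get(ext))
            if content_type in suspect_types:
                flagged.append((part_name, content_type))
        return sorted(flagged)
```

Someone with a stricter policy could pass their own set, for example `find_suspect_parts("report.docx", {"image/x-wmf"})` to flag WMF images instead (the file name is made up). Keeping the policy in a plain set is what makes the check easy to update over time.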

  • Anonymous
    June 27, 2006
    > OpenOffice also allows you to embed OLE objects.
    > So it appears to be the case that OpenOffice
    > allows you to insert undocumented stuff into ODF.

    Undocumented because Microsoft has a clear interest in locking customers into a proprietary format.

    I'll ask this question a third time: why not just publish the specifications (if they exist) of the proprietary secret binary formats, in the interest of openness and interoperability, for the millions of current and former Word users with their billions of Word documents?

    Sean DALY.

  • Anonymous
    June 27, 2006
    Actually, Brian, the history of Microsoft's interactions with Digital Research provides us with a test-case of how average people react to such a barbed warning.  Microsoft included a warning in MS Windows 3.1 that Microsoft couldn't guarantee that MS Windows 3.1 would run well when the underlying DOS wasn't MS DOS.  DR DOS which up till then had been on a roll, fell off the market precipitately.

    And KJK::Hyperion, I've just been reading some of the comments on various blogs on Microsoft's dropping WinFS - one guy had already started supporting it, based on MS Windows Vista Beta 1, in some stuff he was writing.  I'm afraid I know what I'm talking about.  ActiveX's continued existence is no more a given than my own.

  • Anonymous
    June 28, 2006
    The comment has been removed

  • Anonymous
    June 28, 2006
    Adam,
    you need to get caught up a bit, as that comment is from last year before the work in Ecma even started.
    Download the draft that was released back in early March: http://www.ecma-international.org/news/TC45_current_work/Ecma%20TC45%20OOXML%20Standard%20-%20Draft%201.3.pdf

    Take a look at chapter 15.5 (starts on page 247). There are about 160 pages of content describing the formula syntax and about 360 different functions. You'll notice that there is still a ways to go, but this is already a huge amount of useful content.

    Sean,
    Could you give me the list of binary parts you're concerned about? Is it just ActiveX controls (which I would guess less than 0.01% of Office documents contain)? Users have the option of embedding them, but it isn't something Office would ever do automatically.
    Is there anything else?

    -Brian

  • Anonymous
    June 28, 2006
    Brian: Thanks for that. I hope you'll forgive me for not wading through up to 4000 pages myself just to see if any more progress had been made or not. :)

    "You'll notice that there is still a ways to go, but this is already a huge amount of useful content."

    So ... the spec for ODF spreadsheet formulae is not fully defined yet, but it's not fully documented for MSOOX yet either? Why is this some kind of point for either format over the other?

  • Anonymous
    June 28, 2006
    Adam, no problem, I understand. :-)

    My point with the formula comment is that while the ODF spec already went through ISO, it's far from complete. All specs will of course evolve over time as innovation occurs, but ODF isn't even caught up with the innovations of today (heck, equations have been around for more than a decade). Many folks out there have been led to believe that it's already a complete spec, and that isn't the case.

    In addition, from what I've seen so far, the work in Ecma on formulas is definitely ahead of the work going on in OASIS.

    -Brian

  • Anonymous
    June 28, 2006
    @Wesley Parish:
    "Actually, Brian, the history of Microsoft's interactions with Digital Research provides us with a test-case of how average people react to such a barbed warning.  Microsoft included a warning in MS Windows 3.1 that Microsoft couldn't guarantee that MS Windows 3.1 would run well when the underlying DOS wasn't MS DOS.  DR DOS which up till then had been on a roll, fell off the market precipitately. "
    ----------------
    Just for clarification, that warning appeared in a Win3.1 beta.  And it was true; that Win3.1 beta had NOT been tested at all on DR DOS.  I guess Microsoft didn't want to waste time chasing down beta bug reports that were actually DR DOS issues rather than Win3.1 issues.

  • Anonymous
    June 29, 2006
    Thanks, Brutus, for that clarification.

    But consider this: the [MS|PC|DR] Disk Operating System was and is essentially a combination of program loader, (minimal) device driver library, and file system.

    MS Windows 3.x was and is essentially a set of graphics, multitasking and system libraries that were loaded on top of [MS|PC|DR] DOS.

    It is the custom for PC Operating Environments and Systems that don't manage all the permutations of peripherals and device drivers that are available, to get pilloried; OS/2 got pilloried, Linux gets pilloried, BeOS got pilloried.

    Why should MS Windows 3.1 be any different?  It was after all a semi-autonomous DOS extension that depended on the DOS base for initialization, etc.  For Microsoft to say they didn't take the trouble to ensure that their Windows 3.1 beta could run satisfactorily on DR DOS, either says that they were slack and lazy, incompetent programmers, or did it maliciously.  Take your pick.

  • Anonymous
    June 29, 2006
    Brian> "but ODF isn't even caught up with the innovations of today"

    You make it sound like:

    a) There is a document format out there that has a completely open and fully-specified formula/equation syntax, or:

    b) None of the applications that use ODF can load formulas/equations that they themselves have saved.

    Clearly, neither of these statements is true. Of course, if you intended a third meaning, I'd be grateful if you could be a little clearer about it...


    On the other hand, you brought the formula bits up as a rebuttal to the fact that Word 2007 will include non-portable and undocumented OLE objects inside what is meant to be a portable and documented format. As ODF formulas are in the process of being documented (which I'm sure you were aware of before I pointed it out earlier in the thread) does this mean that (and I'm only using the connection you made between the two features) MS are in the process of documenting their existing OLE-based formats to match?

    While I don't like undocumented features, I'm much less wary of them if I know they're going to be documented in short order.


    As for not warning the user that embedding non-portable objects in their "portable" document format will make it non-portable, your arguments that "If the end user decides to insert a foreign object into the document, there is not much we can do there." and "I think it would be pretty easy to have a tool that quickly notifies you if there are objects embedded in a document that may not be portable. That way people interested in portability could use the tool to easily verify it." don't appear to hold much water.

    First, they do kind of assume that the user knows which foreign objects are portable and which aren't. I'm pretty technical, and I couldn't give you a 100% answer on whether the WMF image format is portable. The user sees an image, they put it in the document, it looks fine. What more do you want?

    Secondly, a fair number of people will assume that even if something is non-portable, if they insert it into a portable document format it will be sprinkled with portability magic and be readable anywhere. It's in a portable document, right? Therefore it's portable.

    Third, the extra tool approach to check for unportable objects assumes that the person creating the document is the person interested in portability, and that they're even aware that the tool exists. If someone wants people to send them documents in MSOOX because "it's portable" and they'll be able to read it, but they get sent a document that's, say, just a big embedded, non-portable WMF file, well...

    With the above 3 in combination, I can just imagine:

    A: "I can't read your file!"
    B: "But it's in the format you requested"
    A: "Yeah, but you've put a non-portable thing in it"
    B: "But it's a portable format. Duh!"
    A: "But..."
    etc...

    shudder :-)

  • Anonymous
    June 29, 2006
    Adam, OpenOffice allows you to embed WMF images and it allows you to embed OLE objects when saving to ODF. Does that mean that you don't believe the ODF saved from OpenOffice is portable?

    -Brian

  • Anonymous
    June 29, 2006
    The comment has been removed

  • Anonymous
    June 29, 2006
    I think plain text may be a bit extreme :-)

    With Open XML, there is always an alternative representation of OLE objects (a picture), so if you aren't able to render the object you can fall back on the picture. It doesn't appear that OpenDocument does this (I'm not sure what the content type of the "ObjectReplacement" is).

    -Brian

  • Anonymous
    June 30, 2006
    Brian, having thought about some of the issues raised in this series of threads etc, I believe I may have a solution to the "controls" one.

    It would involve someone going through ActiveX and some other toolkits on the market, such as QT and GTK, and identifying similar or identical functionality.  Then releasing supplementary documentation that pinpoint the similar and identical functionality - in effect providing supplementary documentation for those functions, for which countless Windows developers will bless you (or whoever) from the bottoms of their hearts.

    And Microsoft has a group specifically devoted to interoperability - Port25 is their website - who could probably do this as part of their function.  Then QT and GTK, etc, maintainers could incorporate an ActiveX binding in their toolkits that would translate ActiveX calls to QT, GTK, etc, calls.

    That would go a long way to clearing up my objections.

    And just in closing - ActiveX may be in only 0.01% of all MS Office Documents, but you've demonstrated the use of controls in MS Office - MS Word if I remember correctly.  It's obviously meant to be used.

  • Anonymous
    June 30, 2006
    Wesley, thanks for the suggestions. I'll talk to some folks about it.

    In terms of the controls that I've demonstrated, those are actually native to Word. The storage for those controls is completely declarative in the format, and they are fully documented. The ActiveX control support is a legacy thing which we've tried to replace when possible.

    -Brian

  • Anonymous
    July 03, 2006
    > Could you give me the list of binary parts you're concerned about?

    I meant the file format. We can agree that there is an enormous number of Word (and Excel and to a lesser degree, PowerPoint) files created in the binary blob formats. I find it unfortunate that Microsoft does not wish to unlock those formats, for the benefit of those who created the documents in the first place.
