They’re bringing out the big guns

Anyone else been following the latest blog posts from IBM and Sun discussing the Office Open XML formats? It looks like they're stepping up their push to try to make ODF the only choice in file formats. I read Tim Bray's post yesterday, but there have actually been a number of other posts folks have pointed out to me as well. Everyone knows that Sun and IBM have a lot riding on ODF financially (they're large corporations, not philanthropies <g/>). It's clear that their plan is to somehow convince governments to mandate just ODF and remove any choice between the two formats.

Thankfully, what you're actually seeing in most places is that governments are asking for 'open formats' in general, not just ODF (contrary to what is usually written in the headlines). Most of those governments understand that Office Open XML is on the verge of becoming an international standard as well and it serves a very important purpose that ODF doesn't. This has raised the alarm bells for IBM and Sun though, and that's why we see the latest smear campaign kicking into gear. It could be that this is more innocent and that instead there is just a lack of technical knowledge. Based on the strong reputations of the folks involved in this campaign though it seems more malicious. I'm saying this after reading their claims that the spec is too complex and therefore not interoperable, which is just ridiculous. Too much information? Every developer I've talked to (even those working for companies that compete directly with Microsoft) is extremely grateful for the amount of information the spec has provided. Look at the 600 developers up on the openxmldeveloper.org site building all kinds of powerful solutions across a number of different platforms (Linux; Mac; Windows).

I think it's pretty ignorant for folks to call this effort a rubber stamp. Talk to the people from Apple, Novell, the British Library, the Library of Congress, Intel, BP, StatOil, Toshiba, Essilor, NextPage, and Microsoft who spent over 200 hours in group discussions around the formats. Look at the results of all the hours that went on in the smaller groups tasked with solving particular problems or those working on the actual documentation that had to go on between the weekly group meetings. The schemas themselves changed significantly and the spec went from 2000 to 6000 pages. Rubber stamp? You must be joking. <g/>

Another thing I've seen from an IBM employee is that he's trying to get more technical by examining the Office Open XML standard looking for minor nits and then attempting to turn them into big issues. That's fine and everyone is entitled to their own opinion. It's kind of funny though that many of the issues he raises are even worse in the ODF spec.

Why would IBM and Sun push for a more limited format?

There is this false claim from some high profile IBM and Sun employees that the Office Open XML spec is not interoperable because it's too big. These statements really help to paint a picture of their strategic interest in ODF. What's the easiest way to compete with another product that has a richer set of features? Get governments to mandate a file format that doesn't support that richer set of features. This way, if the other product (Microsoft Office in this case) has to use the format that was designed for your product, you've just brought them down to your level. It's a brilliant approach, and that shows why there are IBM vice-presidents flying around talking to governments about the need to mandate ODF. It also shows why they want to discredit the Office Open XML format… IBM and Sun feel they have a lot to lose if Office Open XML is standardized, and that's why they've been fighting so strongly in opposition.

Now, contrast that with the Microsoft position, where we've never opposed ODF. We didn't plan on supporting it, but we had no problem with other people using it. The only opposition we've ever had is to policies mandating ODF and blocking Office Open XML. We want choice; IBM and Sun on the other hand absolutely want to block choice. The spin they try to put on this is that by blocking choice in formats they are providing freedom to choose your application… what they don't say though is that we're doing that to an even greater degree. We're sponsoring a free open source project for translating between the two formats, which gives everyone the freedom to choose both the application and the format. Microsoft's view has been that open formats are really important and there is nothing wrong with both ODF and Open XML. IBM and Sun on the other hand want one specific open format (ODF), and that's it.

Now, if you look at it technically, there is no reason to complain about the size of the spec unless you are trying to limit the features supported by the spec. There are plenty of large specifications out there (look at the Java spec) that are completely interoperable. As an implementer of the Office Open XML specification, you are free to decide what pieces you want to implement.

Let's think about this complaint though that the specification is too large. What are the ways in which you could fix that:    

  1. Less documentation and explanation??? - I can't imagine anyone wanting this. Remember, the standard isn't a novel you're supposed to read end to end. It's a detailed description of every piece of the Office Open XML file formats and how it all works. More documentation is an important thing in this case.
  2. Less features??? - Who gains from this? Any implementer has the freedom to pick which part of the spec they want to support. Only applications who want to compete by bringing everyone down to their level would actually want features removed.

There are a lot of features between the three main schemas (WordprocessingML, PresentationML, and SpreadsheetML), and as a result the file format is very large. The ODF spec most likely would have been bigger if they had done a more thorough job documenting it, but even then it still doesn't compare in terms of functionality. One of the other justifications I've heard for the ODF spec being so much smaller is that it reuses other standards. That may account for some, but it still doesn't get you all the way (not even close).

We also looked at reusing other standards where it makes sense (Dublin Core, ZIP, XML), but there are plenty of places where that didn't make sense (MathML). Take the example of MathML. It wasn't specifically designed for representing math in a wordprocessing document, but instead math in general. It's a good spec, and it does do a decent job in a wordprocessing document, but it's not able to handle everything that our customers would expect. It doesn't allow for the rich type of formatting and edit history that most customers of a wordprocessing application would want (see Murray's post for more details). Even more interesting though, to date there aren't any ODF wordprocessing applications out there that even support all of MathML. I think that Office 2007 actually has better MathML support with our import/export functionality. Another example given is the use of XSL-FO. It's a nice spec to reuse, but it doesn't fully define how international numbering should be done, so as a result OpenOffice has already extended the format in their own proprietary way.

XML itself has only been a standard for about 8 years. For one to assume that all the great thinking and tough problems in the Office document space have already been handled since then is ridiculous.

-Brian

Comments

  • Anonymous
    October 18, 2006
    Ok, raise of hands, who else finds these posts tremendously hypocritical?

  • Anonymous
    October 18, 2006
    Which 600 developers? Do you mean the 583 forum members of OpenXmlDeveloper.org? Because then you are even counting me! And I am certainly not doing any openxml development. Ooh, and it seems they deleted all my comments there... strange. And why exactly don't you link to the blog posts of IBM and Sun? It's always handy if readers can validate your claims or come to their own conclusions. You just think that you are the good guy, but in fact you are contributing to more ms-evil. Microsoft is only open when it absolutely has to be; just look at all the other formats and protocols which are still closed. Don't you think the developers on Messenger, Kerberos, Posix thought they were working on an open system?

  • Anonymous
    October 18, 2006
    Here is Tim Bray's blog post talking about Office Open XML: http://www.tbray.org/ongoing/When/200x/2006/10/16/OOXML-Hoo-Hah. It contains links to some other posts which I think Brian mentions.

  • Anonymous
    October 18, 2006
    Why is the leap year issue being characterized as an Excel bug when it has been pointed out previously that the behavior originated with Lotus 1-2-3, and Excel (and subsequently other products) copied it for compatibility?
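For readers unfamiliar with the behavior under discussion, here is a minimal sketch. It assumes the issue referred to is the widely documented quirk in which the 1900-based spreadsheet serial date system treats 1900 as a leap year, so serial 60 stands for a February 29, 1900 that never existed. The function name `serial_to_date` is hypothetical, and the code only illustrates the arithmetic; it is not taken from either specification.

```python
from datetime import date, timedelta


def serial_to_date(serial):
    """Convert a 1900-based spreadsheet serial number to a calendar date.

    Assumption: serial 1 is January 1, 1900, and serial 60 is the phantom
    February 29, 1900 inherited from the Lotus 1-2-3 date system.
    Returns None for the phantom day, since no real date corresponds to it.
    """
    if serial < 1:
        raise ValueError("serial numbers start at 1")
    if serial == 60:
        return None  # February 29, 1900 did not exist
    # Serials after the phantom day are shifted by one relative to the
    # real calendar, so they need an epoch one day earlier.
    epoch = date(1899, 12, 31) if serial < 60 else date(1899, 12, 30)
    return epoch + timedelta(days=serial)


for n in (59, 60, 61):
    print(n, serial_to_date(n))
```

A consumer that wants to read existing files faithfully has to carry this one-day shift, which is presumably why the behavior was copied for compatibility rather than corrected.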

  • Anonymous
    October 19, 2006
    Brian, I get the impression I get from your own words. For example, you say in the post above: "I'm saying this after reading their claims that the spec is too complex and therefore not interoperable, which is just ridiculous. Too much information? Every developer I've talked to (even those working for companies that compete directly with Microsoft) is extremely grateful for the amount of information the spec has provided." The complaint is not that there is too much information, which you neatly turn it into. That would be laughable, and you go on to laugh at it. But it isn't the complaint. The complaint is that it is too complex, meaning in some cases that it is too specific to a particular implementation. For example, the specified page borders are made part of the standard rather than left as a general page border format with the implementation left as an instance that Microsoft Word happens to implement. In some cases it is too complex because it assumes functionality not readily available to XML parsing languages, such as the bitwise operations. That is not a case of "too much information", which I am certainly not faulting Open XML for, but a case of "too much complexity", which I would fault Open XML for. In another place, you say: "Now, if you look at it technically, there is no reason to complain about the size of the spec unless you are trying to limit the features supported by the spec." This is silly, and completely incorrect. How about if we introduce a spec for integers that deals with a set of rules for integers between 0 and 100, and another set for those between 101 and 200 and so on. The spec would be infinitely large, but would provide no more features. It is quite possible, and frequently done in standards organizations, to reduce the size of a spec while increasing the features, by generalizing better and extracting instance data from format standards (see page borders argument above).
Finally, you ask the question: "Let's think about this complaint though that the specification is too large. What are the ways in which you could fix that:" and the only two ways you can think of are "1. Less documentation and explanation???" and "2. Less features???" Might I suggest "3. Better generalization" as a rather obvious choice? I am curious about one statement you make. You say "It's kind of funny though that many of the issues he raises are even worse in the ODF spec." but fail to point out any examples. As these would obviously be welcomed by both the ODF committee and those who choose to attack ODF, how about you share some specifics? Ben

  • Anonymous
    October 19, 2006
    RequiredName says that "ODF is a vendor and product independent specification." This is disingenuous. It is actually a standard for rebranded versions of OpenOffice from Sun, heavily promoted by them and IBM (hence the latest disinformation campaign from them). ODF was designed primarily to support the features in this product, not to provide interoperability, despite what its supporters claim. If it wanted to really support interoperability, it would support full interoperability with the full features of the most common (99% market share) office software -- which happens to be from Microsoft. But you cannot "round trip" from MS Office to ODF file formats and back, not because of any MS limitations, but because of ODF limitations that may take years to fix, by their own admission. RequiredName doesn't understand this, because he/she proposes dropping OpenXML in favor of ODF. They are not interchangeable. ODF does allow limited interoperability with other software because it is a published standard that anyone can use to interoperate with OpenOffice clones or subsets of it. But even that degree of interoperability is limited, as well documented in this blog (look, e.g., at lack of spreadsheet formula interoperability). RequiredName also says that "MS OpenXML is a XML version of the legacy Microsoft Office file format." This is actually totally 100% incorrect. It is not a reworking of an old file format at all; it bears no resemblance to it. The old file format was a binary dump of internal data structures. The ECMA (note: not MS) OpenXML standard-to-be is a completely new XML file format intended to provide an external representation of the data used by the MS Office application suite.
Until such time as OpenOffice has all the features of MS Office and a future version of ODF supports all those features (and is thus equivalent to OpenXML), all that can be physically done is to provide a limited "save-as-ODF" capability for MS Office, that strips out information needed by those features of MS Office that are not supported by OpenOffice. But this is precisely what MS is promoting.

  • Anonymous
    October 19, 2006
    Ben, your comment reads a lot like a discussion we've had here some weeks ago. Generalization is a good thing, especially if you create a standard without having a large volume of existing data in mind. ODF did just that, and if it were that easy, there would really not be any reason for Open XML to exist. But Open XML was created with millions of existing binary documents in mind (see, I'm caught in an endless loop too). Generalizing the format would lead to the same information loss as converting to ODF. The old formats have non-generalized semantics and parameters, the old apps have a non-generalized user interface etc. Changing this without travelling in time could prove quite difficult. It all goes to prove that for the goals MS is claiming to have for Open XML, what they did is just the way to go. You want a generalized format, clean and easy to implement, go for ODF. You want something that does not break what you have, you're gonna have to accept the burden that previous versions bring to the format. Stefan

  • Anonymous
    October 19, 2006
    Stefan, There is some truth to this, but there are certainly areas where generalization could help. The Art Page Borders section is about sixty pages of specific borders. Now, granted, existing documents use those by name, but couldn't you generalize to a "named border" and have those specific borders be part of Microsoft's implementation? That would reduce the spec there by about 59 pages, and it wouldn't stop Microsoft from supporting the existing documents. I am not suggesting that the standard be completely rewritten, it is what it is, but it could use a little generalization to separate the specific implementation from the general implementation. Right now, it just feels like every place where choices are iterated through, the decision was made to leave in the set of choices Microsoft has already used, rather than say "Here a choice must be made and it should be named and kept as a separate resource file in the document." Ben

  • Anonymous
    October 19, 2006
    Brian, Hang in there.  I read Tim Bray's post a few days ago and was glad to see you respond.   I've always found your posts (and responses to comments) to be thorough, level-headed, and (I'm sure this will shock some) genuine.   Keep up the good work. John

  • Anonymous
    October 19, 2006
    I wasn't sure what kind of lightning this post would attract when I saw it in my feed reader.  It's gratifying to see the level-headedness here. Brian, that's a good catch on the terms of the debate about "choice."  I think you are on solid ground.  On the other hand, the purpose of interchange/interface is to move the debate about choice to implementations (think GDI and display adapters) and not formats.   Having said that, I think some of the debate now is about different technical sensibilities, and we are close to a new version of language wars.   It is the Universal Document Elixir (http://orcmid.com/BlunderDome/clueless/2005/10/magical-thinking-and-universal.asp), which many believe in, that has it be OK to think that the one-format choice is sufficient and has already been made (i.e., ODF).  The only thing that can test that is reality. No amount of talk will help.  And it will still be a chancy thing.  What will teach us whether magical thinking, perception and ideology can overpower reality will be all of those conversion and translator face-offs that are going to be happening in 2007.   Whatever happens, in the end we will all be the wiser for it.  I also think the TC45 effort is commendable and admirable work.  It is a difficult road, and however it turns out I think Brian and the Microsoft team that committed to this course are to be admired for extending the tremendous effort involved, all with the support of the responsible management.

  • Anonymous
    October 19, 2006
    >The specific examples being raised so far >are definitely minor issues. I dont think so XML defined in section 2.8.2.16 (page 759) of Volume 4 the OOXML  is a dump of the Windows SDK memory structure ( http://msdn.microsoft.com/library/default.asp?url=/library/en-us/intl/unicode_5ppu.asp )   Regarding interoperability and same-level-playfield for implementors, this is definitely not a minor issue

  • Anonymous
    October 19, 2006
    Marc, first off I'm surprised you find that to be a major issue. Those attributes are used by a font to describe which code pages and Unicode subranges the font actually provides glyphs for. It's primarily used in font substitution (i.e., you don't have the font on your machine and so the consuming application looks for other information that can help it determine another font to use). If you have the font, then you can get all this information just by querying the font. It's cool that you care so much about the fonts themselves though. Most times, you mainly just care what the name of the font is that is used, but this additional information can greatly help with cross platform collaboration where the fonts on your machine may not be on the machine of the person you send the file to. That's where this element as well as the other ones like panose information, etc. come into play. I personally wouldn't call the code page declarations anything major though (which is why I said it was a minor issue). The list of properties for a font that are defined by separate elements are:

  1. altName
  2. charset
  3. embedBold
  4. embedBoldItalic
  5. embedItalic
  6. embedRegular
  7. family
  8. notTrueType
  9. panose1
  10. pitch
  11. sig (this is the one in question)

    While I wouldn't call the storage of the sig attributes a major issue, it's definitely something we can drill into more if you're interested. Even though it's a hex dump, the values are all completely defined, so while it would be somewhat difficult to parse using XSLT and XPath, it would be pretty trivial using any number of other programming languages (and there are actually online examples of how to do it with XPath/XSLT as well). I guess in certain ways there may have been a better way to do this, but I'm not sure it would really have been worth it. It's a bummer this wasn't brought up sooner though (via public comments or by joining the Ecma TC), because it's certainly something we could have worked together on to see if it should be done differently. Is this something you came across, or was this based on Rob Weir's post about bitmasks? Again, if it's really a big blocking problem I wish it had been brought forward earlier. I don't think that's the case though. While one could argue that it could have been more XML-friendly, it certainly is fully documented and so I don't think it's a barrier to interoperability. The subranges for csb are all defined in the spec, and the subranges for usb are all done according to the ISO 10646 standard. -Brian
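To make the "hex dump" point concrete, here is a minimal sketch of decoding such bitmask values in a general-purpose language. The attribute names `usb0` and `csb0` and the sample values are assumptions chosen for illustration, and the helper names `set_bits` and `decode_sig` are hypothetical. The code only reports which bit positions are set; mapping each position to a code page or Unicode subrange would still require the tables in the spec.

```python
def set_bits(hex_value):
    """Return the positions of the bits set in a hex string, lowest bit first."""
    value = int(hex_value, 16)
    return [bit for bit in range(value.bit_length()) if (value >> bit) & 1]


def decode_sig(attrs):
    """Decode every hex-encoded attribute of a font signature into bit positions.

    `attrs` maps an attribute name (assumed here to look like "usb0" or
    "csb0") to its hex string value.
    """
    return {name: set_bits(raw) for name, raw in attrs.items()}


# Hypothetical sample values, not copied from any real document.
sample = {"usb0": "00000003", "csb0": "00000001"}
for name, bits in decode_sig(sample).items():
    print(name, bits)
```

The decoding itself is a few lines in most languages. The harder part for XSLT or XPath 1.0 is that they have no built-in bitwise operators, which is the limitation the comments above are arguing about.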

  • Anonymous
    October 19, 2006
    Ben, i see your point, but after following the discussion here for some time (and the awkward border thing was mentioned here before), i really think you picked out one of just a few places where an obvious improvement would have been possible. (and this one's really hard to miss, being that long!) so yes, they could have reduced the specs there by 60 pages. probably some more here or there. but taking away a significant portion of those 6000 pages is a completely different story. Stefan

  • Anonymous
    October 20, 2006
    Stefan, You may well be correct, and yes that is one example that fairly leaps out from the specs, even aside from the discussions here. I guess my one issue with both sides of this argument is similar to my issue in most highly polarized debates... both sides seem to be willing to take almost any extreme position to defend "their" side, and be willing to nitpick almost any minor issue on the other side, or worse, to make broad generalizations that don't hold water. Brian does it here, mostly with the defense of the almost indefensible (e.g., the leap year bug) and with broad generalizations about the inability of ODF to support multitudes of existing documents, which is almost certainly untrue given the huge preponderance of documents that use only a tiny portion of the features. Those on the ODF side equally like to act as if there is nothing substantial that ODF can't handle, which is silly, as it is a much less proven format which does not handle formulas at the complexity level necessary, among other things. They also want to make broad generalizations about Microsoft's intent and vague warnings about monopolies, etc. etc., and act as if open source, open standards, open anything is better than proprietary anything. Also hogwash. What bothers me is that both sides tend to weaken their own strong arguments by refusing to budge on the weaker arguments. For example, Brian Jones defends the MathML decision but, just as ardently, the leap year bug, thus weakening a very defensible position by supporting an indefensible one. Acting as if every decision made by the Ecma TC committee was reasonable and right is not going to gain you much credibility. On the other hand, Bob Sutor of IBM seems to want to take any chance to put down Open XML and the work being done on it, and act like any reasonable government, company, etc. would choose any open source standard over any proprietary standard because... well, just because.
This also loses credibility because there are plenty of different issues people have to deal with, and whether a standard is "open enough" is not the single litmus test. Reasonable people will choose to focus on Open XML, and reasonable people will choose to focus on ODF, and both sides should be wary of irritating and alienating those reasonable people. ODF is not the savior of the world, and Open XML is not the devil incarnate, but neither is it the other way around. These are two formats with different strengths and weaknesses. Each could learn something from the other, but not if everybody chooses to make it all out war. Ben

  • Anonymous
    October 20, 2006
    @marc So your big issue is that you do not like the standard. And the interoperability that you mentioned? That wasn't an issue at all, I guess? It just sounds interesting. Hmmm, and about implementable standards. Interesting examples you bring forth. HTML, PDF. Those you call implementable standards? Have you ever tried implementing PDF? Lucky you did not mention CSS, as we are still waiting for any application that manages to implement CSS 2.1 fully, let alone 3.0. I also do not see how the standard is more difficult to implement because of the bitmask thing. If you change the bitmask thing to specs in XML, does it get less complicated to implement? It seems you read the Rob Weir blog, but when reading his highly amusing blog you should also read between the lines. I am still waiting for Rob to point out some more serious issues with OOXML, as he seems to have an army of people digging through the spec for him or he must spend half his life in bed with the OOXML spec. ;-)

  • Anonymous
    October 20, 2006
    hAl, I consider CSS a poor standard; the layout of objects in a page is not rocket science! Why did the W3C complicate the thing so much? (Digression.) Ghostscript (and other PDF viewers with distinct code bases) gives me near 100% fidelity in PDF.

  • Anonymous
    October 23, 2006
    Stefan, I have seen at least as many attacks from MS on ODF as from ODF backers on MS, possibly more. Brian is certainly not the only voice in this regard, and he has not quite been the voice of reason that you suggest. The MS approach has been to complain heartily about any minor feature of MS Office that ODF would have any trouble supporting, or to dismiss the more general approach ODF has to some features, while the ODF approach has been to complain heartily about every feature of MS Office that MS overly specifies to lock in its own implementation. I don't see one approach as more noble or generous than the other. Also, while I don't agree with IBM's criticism of Microsoft for providing Open XML, it should be clear that from the public's point of view, if the two standards are seen as equally beneficial, Microsoft wins. For that matter, if Open XML is even seen as nearly as good as ODF, Microsoft wins. That is why the strategy seems to be to advocate for complete adoption of ODF, as the only possibly winning strategy. Again, I don't agree, but I understand. What I don't understand is why Microsoft doesn't seem to get that if they don't lose, they win. Instead of stooping to petty attacks, simply act as though Open XML and ODF are equivalent, since Microsoft will still win with that strategy. Lastly, I don't think the "minor" mistakes are so minor, if Open XML is supposed to be a standard. Microsoft has simply had strategic reasons for taking a really good move (exposing its formats and documenting them fully) and projecting it as something else (a truly open standard). If MS simply claimed that it was doing the first, I'd be the first to applaud. The latter cheapens the concept of open standards, because this is so clearly a single implementation standard. Lots of people will write to Open XML, but the vast, vast majority will only be providing integration with MS Office.
An open standard is not just an exposed interface or API, and MS is only claiming Open XML is an open standard to stop the ODF success at getting governments to push for open standards. I just don't like that approach, although again, I do understand the strategy. Ben

  • Anonymous
    October 23, 2006
    So much for Ben being a "voice of reason".

  • Anonymous
    October 23, 2006
    Ben, if you read carefully, the voice of reason was awarded to you ;-) I have yet to see attacks from MS matching those from the ODF camp. OOXML has an advantage, true enough. I agree totally that ODF must have more than just mixed content or anything to win here. But outright lying in order to force administrations into regulation is still unacceptable. And that's how I perceive a lot of the ODF propaganda, period. About open standard vs. product spec: Yes and no. Yes, it's clearly made to support the functionality of Office products and existing documents. But that would just be what a lot of customers want, whether they decide to move towards OpenOffice or MS Office. Except that OO can't handle the details right now - a clear advantage for MS. Not too fair either, because it's not really a great achievement to build on one's own code base. But that's really the OO guys' problem, why should customers care? OO could answer this by putting priority tags on OOXML features they do not support yet and start implementing them, based on how important they might be for customers. The rest of OOXML could be preserved until OO one day supports it. Sure, this is a lot more additional effort for OO, but it would serve customers well. And it would make OOXML an open standard. It's in their hands. They choose to ignore minor or less frequent problems for existing MS customers and MS Office features, that's a fairly reasonable decision too. Simplicity comes with a price, but it's also an advantage. But they should stop complaining that MS won't support their agenda right now. There just is no way MS would completely switch to a format that does not completely support their features. They wouldn't want to, their customers wouldn't let them, and complaints would be even worse if they extended ODF to carry all the additional info, like they once did with HTML. So, what do you really expect MS to do? Stefan

  • Anonymous
    October 23, 2006
    Stefan - I don't mind being the voice of reason, but not the voice of Microsoft, and being reasonable in this case means seeing some of both sides. Open XML is not designed to be an open standard, it is designed to expose Microsoft's existing document formats better. While that is a good thing, it does not make Open XML anything like a good standard. I don't think it even serves Microsoft's long term interests. The complaints about Open XML may seem minor to you or Brian, but they seem symptomatic of the way this "standard" is implemented. Again, that doesn't mean Open XML is bad, or even that Microsoft should not be applauded for creating it. It is a big move forward for Microsoft to open up a bit, and I am very glad they have, but Open XML does not seem like a very good standard. Of course, Open Document Format is not that great either. It is coming along, but has some glaring problems. The biggest difference is its openness, which means it can change without a single company's veto. It can evolve, and is likely to do so, and the ways it will evolve are likely to make it stronger, because they are open to argument and debate and influence by many sources. Does that mean everybody should adopt it now? No, probably not, but in the long term, ODF is likely to be the better bet for general use. I fully expect that Microsoft will one day support ODF natively, but not until they are convinced that that is what it will take to compete, much the way IE7 supports web standards better than IE6 because Microsoft saw the writing on the wall and wanted to stay in the game. I guess the question I have for you is, if Microsoft's applications are better, why wouldn't they compete on them alone? Open XML is a good internal format, but Microsoft could certainly create excellent translators to ODF, and could contribute heavily to making ODF a standard that met their needs. Then, if everybody wanted to use MS Office, they could. So, why doesn't Microsoft do that?
It seems every bit as fair as your question.

  • Ben
  • Anonymous
    October 24, 2006
    From the MS Office XML "standard":

    ..... 2.15.3.63 useWord2002TableStyleRules (Emulate Word 2002 Table Style Rules) This element specifies that applications shall emulate the behavior of a previously existing word processing application (Microsoft Word 2002) when determining the formatting resulting from table styles applied to tables within a WordprocessingML document. [Guidance: To faithfully replicate this behavior, applications must imitate the behavior of that application, which involves many possible behaviors and cannot be faithfully placed into narrative for this Office Open XML Standard. If applications wish to match this behavior, they must utilize and duplicate the output of those applications. It is recommended that applications not intentionally replicate this behavior as it was deprecated due to issues with its output, and is maintained only for compatibility with existing documents from that application. end guidance] Typically, applications shall not perform this compatibility. This element, when present with a val attribute value of true (or equivalent), specifies that applications shall attempt to mimic that existing word processing application in this regard. ....

    I'm not sure this kind of element should exist if it "cannot be faithfully placed into narrative for this Office Open XML Standard".
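To make the quoted clause concrete, here is a minimal sketch of how a consuming application might at least detect the flag, even if it chooses not to emulate Word 2002. Everything beyond the element name and its val attribute is an assumption: the namespace URI is the one used in the later published standard (the 2006 draft quoted above may differ), the placement under a compat element is assumed, and treating a missing val as true is only one reading of "true (or equivalent)".

```python
import xml.etree.ElementTree as ET

# Assumption: namespace URI from the later published standard, not the 2006 draft.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Assumption: spellings accepted as "true (or equivalent)".
TRUE_VALUES = {"true", "1", "on"}


def uses_word2002_table_rules(settings_xml: str) -> bool:
    """Return True if the settings fragment asks for Word 2002 table style emulation."""
    root = ET.fromstring(settings_xml)
    elem = root.find(f".//{{{W_NS}}}useWord2002TableStyleRules")
    if elem is None:
        return False  # flag absent: no emulation requested
    val = elem.get(f"{{{W_NS}}}val")
    if val is None:
        return True  # assumption: a bare element counts as "on"
    return val.strip().lower() in TRUE_VALUES


# Hypothetical settings fragment, written by hand for illustration.
SAMPLE = f"""<w:settings xmlns:w="{W_NS}">
  <w:compat>
    <w:useWord2002TableStyleRules w:val="true"/>
  </w:compat>
</w:settings>"""

if __name__ == "__main__":
    if uses_word2002_table_rules(SAMPLE):
        print("Document requests Word 2002 table style emulation")
```

Detecting the flag is the easy part; the guidance text itself says the behavior behind it is not described in the spec, so a consumer can only warn, ignore, or preserve the element.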

  • Anonymous
    October 25, 2006
    Ben, what you dislike about OOXML is the price for compatibility. Being 100% compatible with existing documents is an issue for both MS and its users. The only reason it is not so much of an issue for other applications is that they do not match MS Office feature-wise 100%, so a perfectly compatible format would not help there. A product that could compete with MS Office in terms of features would face the same compatibility issues MS does, only there is none, at least not yet. Still, OOXML bears the possibility of perfect compatibility, and any vendor is free to implement as much of it as they want or at least preserve document content they cannot process. Whether other vendors adopt OOXML (and to which degree) will be influenced by business decisions, so it's no measure for the quality of the standard. OOXML is not as good as ODF as a lean and mean, easily implementable standard. But ODF is not as good feature-wise and in terms of compatibility. Saying either one is better than the other means to completely miss the point. They have different goals and different tradeoffs. Microsoft supporting ODF natively would mean that for an ODF document, MS would have to disable or change MS-specific features in the UI; everything else would be frustrating for the user. I don't believe any other company in MS's situation would make this move. It would actually mean that MS Office apps would need to have two modes, one for compatibility with old binary formats, and one for compatibility with ODF. The user would then have to make a conscious decision about when to convert (and possibly lose information) and continue to work in the new format/feature set/UI. They would have to master two different modes without getting confused. Sounds like science fiction to me. Users don't care. MS could have tried to improve the ODF standard to include MS-specific features, true. I'm not sure if the other members would have accepted their proposals.
(They have good reasons not to, it's just a matter of justifying it publicly. Personally, I see the current propaganda machine as a sign that they probably wouldn't have hesitated to prevent or slow this influence using any arguments they could say without laughing out loud.) Still, let's say it could have been done. For a minute, ignore the facts that

  • MS had already made investments in its own XML format by then
  • the OASIS group was starting from the OpenOffice format (MS joining this would be the tail wagging the dog, no?)
  • that it was unclear how much and how fast MS could have influenced the progressing standard.

    You would still end up with problems. If everything works out fine, the format would include every major feature that differentiates MS Office (assuming that Sun and IBM would let that happen to ODF, again). You would still end up with either one of two problems: a) ODF would not accommodate every Office quirk from pre-2000 versions (so conversion would be lossy), or b) ODF would become OOXML, including all the complexity and backwards compatibility stuff you don't like. Who would win, then? Stefan
  • Anonymous
    October 25, 2006
    I think Stefan's last post here sums up some of the arguments nicely, although I disagree about what Microsoft would have to do to directly support ODF as a peer format to RTF, HTML and all the other formats Office 12 supports. No-one is asking them to /only/ support ODF in the current context, just to /support/ it. ODF was indeed designed with MS Office portability in mind and, as Stephen Walli points out, Microsoft would have had a profound influence on the format had they joined the OASIS group in 2002. I have switched formats before (Displaywrite -> Wordstar -> Wordperfect -> MS Word -> Smartsuite -> OpenOffice.org) and the nemesis scenarios don't hold water; I survived each time. What I want is for my next format switch to be the last one. I continue to support ODF because I believe a product monoculture to be harmful in the long term, and because Office 12 XML format does little to address that. I can completely understand Microsoft's desire to protect its monopoly and some of its MVP-types agreeing that the monopoly is their preferred choice over an open market. And I can understand how documenting a data dump of the internals of Office 12 makes it easier to handle Office 12 documents. And I am sure there are valid issues in the ODF specs, as in all specs. What I don't hear much from the Microsoft folks is an equal understanding of the inverse of each of those issues or any attempt to address them. That's why I don't drop by here much any more. I only found this entry because of a Stephen O'Grady link, since Brian didn't link to any of his critics (presumably to avoid sending us any traffic), so it didn't appear on any logs. That's characteristic of Microsoft's attitude here, unfortunately. ODF is still needed because that siege mentality still exists.

  • Anonymous
    October 26, 2006
    Simon, thanks for the praise, even if you seem to disagree with my post although it sums up everything so nicely. Still, I'm having a hard time relating your answer to the arguments I presented, so I'll try and make it easier for you:

  • Comparing to HTML/RTF: MS was free to make RTF accommodate everything they needed, so it was much closer to OOXML than to ODF. What do you compare here? What Word did to HTML was just horrible. You want this to happen to ODF?
  • "No-one is asking them to /only/ support ODF" Yes, they are. Read some of the comments here, for starters. Many ODF supporters claim that it was an evil move of MS to create OOXML in the first place.
  • "ODF was indeed designed with MS Office portability in mind" Yes, so that a reasonable conversion could be done. But perfect portability was not a goal, and that's what MS customers would certainly expect.
  • "... would have had a profound influence on the format ..." I would not expect anybody to say anything else; still, there would be good reasons for MS's competitors to be not so forthcoming when discussing the details. Standards are politics. They would damage their business, and there are good technical arguments too: for perfect compatibility with MS Office, ODF would have become OOXML. I think having two standards is the better way for both parties.
  • "I have switched formats before" So have I. You lose stuff. You cannot print important documents after conversion without checking every detail. There are conversion errors, and there are things that cannot be converted directly. Did your documents contain macros? You need more than the language. It's a little like JavaScript and DHTML: if the DHTML object models are not the same, it's no help that both browsers use the same JavaScript language.
  • "I believe a product monoculture to be harmful in the long term, and because Office 12 XML format does little to address that" This is only true because the only serious competitor already had very good support for the binary formats. OOXML gives you much easier interoperability, thus making it accessible for everyone. The binary stuff was only an option for huge project teams like Sun's; OOXML is within reach for small projects. OpenOffice is making good progress ending the monoculture problem by providing a free, platform-independent solution for those who don't need everything that MS Office provides. They are still far behind MS in number of users, but do you really think that's because of the format, which they do support after all? If not, ODF won't break the monoculture. It's just a good technical ground for a world after that monoculture.
  • "MVP-types agreeing ..." Your guess is correct, I'm not developing software for Linux. But does that mean I spend my evenings making up ridiculous arguments for OOXML to support MS's monopoly? Hardly. As someone who develops custom solutions for office platforms, not office products, I like some of OOXML, especially the good support for custom schemas and the separation of content and references. Besides this, I would like ODF just as well, anything based on XML in fact. I'm not going to touch the formatting details of any format anyway, so that's all fine with me. Still, I find it hard to believe how much misinformation is intentionally spread around OOXML, and how hard it seems to be for some people to believe what people like Brian say, because it makes perfect sense to me. I just don't like going to customers, trying to design a solution that includes office documents, and being confronted with stuff that seems to be taken straight from Slashdot.
  • "documenting a data dump of the internals of Office 12 ..." I think everyone who repeats this old allegation should know better. In order to reload what you have in memory, you need a format that contains all the information. Backwards compatibility sometimes is a mess, too. Take the picture format options of Word, for instance. Or the headline numbering options. How could a 100% compatible format NOT contain every single value, no matter how MS-specific this would be? Pejoratively calling this a data dump is not helpful in this discussion.
  • As for your last paragraph, Brian links to more OOXML critics and ODF supporters than I care to read. Post those links you miss here. BTW, your link is broken. Stefan
  • Anonymous
    October 26, 2006
    All these discussions leave me wondering: what kind of interoperability IS possible between different office products? I think there are three things that can be done:

    a) Agree on a set of features and their parameters. Every word processor can agree on paragraphs, page breaks, bold and italic, etc. The representations of those are interchangeable. But there are also things that can be handled in many different ways. Think about complex issues like positioning, anchors, or numbering. The internal representation of the program determines the way the user can interact with it, and vice versa.

    b) Make good converters. This can hardly be a perfect solution, though. Even if both programs have a feature for everything the user wanted to do with his document: programs don't capture the intention of the user when he designs the document, only what he does to the internal data representation. Without knowing the intention, a converter can only guess what the user would have done to the data of another program. Sometimes it's obvious, sometimes a heuristic approach has to be taken. Sometimes this is acceptable for users, sometimes it's not. For updates, it's usually not.

    c) A combination of a) and b) plus preserving the original information that cannot be losslessly converted (until it is touched by the user in the target program).

    So, the perfect way would be to standardize everything in office applications that has reached commodity status, and then allow products to innovate and compete based on conversion and preservation, right? Not quite. Unfortunately, you cannot standardize the behaviour of legacy versions. But then, there's only one family of legacy document formats you really have to consider (given the numbers of existing documents), and that's the binary MS Office formats. So, wouldn't it be perfect to standardize on their features first? Again, no, because they carry so many idiosyncrasies from the past. Supporting those is hard work, and a fresh start has many advantages. So, many vendors will choose not to compete with MS on its home turf (compatibility) and rather target new customers and those who can accept limited compatibility.

    So, is there a perfect way? I think not. But having ODF and OOXML (and acceptable converters between those) seems pretty good to me. Which one to use is a decision only the customer can make. Everything else seems to be just people trying to sell their products. What can be done to improve the situation? I think the only thing that can be done is adding good standardized format preservation support to both formats. Which is probably not half as easy as it sounds.

    One of the reasons this is so interesting is that this seems like an analogy for so many other types of data. UML models are one area where I've experienced similar problems, ranging from details in the representation to round-tripping issues. But office formats get so much more attention.
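Option c) above can be sketched in a few lines. This is only an illustration under stated assumptions, not how any real converter works: the feature names, the `Paragraph` class, and the rule "drop preserved data once the user edits the text" are all hypothetical, chosen to show the idea of carrying unconvertible information along untouched.

```python
from dataclasses import dataclass, field

# Hypothetical set of features both formats are assumed to agree on (option a).
SHARED_FEATURES = {"bold", "italic", "page_break"}


@dataclass
class Paragraph:
    text: str
    features: dict = field(default_factory=dict)   # understood by the target program
    preserved: dict = field(default_factory=dict)  # opaque source-only data, kept as-is


def convert(text: str, source_features: dict) -> Paragraph:
    """Map shared features directly; keep everything else as preserved data."""
    para = Paragraph(text)
    for name, value in source_features.items():
        if name in SHARED_FEATURES:
            para.features[name] = value
        else:
            para.preserved[name] = value
    return para


def edit_text(para: Paragraph, new_text: str) -> None:
    """Simulate a user edit: preserved data may no longer match, so drop it."""
    para.text = new_text
    para.preserved.clear()


def export_back(para: Paragraph) -> dict:
    """Write the paragraph back to the source format, restoring preserved data."""
    return {**para.features, **para.preserved}


if __name__ == "__main__":
    p = convert("Hello", {"bold": True, "legacy_numbering": "1.a.i"})
    print(export_back(p))  # untouched paragraph round-trips with its legacy data
    edit_text(p, "Hi")
    print(export_back(p))  # after an edit only the shared features remain
```

The hard part the comment points at is hidden in `edit_text`: deciding which preserved data is still valid after an edit is exactly what would need standardizing.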

  • Anonymous
    October 26, 2006
    Stefan, I think you misunderstand me. You say "what you dislike about OOXML is the price for compatibility", but I don't dislike OOXML. Actually, there are a few things I dislike, but nothing serious. What I dislike is that Microsoft claims that it is an open standard, whereas OOXML seems very much like a fully disclosed vendor specification. Mind, I think that is GOOD! I don't even mind the oddities much, as a vendor specification. But the idea that Microsoft wants it both ways, as a very implementation-specific format AND an open standard, is not going to work. But let me repeat... I don't object to OOXML, just the farce that making it a pseudo-standard represents. Yes, MS needs to support every one of its umpteen gazillion legacy documents. They should be praised for doing so, and for making the specification publicly available. They should even be praised for the covenant not to sue, etc., but those things still don't make this an open standard in the way ODF is. My argument is not about whether ODF or OOXML is better or worse (there are arguments for each side), but that ODF is an open standard that is designed to be improved for the good of the general populace, while OOXML is a basically closed standard that is designed to support Microsoft's many, many customers and gazillions of legacy documents. Telling me how much better one is than the other at supporting legacy MS documents hardly addresses my argument.

  • Ben
  • Anonymous
    October 27, 2006
    @Ben I think OOXML is as much a vendor specification as ODF is. Frankly, I think the current ODF specs are unusable without a reference implementation. There is no way people will be interoperable on ODF just from reading the ODF specs and building on that, without a reference implementation (unless they create only the simplest of Office files). That means currently the ODF specs are almost useless without the main implementation that OpenOffice provides (maybe supplemented by KOffice). Even though the OOXML specs are much more detailed, I would dare anyone to create fairly complex documents that are compatible with MS Office without looking at documents created using the reference MS Office implementation.

  • Anonymous
    October 27, 2006
    Ben, maybe I did misunderstand you a bit, or at least exaggerate the reservations you have. From what you said, there are definitely things that make ODF a better standard in the sense that it is cleaner and less complicated than OOXML for the subset of functionality that is covered by both. More generalized etc. And I'd agree. But besides this, I still disagree with you. MS Office documents are basically everything that's out there now. MS Office customers are the target of every other office product. Therefore, perfect compatibility could be a design goal for any other product too, and this would make OOXML a perfect candidate for an open standard. I understand that MS's competitors choose a different approach (as I explained in my response to Simon, for both business and technical reasons). But that does not change the fact that MS is actually offering something that could be a widely adopted open standard. It's not for MS to make the decision whether other products will adopt it as a primary format or not, though. We'll have to wait and see. If Sun and IBM really fail to establish ODF as a widely used standard (in number of users, not products), they might choose to join ECMA later. I think both the spec and the process are open enough to make it happen. @hal Many open standards require a reference implementation for real-world interop. I don't think this makes either ODF or OOXML less of an open standard. Stefan

  • Anonymous
    October 27, 2006
    @Stefan I agree that a reference implementation is often needed / useful. However, that makes claims of OOXML being a Microsoft format just as valid as claims of ODF being an OOo format, as those are the main implementations that will be used as a reference. The fact that OOXML is now an open format gives it the opportunity to move on and develop, and even to deprecate MS legacy-specific features in future versions.