Mapping documents in the binary format (.doc; .xls; .ppt) to the Open XML format

I wanted to call everyone's attention to a few interesting developments in Ecma's proposed disposition document related to the Office binary formats. There were a few comments from national bodies that asked about the documentation of the Office binary formats and the availability of those documents. We had already been talking about these issues in TC45 where there were a number of existing experts in the binary formats (including Apple, Novell, and Microsoft). Based on the feedback from the national bodies, Microsoft decided last week to take some additional steps in this area.

The first issue national bodies were interested in was easier availability of the documentation of the binary formats (.doc; .xls; .ppt). It sounded like the main concern here was the extra steps required to get the binary documentation. The current form of the documentation has been available since 2006; anyone could get it by sending an email to Microsoft, as described at https://support.microsoft.com/kb/840817/en-us. The documents were available royalty-free under RAND-Z. Hundreds of companies, including IBM and Sun, as well as government institutions, already have the documents. The new proposal we (Microsoft) made to Ecma TC45 was that we'd just get rid of the need to send an e-mail and provide the documentation for direct download under the OSP. TC45 thought this was a good solution, and here was the TC45 response to the national body comments:

Documenting the Microsoft Office "binary" file formats (i.e., .doc, .xls, and .ppt) (the "Binary Formats") is not the intention or in the scope of DIS 29500.

However, Ecma International  discussed this subject with Microsoft Corporation. Microsoft indicated that the documentation of the Binary Formats has been available royalty-free under RAND-Z to anyone who requests it by sending an email to officeff@microsoft.com, as described at https://support.microsoft.com/kb/840817/en-us.  Microsoft indicated that many companies and public institutions have asked for and received the Binary Formats since Microsoft started providing access to this documentation. 

Nevertheless, in response to requests for even easier access to the Binary Formats, Microsoft has agreed to remove any intermediate steps necessary to get the documentation, and will post it and make it directly available for a direct download on the Microsoft web site.  Microsoft will also make the Binary Formats subject to its Open Specification Promise (see www.microsoft.com/interop/osp) by February 15, 2008.

The second issue we had feedback on was an interest in the mapping from the binary formats into the Open XML formats. The thought here was that the most effective way to help people with this was to create an open source translation project to allow binary documents (.doc; .xls; .ppt) to be translated into Open XML. So we proposed the creation of a new open source project that would map a document written using the legacy binary formats to the Open XML formats. TC45 liked this suggestion, and here was the TC45 response to the national body comments:

We believe that Interoperability between applications conforming to DIS 29500 is established at the Office Open XML-to- Office Open XML file construct level only.

Prescriptive guidance on, or tools to enable, transformation from Microsoft Office  "binary" file formats (i.e., .doc., .xls, and .ppt) (the "Binary Formats") to Office Open XML formatted files is not the intention or in scope of DIS 29500.  As a result this request is outside the bounds of this process. 

It is important to note that substantial use is being made of both the Binary Formats and Office Open XML in the marketplace today.  Many products (such as OpenOffice.org) support the Binary Formats. Microsoft has indicated that many companies and public institutions have received the documentation for the Binary Formats, and are working with it at this time, and can create mappings between the Binary Formats and Office Open XML. Translators from the Binary Formats  to XML formats such as ODF have already been developed and are in wide use. For example, the Sun ODF Plug-in for Microsoft Office (https://sun.systemnews.com/articles/112/3/sw/18208) states that  "The plug-in allows users the ability to seamlessly convert Microsoft Office documents to and from ODF. The ODF plug-in supports Microsoft Word, Excel and Powerpoint".

Likewise, there is widespread use of Office Open XML in the marketplace today across platforms and applications.  A few examples include the implementations released by Apple (Mac OS X Leopard, iWork 08, iPhone), Adobe (InDesign), Microsoft (Office 2007, Office 2003, Office XP, Office 2000, Office 2008 Mac OS X), Novell (Suse Open Office), Google (Search / Preview), Mindjet (MindManager), Intergen, OpenXML/ODF Translator (Open Source project on Sourceforge), Dataviz (DocumentsToGo on Palm OS, MacLinkPlus on Mac OS X Leopard), NeoOffice, Altova (XMLSpy), MarkLogic (XML Content Server), Datawatch (Monarch Pro), QuickOffice  (QuickOffice Premier 5.0 on Symbian), Altsoft (XML2PDF Server 2007) and those under development by Corel (WordPerfect), AbiWord, Gnome (GNumeric),  Xandros, Linspire, Turbolinux and others.  These implementations are now available on many platforms, including Linux, the Macintosh, Windows, and handheld devices (PalmOS, Symbian, iPhone, and Windows Mobile).

The widespread use of both  Binary Formats and Office Open XML formats indicates that, at this time, 3rd party can use both formats and build mappings between them.

Nonetheless, Ecma International discussed this subject with Microsoft Corporation, the author of the Binary Formats.  To make it even easier for third party conversion of Binary Format-to-DIS 29500, Microsoft agreed to:

  • Initiate a Binary Format-to-ISO/IEC JTC 1 DIS 29500 Translator Project on the open source software development web site SourceForge (https://sourceforge.net/ ) in collaboration with independent software vendors.  The Translator Project will create software tools, plus guidance, showing how a document written using the Binary Formats can be translated to DIS 29500.  The Translator will be available under the open source Berkeley Software Distribution (BSD) license, and anyone can use the mapping, submit bugs and feedback, or contribute to the Project.  The Translator Project will start on February 15, 2008. 
  • Make it even easier to get access to the  Binary Formats documentation by posting it and making it available for a direct download on the Microsoft web site no later than February 15, 2008.  The Binary Formats have been under a covenant not to sue and Microsoft will also make them available under its Open Specification Promise (see www.microsoft.com/interop/osp) by the time they are posted.

We will modify DIS 29500 to include an informative reference to the SourceForge project.

I think that both of these items are great news for folks interested in documents and document file formats. There will be a lot more information around both of these pieces of work over the coming weeks, but I wanted to make sure people realized that this was already in the works.
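As a small illustration of the first step any binary-to-Open XML converter has to take, the sketch below tells the two kinds of file apart by their leading bytes. It is not part of the announced Translator Project. It relies on two well-known facts that are not stated in this post: the legacy .doc, .xls and .ppt files are OLE compound files, and Open XML documents are ZIP packages. The function names `classify_header` and `classify_file` are hypothetical.

```python
# Well-known container signatures (assumptions of this sketch, not from the post).
OLE_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")  # OLE compound file: legacy .doc/.xls/.ppt
ZIP_MAGIC = b"PK\x03\x04"                       # ZIP local file header: Open XML packages


def classify_header(head: bytes) -> str:
    """Classify a file by its first bytes: 'binary', 'zip-package' or 'unknown'."""
    if head.startswith(OLE_MAGIC):
        return "binary"
    if head.startswith(ZIP_MAGIC):
        return "zip-package"
    return "unknown"


def classify_file(path: str) -> str:
    """Read the first 8 bytes of the file at `path` and classify them."""
    with open(path, "rb") as f:
        return classify_header(f.read(8))
```

A ZIP signature only means the file could be an Open XML package; a real converter would still have to open the package and inspect its parts.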

-Brian

Comments

  • Anonymous
    January 16, 2008
    Brian Jones has some good news today for developers who want to work with both the binary Office formats

  • Anonymous
    January 16, 2008
    And therein is the beauty of the OSP... nothing to sign, free and available to all.

  • Anonymous
    January 16, 2008
    Very good news, indeed!  Thanks for the information.

  • Anonymous
    January 16, 2008
    The OSP is not sublicensable, thus forbidding GPL distribution: http://www.microsoft.com/interop/osp/default.mspx "This is a personal promise directly from Microsoft to you, and you acknowledge as a condition of benefiting from it that no Microsoft rights are received from suppliers, distributors, or otherwise in connection with this promise." Even Microsoft acknowledges the lack of sublicensability: http://www.microsoft.com/interop/osp/default.mspx#EYH "There is no need for sublicensing."

  • Anonymous
    January 16, 2008
    Any chance Microsoft will want to take the next step and turn the binary formats into an open standard ala Adobe/PDF?

  • Anonymous
    January 16, 2008
    @Rajiv: Oh noes! Please let them die already!!

  • Anonymous
    January 16, 2008
    Awesome. Looks like MS is really opening up. Now we can finally bypass the email-to-obtain process. And will the translator project support batch conversion from binary to OOXML?

  • Anonymous
    January 16, 2008
    Lori, The OSP is just fine with the GPL, the rights that you have are also enjoyed by recipients of the software (section 7). Miguel.

  • Anonymous
    January 16, 2008
    Now that the Open XML formats have picked up speed, what will we do with the "old" binary ones? The specification for

  • Anonymous
    January 16, 2008
    Brian Jones carries the news that Microsoft will make the Binary Formats (.doc; .xls; .ppt) directly

  • Anonymous
    January 16, 2008
    Brian Jones writes in his blog about some new developments with the Microsoft Office binary file formats.

  • Anonymous
    January 16, 2008
    I don't understand. Up until now, Microsoft has been claiming that access to the binary formats was difficult without a signed license, as these binary formats were so closely related to the way the Office applications lay out data structures in memory that it was impossible to produce a specification of the binary formats without also giving away the source code, or at least the inner workings of Office, which might change in the future anyway. This was the excuse, from what I remember, for not documenting the binary file formats for all these years (over a decade and more). I don't understand this sudden change of mind, which is welcome of course, but it raises the question of whether you were honest all this past decade in saying that the main reason you couldn't provide the binary formats was not any unfair competitive practice (as your competitors were claiming) but purely technical difficulties, since revealing the binary formats would enable competitors to learn the inner data structures of Office, which were difficult to document anyway. I don't understand. Also, it seems that all the noise nowadays is around Office document formats. However, Office does not only contain doc, xls and ppt; there are other binary formats in there that are also potentially useful for other applications to be able to read. What about OneNote and InfoPath? Shouldn't I be able to have access to my notebook on every device and forever? Or is it because for the time being OneNote and InfoPath are not very popular, and thus there is no strong competitor to release XML specifications first so as to prompt Microsoft to release their own? I don't understand. And what about Access and Outlook? What about an open database format? Isn't that important? If governments want their citizens to have access to their documents forever, then shouldn't the same apply to the data stored in public databases?
Or is Microsoft waiting for Oracle to announce an open database format first, wait for it to be standardized and then "remember" that you should create your own competing open database format? And then again "remember" that you should document the binary Outlook PST and Access+MSSQL formats as well? I don't understand. Where is the strategy, the ultimate goal? If the goal is simple access to information and easy portability, then you should at least document all the binary formats of Office and other Microsoft products and give guidance on their access, or at least give a roadmap or a formulated strategy.

  • Anonymous
    January 16, 2008
    In the fuss around the standardization of Open XML, we have somewhat forgotten

  • Anonymous
    January 16, 2008
    Until now the specifications for the Office binary formats were not accessible? Not quite right. 2006

  • Anonymous
    January 16, 2008
    [quote]Many of those third parties can only use Microsoft's binary formats due to reverse engineering - since the spec doesn't actually contain information on how to do things,[/quote] So it is actually a lot like the ODF specification that also does not tell you how to do things. Amazing.

  • Anonymous
    January 16, 2008
    @lori [quote]The OSP is not sublicensable, thus forbidding GPL distribution: http://www.microsoft.com/interop/osp/default.mspx[/quote] GPL distributions are about copyrights on source code, not about rights on a format specification. If you build source code based on the format specification, that source code has its own copyright and can of course be distributed under the GPL. And if in your source code comments you want to refer to the format specification, then it is common practice to do so by referencing the original source, and now that can also be done easily. So in fact the OSP licensing of this format is fully compatible with GPL implementations.

  • Anonymous
    January 16, 2008
    HERE is the Gotcha. I don't see any promises or commitments that ALL FILE FORMATS will ALWAYS BE AVAILABLE, much less in a timely manner to ensure interoperability.  Note the use of "existing versions". From the OSP page: Q: Does this OSP apply to all versions of the standard, including future revisions? A: The Open Specification Promise applies to all existing versions of the specification(s) designated on the public list posted at http://www.microsoft.com/interop/osp/, unless otherwise noted with respect to a particular specification (see, for example, specific notes related to web services specifications).

  • Anonymous
    January 16, 2008
    Documentation for the Microsoft Office binary formats has been publicly available for a longer

  • Anonymous
    January 17, 2008
    Much better URL for the Sun ODF Plugin 1.1 is to use the official page: http://www.sun.com/software/star/odf_plugin/index.jsp One can only wonder whether the bizarre page linked was selected on purpose?

  • Anonymous
    January 17, 2008
    Mike Lieman wrote: "HERE is the Gotcha. I don't see any promises or commitments that ALL FILE FORMATS will ALWAYS BE AVAILABLE, much less in a timely manner to ensure interoperability.  Note the use of "existing versions"." You seem to be totally confused.  Once ANY file format specification (not just OOXML) is published, it is ALWAYS AVAILABLE (unless everybody in the world accidentally loses their copy of the spec -- not likely!).  You can always refer in the future to the spec and write code that makes use of the spec. As an analogy, once the ASCII spec for character sets was written down decades ago, it became ALWAYS AVAILABLE.  Got it?

  • Anonymous
    January 17, 2008
    Anonymous Coward writes: "How there can be widespread use of OOXML when the spec is in turmoil in the ISO/ECMA process and nobody can say what will become of it?" To answer your question:

  • The spec is not "in turmoil".  It is undergoing the usual process of updating that happens to pretty well ALL standards.  The world has always been able to deal with standards that are improved.  You seem to be under the misimpression that a standard never changes.  On the contrary, they almost always do.  That's a normal part of the standards process.
  • You are technically correct that the ISO has not yet decided who will maintain the standard.  But, it is nearly certain that they will appoint the ECMA, who has volunteered to do this.  
  • There IS widespread adoption of OOXML.  Obviously, the people who have to make real decisions about its use (as opposed to just theorists like yourself), have no qualms about its long-term viability.
  • Anonymous
    January 17, 2008
    Nektar writes: "I don't understand." You are right, you don't. The documentation for the binary formats has been freely available since Office 97.  You just had to ask Microsoft for a copy. Many people and companies (IBM, Sun, etc.) have taken advantage of this over the years. The only difference now is that Microsoft is making the process simpler.  Instead of asking for a copy, you will now be able to download it directly yourself. My guess is that you were probably just taken in by incorrect statements made by the anti-Microsoft folks about this matter.  That's likely the source of your confusion.

  • Anonymous
    January 17, 2008
    Who cares about doc, xls and ppt?  The big disappointment is that it doesn't cover Visio .vsd and template formats

  • Anonymous
    January 17, 2008
    Hey this is great news! Can you please point me to where can I get the URL for the documentation of binary Outlook .PST files for Outlook 2003?

  • Anonymous
    January 17, 2008
    @hAi "GPL distributions are about copyrights on source code." Well, if you followed the discussions about GPLv3 and if you read the GPLv2, there are certain provisions for redistribution if you own a patent which is covering the software.

  • Anonymous
    January 17, 2008
    @Nektar InfoPath does not use a binary format for its data files; they are just xml.  An InfoPath solution template file (.xsn) is simply a cabinet file.  If you want to see what's inside, just change the .xsn to .cab.  You should now be able to open the file in Windows. You can also see an extracted .xsn by using "extract form files" in InfoPath 2003 or using "save as source files" in InfoPath 2007. See http://blogs.msdn.com/infopath/archive/2004/05/04/126147.aspx for more info
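The claim that an .xsn file is just a cabinet can also be checked without renaming anything. The following is a minimal sketch, assuming the standard cabinet signature (the four ASCII bytes "MSCF" at the start of the file), which is not stated in the comment itself; the function names `looks_like_cabinet` and `is_cabinet_file` are hypothetical.

```python
# Cabinet files start with the ASCII signature "MSCF" (assumption of this sketch).
CAB_MAGIC = b"MSCF"


def looks_like_cabinet(head: bytes) -> bool:
    """Return True if the given leading bytes carry the cabinet signature."""
    return head[:4] == CAB_MAGIC


def is_cabinet_file(path: str) -> bool:
    """Read the first 4 bytes of the file at `path` and check the signature."""
    with open(path, "rb") as f:
        return looks_like_cabinet(f.read(4))
```

For example, `is_cabinet_file("form.xsn")` (with a hypothetical file name) should return True for a solution template if the comment's description holds.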

  • Anonymous
    January 17, 2008
    @Brian Jones, Congrats for making it to techmeme and slashdot. I'll get back to you when the dust settles. Way too crowded here right now.

  • Anonymous
    January 17, 2008
    Miguel, OSP GENERAL Q2: "You must agree to the terms [of the OSP] in order to benefit from the promise" http://blogs.msdn.com/brian_jones/archive/2008/01/16/mapping-documents-in-the-binary-format-doc-xls-ppt-to-the-open-xml-format.aspx GPL 3.0: Each time you convey a covered work, the recipient automatically receives a license from the original licensors, to run, modify and propagate that work, subject to this License ... You may not impose any further restrictions on the exercise of the rights granted or affirmed under this License. http://www.fsf.org/licensing/licenses/agpl-3.0.html I think I cannot release a product under the GPL if it includes IP that requires the user to agree to the OSP in order to exercise his/her standard GPL rights to run, modify and propagate that work. You say that this does not matter and the program could simply be "referencing the original source". But the OSP specifically covers rights over "making, using, selling, offering for sale, importing or distributing any implementation", not just rights to use the documentation. So I think you are missing the point.

  • Anonymous
    January 17, 2008
    Please tell me honestly what about the binary format of MS Access with random bytes inside coming from memory dumps (!). For many of us it is a big problem.

  • Anonymous
    January 17, 2008
    >> January 17, 2008 1:29 PM. That first link to the OSP GENERAL Q2 should be http://www.microsoft.com/interop/osp/default.mspx#EYH Also the previous replies I referred to are from Brian as well as Miguel.

  • Anonymous
    January 17, 2008
    Re: Pete Austin First of all, let me say that I'm very encouraged that Microsoft are making some of their old/legacy binary file formats more available/accessible and possibly even usable. Even if some people are not satisfied (and I can probably be included as one of those), I think Microsoft should be congratulated and encouraged, to positively reinforce this kind of behaviour. Thank you, Microsoft. I agree with Pete Austin's analysis and would hope that Microsoft can be persuaded/encouraged/cajoled into ensuring that there is no question of legal problems with GPL (2 or 3) or BSD use. Perhaps dual licensing with OSP, GPL/GFDL and BSD or similar? I don't know if this covers the old binary formats for Microsoft Project, Visio and Outlook - I hope so, as there is an awful lot of important legacy information tied up in documents in those formats as well. I'll say again - thank you for this step, and I hope it is the first of many in the same vein of openness. HopefulPedant

  • Anonymous
    January 17, 2008
    Mathew wrote: "Who cares about doc, xls and ppt.  The big disappointment is that it doesn't cover Visio .vsd and template formats" It doesn't at present.  They can't do everything at once. But I just tried a "save as" in Visio, and amongst the file types were some interesting ones:

  • XML drawing (*.vdx)
  • XML template (*.vtx)
  • XML stencil (*.vsx)
  • SVG (*.svg)
  • Compressed SVG (*.svgz)
  • Web Page (*.htm)

That sounds like a good list to me, for someone who wants to build interworking solutions.  I could be wrong -- I'm not a Visio wiz.

  • Anonymous
    January 17, 2008
    Dave S writes: "If the documentation has been so freely available, why hasn't any industrious hacker provided a fix for the current Excel vulnerability?" I don't follow security issues that much, so I don't know what vulnerability you are mentioning.  But, there is an obvious answer to your question:  Since you say it is easy to do given the documentation, why don't you do it yourself and report back to this blog on your fix?

  • Anonymous
    January 17, 2008
    Hi, The appearance is that Microsoft is trying to move in the right direction with this. I think that would be an awesome thing on a broad range of topics, from developer effectiveness to the efficiency of the global economy. It genuinely appears that you are trying to move in the right direction here. But one thing confuses me. It seems that because the OSP does not confer redistribution rights, there is skepticism on the part of many potential developers. I think that skepticism may be unreasonable, or it may be based on past experiences, and it doesn't necessarily matter which it is. The skepticism exists and is a hindrance if the real goal is to make these standards work for all of us. The basis of that skepticism seems to be that it is theoretically possible for Microsoft to stop distributing the documentation and to enforce their copyright, thus preventing others from redistributing the documentation. I find it difficult to believe that that is the real intent of Microsoft, so it seems like a silly thing to have redistribution stand in the way of this very worthy effort. So my question is, why not use some equivalent of CC-by-nc-nd? That is, a license which allows non-commercial, attributed, unmodified redistribution of the documentation. Using a license like that would eliminate the (perhaps unfounded) fear that developers feel regarding the future availability of the documentation. Here's a link to CC-by-nc-nd: http://creativecommons.org/licenses/by-nc-nd/3.0/us/

  • Anonymous
    January 17, 2008
    I have to agree with "Anonymous Coward" that no matter how this ends, OOXML/MSOffice is going to be just what we've seen with HTML+CSS/IE. The HTML spec is a walk in the park compared to OOXML. Yet web developers (companies, individuals, communities) around the world are frustrated with the current situation, where one implementation (IE) differs from the spec and forces developers to do double work, first checking the spec and then checking the de facto reference implementation. That means vast extra costs for those who just wanted to create, innovate, and provide information and content. Sure, IE7 improved things a bit (alas, it also broke some sites) and IE8 is promised to take things even further. But why did this happen, and why has Microsoft finally been forced to play by the book, as everyone else has been doing for a long time already? Only one reason: Firefox's market share sky-rocketed in the past few years while Microsoft showed nothing but a complete lack of interest in IE. I'm dumbfounded to read some of the naive comments about how OOXML is somehow supposed to be vendor neutral. One can't create a vendor-neutral standard when there's one vendor dominating the scene already. If the dominating application/vendor differs from the standard (no matter how much), then the rest of us have only one option: to do the double work and check our implementation both against the spec and against the quirks needed to be in line with the dominating implementation. I just thought about the previously mentioned HTML/IE saga again for a moment. There are two possible reasons why MS did not make IE compliant with HTML+CSS: 1) they were incompetent, or 2) they did not want to. Nobody sane believes the first option, really. That leaves only one explanation: MS deliberately chose to break the standard in IE in order to get and hold a dominant market position.
Too bad for them: they could have kept it for years to come, but neglecting IE development for too many years backfired and let competitors catch up, to the great benefit of all Internet surfers. But it is important to realize here that IE seemingly followed the standards, and those who were not familiar enough with these issues said that all was good. But it was not, as already said. Web developers, competing browser developers, tool developers, etc. have suffered horrendously due to these "slight omissions" or whatever they were called. Without getting too sentimental here, I must wonder how much better we, humanity, could have used all those countless man-months that we've wasted chasing the tail lights of one dominating but standard-neglecting application. For the sake of all that's good in life, let's hope we don't need to spend the next 5-10 years on that dark road again before we all finally understand that we need to build on collaboration and equality, not on something that is seen and taken as a godsend by those who don't know the history and are thus ready to repeat the mistakes already made.

  • Anonymous
    January 17, 2008
    Do folks feel that OpenOffice does a good job of following the ODF spec? -Brian

  • Anonymous
    January 17, 2008
    "The OSP is not sublicensable, thus forbidding GPL distribution" Others have addressed this, saying that this is not an issue in this case (I don't know about that, or care). But I have a broader point to make on this, and that is that the GPL is the most unfriendly license in existence when it comes to "getting along" with other licenses.  Many OSI licenses are compatible with almost any OSI license except the GPL, and it's the GPL's fault for being so stringent.  And GPL fanboys demand that every other license bend to the GPL's guidelines; never once has a GPL fanboy admitted that maybe the GPL should bend to the other licenses' guidelines. My second point: Why is the ability to "sublicense" needed when anyone can get the license from the original vendor themselves?  If a vendor is giving a "free" license to everyone, then sublicensing is not needed.  Yet this sublicensing issue seems to always come up as a reason for GPL-incompatibility, and it's just another of the GPL's red herrings.  Maybe the GPL should be changed to read, "A license need not be "sublicensable" in order to be GPL-compatible if the license in question is freely obtainable from the original source", and that would solve this tired issue that comes up over and over.  Of course, it'll never happen, because the GPL wants all licenses to cater to it, never the other way around.

  • Anonymous
    January 17, 2008
    I otherwise agree with erik's lengthy post, but have two remarks:

  • minor: I definitely would not call it "chasing the tail-lights" when other browsers tried to mimic IE's non-standard behavior. Surely IE6 was a good product when it came out, but (as said) its development stagnated so badly that for a few years there have been no IE tail-lights to chase, only non-standard features of an inferior product to mimic (just think of Safari, Opera, Firefox).
  • in general: I think only time will tell how others do in this game. If, and only if, MS Office produces fully compliant OOXML by default rather soon after the final version of OOXML is blessed by ISO (if ever), then I think things will be much different than with IE/HTML. But if it so happens that MS Office keeps producing OOXML with those "slight omissions" etc., then I too fear that history will repeat itself.
  • Anonymous
    January 17, 2008
    Brian, you probably have not seen it yet, but the latest blog post from Jesper addresses this question, at least in regard to SVG: http://idippedut.dk/post/2008/01/Embrace-and-extend---SVG-revisited.aspx If you don't have time to read it in detail, just skip to the bottom where he shows SVG "images" in OpenOffice.

  • Anonymous
    January 17, 2008
    Brian Jones posted yesterday about the availability of the docs for the binary file formats of Office

  • Anonymous
    January 17, 2008
    Ian, Great link, thanks for posting. I had heard rumors that it wasn't really true SVG support, but never had the time to look into it.


Bruno, It was even called the OpenOffice File Format up until a few months before the finished version 1.0 of the standard. :-)

Folk wondering about the OSP and GPL issues, The whole point of the OSP (and IBM’s ISP and Sun’s patent statement for ODF) is that they are not licenses, they are promises not to assert patents in specific situations.  Because they are not licenses there is nothing to sublicense.  Because they are unilateral promises, there is nothing anyone has to do or agree to in order to benefit from these promises.  The promise applies equally and simultaneously to the developer of the code, the distributor of the code and the user of the application that is an implementation of any of the specifications listed under these various promises.  In these instances no one has to provide anything to anyone because they have already been provided to them in advance. -Brian

  • Anonymous
    January 17, 2008
    Someone: Microsoft has already released a tool (Office Migration Planning Manager/Office File Converter) to search for and batch convert old binary files to OOXML. It'll probably be updated to ISO DIS 29500 when the format is approved. See: http://www.microsoft.com/downloads/details.aspx?FamilyId=13580CD7-A8BC-40EF-8281-DD2C325A5A81&displaylang=en Brian, aside from the SVG non-support, I can't say whether Star/OpenOffice does a good job of following the ODF spec. In fact, can anybody (outside of Sun?) If you'd really like an answer to that question, probably the best way to go about getting one would be to fork OpenOffice. Simply download the source code and replace all "OpenOffice/Sun" instances with "Microsoft." That'll sure invite scrutiny!

  • Anonymous
    January 17, 2008
    It might be that Microsoft will actually make specifications on how to read older .doc formats available. It might be that they will backpedal, or simply release incomplete specifications. Let's hold the applause until we actually see a working implementation of a translator into ODF or Adobe .pdf that gets 100% of the document formatting right. And not just any implementation. Only a GPL implementation will do. Oh, and Miguel, I applaud your legal acumen. Where the official Microsoft website explains that it cannot guarantee that the terms of release will be compatible with the GPL since, as it claims, those terms mean different things to different people, you give us that assurance. Brilliant!

  • Anonymous
    January 17, 2008
    Just noticed some good news earlier this week on Brian Jones blog about new initiatives to make it easier

  • Anonymous
    January 17, 2008
    I notice that Access (mdb/adp) and the native data formats used by other applications that come under the "Office" umbrella - e.g. Publisher, Visio, Project, OneNote - aren't included. Why is this?

  • Anonymous
    January 17, 2008
    It has long been possible, even in older Microsoft Office versions (< 2007), to [...] the new

  • Anonymous
    January 18, 2008
    Office Quiz: I'm liking Ian Moulster's quiz on Office 2007. How many can you get - even I struggled

  • Anonymous
    January 18, 2008
    Brian, "Do folks feel that OpenOffice does a good job of following the ODF spec?" Yes - I actually do think that. I think OOo sticks pretty much to the "technical reality" of ODF and the ODF-XML is pretty easy to understand and navigate through. I have three general critiques of OOo and the way it utilizes ODF:

  1. It really abuses (in the worst possible meaning) section 2.4 of the ODF-specification to a point where true interoperability is really hard to achieve. Just look at the values stored in settings.xml
  2. The naming of parts/objects in the ODF-package is really hard to figure out. Visual representation of embedded objects are stored as files with no extension and it seems that OOo quite often saves these visual "thumbnails" as GDI+-files which I personally believe is unnecessarily complex.
  3. OOo doesn't seem to enforce the package-reference-model by using the manifest correctly. :o)

  • Anonymous
    January 18, 2008
    @Ian, Try to keep up - www.microsoft.com/technet/security/advisory/947563.mspx The correct answer to my question is: no hacker will have a fix for this soon because the documentation has not been freely available. There are, apparently, no non-Microsoft parsers that can detect or repair the defect. An interesting point on the advisory is "Users who have installed and are using the Office Document Open Confirmation Tool for Office 2000 will be prompted with Open, Save, or Cancel before opening a specially crafted document that is attempting to exploit this vulnerability." Doesn't this apply equally well to -any- file opened through the ODOCT? Yes it does. www.microsoft.com/technet/archive/office/office97/downloads/confirm.mspx?mfr=true "Microsoft has released a tool that, once run, will require confirmation before opening any Office document (Word, Excel, PowerPoint, or Access) launched from within Internet Explorer" I do recall, however, that MS was previously unhappy about other organizations offering fixes to vulnerabilities. The concept that, with the documentation, the fix would be easy is an interesting one. How can it be that fixing a single problem is hard and implementing a complete application is easy? Aren't hundreds of implementations the current claim for MSO-XML? Hasn't one of those implementors looked at their own parser?

  • Anonymous
    January 18, 2008
    ... and then what will /. have to write about? NOTHING, I tell you! NOTHING!!! Wait, that would be a good thing, huh?! SWEET! Please carry on... Brian Jones: Open XML Formats : Mapping documents in the binary format (.doc;...

  • Anonymous
    January 21, 2008
    The news about the binary Office file formats has reheated the Open XML versus ODF debate

  • Anonymous
    January 22, 2008
    Troy, then don't use the format. Open XML is very clear in what it was designed to do. If you don't have a need for it, use something else. We were asked by the European Commission years ago to submit our XML formats to a standards body. That's what we did. We then worked with Ecma on modifying it to ensure it was interoperable across platforms, and we had to change the Office product based on those changes. -Brian

  • Anonymous
    January 22, 2008
    I want to share what I think is excellent news, which I am sure will benefit developers

  • Anonymous
    January 28, 2008
    Brian: Worry not, I will use the format whenever I have need for it. I simply think this change, despite being a change for the good, is oversold in dramatic fashion despite being "very clear in what it was designed to do". Not necessarily oversold by you, but oversold here, yes. shrug

  • Anonymous
    January 28, 2008
    :-) I think that anytime you look at a technology that folks are trying to evangelize, it eventually will feel oversold to those following along closely. This is due to the fact that you still have so many people who are not yet fully aware and you need to reach out to them as well. I still have daily conversations (even with Microsoft partners) where they don't realize we've moved to an XML based open file format (let alone the standardization, etc.). -Brian

  • Anonymous
    January 30, 2008
    The elusive specifications of the Office binary format are here

  • Anonymous
    February 15, 2008
    Wow ... great one, here are the details - http://blogs.msdn.com/brian_jones/archive/2008/01/16/mapping-documents-in-the-binary-format-doc-xls-ppt-to-the-open-xml-format.aspx

  • Anonymous
    February 15, 2008
    As promised last month , the binary documentation (.doc, .xls, .ppt) is now live. In addition to this,

  • Anonymous
    February 15, 2008
    Some three or four years ago, I don't remember why, Microsoft decided to offer under

  • Anonymous
    February 15, 2008
    The Microsoft Office Binary File Formats (.doc, .xls, .ppt...) are now available for everyone under the Open Specification Promise, OSP. This is good news for all of you working with the traditional b...

  • Anonymous
    February 18, 2008
    The binary documentation (.doc, .xls, .ppt) is now live. In addition to this, the project to create an

  • Anonymous
    February 18, 2008
    Brian Jones, Senior Program Manager just broke the news in his post today. Quoting from him: "As

  • Anonymous
    February 29, 2008
    Two weeks back we made a commitment to open the binary formats and place them under our Open Specification

  • Anonymous
    May 12, 2008
    I'm catching up with a bunch of Open XML blogging from ages ago, so apologies if some of these are old