Some background information on the reasons we have moved to an XML format as the default in Office "12"

This continues to be a really good discussion and I want to thank everyone who has taken time to post their comments. I know there is a lot to read through and it can get a bit confusing at times, so I'm really glad to see so many of you are up for it. There were a number of comments in the last post where folks said that ultimately there was just a lack of trust in our motivations. That really concerns me so I wanted to get back to a discussion around our motivations. I think that after you see clearly why we are doing this work, you'll probably have a better time understanding where it's going and you'll see why we aren't going to "pull the rug out from under anyone." As I mentioned, Steven Sinofsky addressed this over a year ago: https://www.microsoft.com/office/xml/response.mspx

I think the first place to start is to quickly dispel a common myth I've heard a lot over the years. Some people seem to think that the file formats have some special importance in some kind of competition. It's just not true; at least it hasn't been true for a long time. You have to remember that Microsoft first started developing Office over 20 years ago. At the time, the binary formats were the state of the art. They were fast, small and optimized to take advantage of the feature set of the product. At the same time, they were brittle and not very good if you wanted to reuse the data in the document or attempt some kind of interop with another program. These limitations weren't intentional -- that was just the state of the technology. Corel and Lotus had the same issue. Over the years, we've continued to make some adjustments to those formats, but it's been very incremental and we've placed a high value on backward compatibility. The last time we made a breaking change to the binary formats of today goes back to the start of the Office '97 project (Jan '94 I believe). The main issue we were worried about at that point was that we wanted the format to mirror our internal memory structures so that it was easy to read and write portions of it to disk. We were still concerned with how documents behaved when stored on a floppy disk, and with performance in low memory environments. Document behavior within business processes wasn't on our mind at all. The typical user in the early 90’s didn’t have a LAN and didn’t share docs all that much – they mainly printed. It was also standard procedure for apps to upgrade the file format with each release to enable new features to be saved. Since people didn’t share files much (other than in their printed form), the main scenario was a single user updating their own version.
Of course old version files could be read in the new version, but there was no need to output the old version since you didn’t use it anymore now that you had the new one. In any event, this is hardly the behavior of a company that thinks it has some tremendous value in its particular format. To me, it seems more consistent with a company that cares about its customers' experience...

For Office 2000, the internet wave had hit strong. We all thought that web pages were the documents of the future and we wanted Office applications to have the ability to save and edit those documents. So, we spent a ton of time (almost 25% of the overall dev budget) making it possible to save any Office document in HTML. Anyone who has used our HTML functionality extensively knows that it's hard to balance HTML simplicity with document fidelity. At the time we were very proud of our work here and couldn't wait to see it take off. I believe that we learned a lot from that experience. Our scenario was that people would start saving “docs” as HTML on their intranet sites and browse them with the browser. We viewed the browser as “electronic paper” that we had to “print” to (i.e. perfect fidelity). We had already got a lot of feedback from our Word97 Internet Assistant add-in that any loss of fidelity when saving as a web page was unacceptable and a “bug”. As it turned out, this usage scenario did not become as common as we thought it would and a zillion conspiracy theories formed about why we “really” did it. Many people assumed that a better approach would have been to save as “clean” HTML even if the result did not look exactly like what the user saw on the screen. We felt that the core office applications (other than FrontPage) were not really meant to be web page authoring tools, so we focused on converting docs to exact replicas in HTML. We didn't want people losing any functionality when saving to HTML so we had to figure out a way to store everything that could have existed in a binary document as HTML. We thought we were clever creating a bunch of "mso-" css properties that allowed us to roundtrip everything. HTML didn't take off in the same way we had expected, and today, the main use for Office HTML is for interoperability on the clipboard, though of course the biggest use is within e-mail (WordMail).

For Office XP (started in 1998), we really started thinking seriously about how Office documents were used outside of the applications. Of course the SGML folks (like Charles Goldfarb and Jean Paoli who has been working on this stuff for over 2 decades) have all been saying that for ages and they were right! We had spent so much time focusing on making it really easy to create documents, but hadn't thought a lot about what happens once those documents are created. This is one of the benefits we saw in a new feature called SmartTags. We not only wanted to give you useful actions to take on the content of your documents, we also wanted to make it possible to tag that content so that it could be leveraged by other processes. We also built the first XML file format in Office, SpreadsheetML. Folks on Wall Street and in Finance offices in particular had wanted ways to pull information out of their financial models. For equity research reports, they valued both the speed with which they could publish a report, and the accuracy of the data in the report. By storing their Excel models as XML, it made it easier to quickly pull data out without having to run Excel. This meant they could run their code on a server, and then use that data to verify that the information in the reports was accurate. This was really just the beginning though. The two big problems were that the SpreadsheetML format wasn't full fidelity (meaning not everything in the file could be saved as XML), and there wasn't an XML format for Word, which they used to generate the reports.
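
To make "pull data out without having to run Excel" a bit more concrete, here is a minimal sketch in Python. It is not code from any customer; it assumes the Office XP/2003 SpreadsheetML layout (Workbook / Worksheet / Table / Row / Cell / Data in the urn:schemas-microsoft-com:office:spreadsheet namespace). The function name read_cells, the sheet name "Model" and the sample figures are made up for illustration, and the sketch ignores sparse rows that use ss:Index.

```python
import xml.etree.ElementTree as ET

# Namespace used by the Office XP/2003 SpreadsheetML format.
SS = "urn:schemas-microsoft-com:office:spreadsheet"
NS = {"ss": SS}


def read_cells(xml_text, sheet_name):
    """Return the rows of one worksheet as lists of values (hypothetical helper).

    Numbers are converted to float; everything else stays a string.
    Sparse rows (cells positioned with ss:Index) are not handled here.
    """
    root = ET.fromstring(xml_text)
    rows = []
    for worksheet in root.findall("ss:Worksheet", NS):
        if worksheet.get(f"{{{SS}}}Name") != sheet_name:
            continue
        for row in worksheet.findall("ss:Table/ss:Row", NS):
            values = []
            for data in row.findall("ss:Cell/ss:Data", NS):
                text = data.text or ""
                if data.get(f"{{{SS}}}Type") == "Number":
                    values.append(float(text))
                else:
                    values.append(text)
            rows.append(values)
    return rows


# Made-up sample workbook, only for illustration.
SAMPLE = """<?xml version="1.0"?>
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
          xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet">
 <Worksheet ss:Name="Model">
  <Table>
   <Row>
    <Cell><Data ss:Type="String">Revenue</Data></Cell>
    <Cell><Data ss:Type="Number">1250.5</Data></Cell>
   </Row>
   <Row>
    <Cell><Data ss:Type="String">EPS</Data></Cell>
    <Cell><Data ss:Type="Number">2</Data></Cell>
   </Row>
  </Table>
 </Worksheet>
</Workbook>"""

if __name__ == "__main__":
    for row in read_cells(SAMPLE, "Model"):
        print(row)
```

A server-side check could compare values read this way against the figures quoted in a report, which is the kind of verification described above.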

In Office 2003, we really started to gain a lot of momentum around XML. We had heard from a number of big customers that they needed XML support for their Word documents. People were trying all kinds of hacks on top of the Object Models to produce XML that they could work with. We had Wall Street firms with the need to integrate with XML more dramatically than we had imagined, so that they could do structured authoring with repurposable data. We had law firms that were trying to build solutions that could automatically generate legal documents based on data about who was involved in the case, as well as business logic around what pieces of content were required for that case. We also were getting a lot of demand for supporting other people's existing internal schemas. Not only did people want the Word document itself represented in XML, they also wanted to add their own XML markup to the files. Let's take a government office as an example here. Imagine they have a template that folks can use to submit to receive a permit. While it's nice that the formatting information can be represented in XML, they don't care as much about what's bold, numbered, or any other kind of random formatting. What they do care about is the name of the person that submitted the permit; what their address is; and what type of work they are seeking a permit for. Those things can all be labeled using content controls and custom XML.

It was this support for both reference schemas (SpreadsheetML and WordprocessingML) in combination with support for customer defined schemas (your own XML) that finally made it possible for the content of Office documents to play a role in business processes. We had moved from the world of the Office document being a black box that only had a small collection of meta-data scrawled on top; to being an open, interoperable, extensible, and extremely valuable piece of business processes.

At the same time, there are zillions of documents out there in older binary formats. We had to ask ourselves "who is going to take care to make sure those older documents have a path forward?" "Who is focusing on doing the hard work to preserve fidelity between the new and the old?" We're doing that. We're making a deep investment in this compatibility to make sure our customers have a very good experience.

Now we move to Office "12". We are still building on the momentum we started over 6 years ago. Not only are we improving the XML formats so that they can represent every Word, PowerPoint, and Excel document out there, but we are making it the default format. We viewed this as something that we absolutely had to do this version. Office documents are so much more important as elements of business processes than we had initially been giving them credit for. You may have seen how we now talk about Office as a system. This is because it's no longer about the document's behavior in the application. It's about the entire document lifecycle. We have helped ourselves in all kinds of ways that no one has really thought about (or at least written about) yet. We can build smarts into Windows SharePoint Services so that the server can actually look into the document, make decisions based on the document content, write data back into the document, all without having to run application code. We have a world where customers need to track and audit parts of documents that they never needed to do before.

We have customers in equity research who can't wait for these new formats with the content controls and custom XML support. The speed with which they will be able to publish their documents, while at the same time meeting the increasing regulation requirements is amazing. All the information within each research report is available to them. The system used to consist of printing out the report and having humans read through each one verifying the financial figures and making sure they had all the necessary disclosures. Now that can just be an easily automated piece of the larger workflow.

There is a customer (a bank) that we've been meeting with that generates documents on demand for all their loans. They are currently running Office 2000. These documents are built using smaller document fragments, and the logic for which fragments are used is based on the details of the particular loan. The data is then pushed into the document using the Word Object Model to find bookmarks and push the data into the relevant bookmarks. They do this in an automated fashion and turn out thousands of these documents a year. They currently have over 70 servers each with Word 2000 installed to turn these documents out in an automated fashion. Word isn't supported running in an unattended fashion, but they've decided to do it anyway (they didn't really have a choice). Now with the new XML formats and the support for custom defined schema, generating these documents will be a snap. It wouldn't even take up one full machine's resources. It will only need to consist of a small bit of code to handle the business logic. The code to build the document itself will only be a few lines.
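
As a rough illustration of what that "small bit of code to handle the business logic" might look like, here is a sketch in Python. Everything in it is hypothetical: the element names (loan, borrower, amount, fragments), the fragment names and the selection rules are invented for the example and are not the bank's schema or any Office schema. It only shows the shape of the idea: decide which fragments apply, then emit the loan data as XML for a document to bind to, with no Word automation involved.

```python
import xml.etree.ElementTree as ET


def select_fragments(loan):
    """Pick document fragments from loan details (hypothetical rules)."""
    fragments = ["cover_letter", "terms"]
    if loan["secured"]:
        fragments.append("collateral_schedule")
    if loan["amount"] > 100_000:
        fragments.append("large_loan_disclosure")
    return fragments


def build_loan_xml(loan):
    """Serialize loan data and the chosen fragments as a small XML document.

    The element names are assumptions made up for this sketch.
    """
    root = ET.Element("loan")
    ET.SubElement(root, "borrower").text = loan["borrower"]
    ET.SubElement(root, "amount").text = f'{loan["amount"]:.2f}'
    fragments = ET.SubElement(root, "fragments")
    for name in select_fragments(loan):
        ET.SubElement(fragments, "fragment", name=name)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    example = {"borrower": "A. Example", "amount": 250000, "secured": False}
    print(build_loan_xml(example))
```

How the resulting XML gets placed into a document package is left out here, since that depends on the final format details.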

The last example I have is one that benefits us in Office. Today, we have a couple thousand specifications that we've written for the Office "12" project. For each spec, there are a number of required sections that people need to fill out based on different processes we have for our design. The folks driving any of those processes need to be able to make sure that everyone has filled out the proper sections. When the files were all binary documents, we had to automate Word to be able to do this check. The automation had Word open the file, find the range of text for the specific section, and see if it was filled in. It would take about 8 hours to run the check across those few thousand documents. Because of this we only ran the check every couple of weeks, and it would have to kick off at night when folks were leaving and be checked in the morning. Often the check would fail, so we'd wait until the next night and run it again. At PDC the other week, I showed a similar collection of documents (actually it was only about 300). These documents were all stored in the new format though. I wrote a small amount of VB.net (30 lines of code) that iterated over all those documents and returned the author, counted all the paragraphs, and counted how many comments there were. To run that solution (which was already more complex than what we were trying to do internally) it took about 1 to 2 seconds. So, if I had increased the collection to 3000, it would have been at most 20 seconds (compared to 8 hours)!
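
The VB.net from the demo isn't reproduced here, but a sketch of the same kind of check in Python gives the flavor. It is written against assumptions: the part names (docProps/core.xml, word/document.xml, word/comments.xml) and namespaces are the ones used by the Open XML format as later published, and the pre-release files shown at PDC may have differed. A real reader would also locate parts through the package relationships rather than hard-coded names. The function name document_stats is made up.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespaces from the later-published Open XML format (an assumption here).
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
DC = "http://purl.org/dc/elements/1.1/"


def _count(package, part, tag):
    """Count elements named `tag` in one part; a missing part counts as 0."""
    try:
        data = package.read(part)
    except KeyError:  # the part is not in the package
        return 0
    return sum(1 for _ in ET.fromstring(data).iter(tag))


def document_stats(path_or_file):
    """Return author, paragraph count and comment count without running Word."""
    with zipfile.ZipFile(path_or_file) as package:
        author = None
        try:
            core = ET.fromstring(package.read("docProps/core.xml"))
            creator = core.find(f"{{{DC}}}creator")
            if creator is not None:
                author = creator.text
        except KeyError:  # no core properties part
            pass
        return {
            "author": author,
            "paragraphs": _count(package, "word/document.xml", f"{{{W}}}p"),
            "comments": _count(package, "word/comments.xml", f"{{{W}}}comment"),
        }


if __name__ == "__main__":
    import sys

    # Usage: python stats.py first.docx second.docx ...
    for name in sys.argv[1:]:
        print(name, document_stats(name))
```

The point is that the work is just unzipping and parsing XML, which is why it runs in seconds rather than hours.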

We knew a long time ago that customers and the development community would ask what they could do with the new Office XML formats since they are specifically designed to address scenarios that go beyond the desktop. That is why we decided to take an open and royalty-free approach almost two years ago when we launched Office 2003. There has been a lot of back and forth in this blog on whether we went far enough and whether our motives are pure. It is sort of fun to question motives and pick apart licenses (personally, I'd rather be talking about the design of the formats), but I can tell you that our intent is to make the formats useful to customers and the development community. If we wanted to create a bunch of "gotchas" to trip people up, I think we could have done a better job.

A side benefit of this move is that now that we are creating a new format, we can do a lot of the other things our customers have wanted us to do within the binary formats for the past few releases (which we weren't able to do since we didn't want to break compatibility). Improved robustness; file size; and new features are all added side benefits. I already mentioned how Excel is now able to increase the limits on the number of rows and columns as well as other limitations they had when confined to the existing binary formats. We've also found that using ZIP and XML leads to a significantly more robust file. I've given demos where I delete whole blocks of bits from the files and we're still able to recover the remainder of the content. We see so many benefits to this new format, we often forget to mention all the best parts.
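
One plausible reason a ZIP container helps with robustness is that each part is stored and checksummed on its own, so damage to one part need not take the others with it. The sketch below, in Python, shows that idea only; it is not Office's recovery logic, and the function name recover_parts is made up. It also assumes the ZIP directory itself is still readable; if that is damaged, opening the package fails and a different recovery strategy would be needed.

```python
import zipfile
import zlib


def recover_parts(path_or_file):
    """Read every part that still passes its checks; list the ones that don't.

    Returns (recovered, damaged): a dict of part name -> bytes, and a list of
    part names whose data failed decompression or the CRC-32 check.
    """
    recovered, damaged = {}, []
    with zipfile.ZipFile(path_or_file) as package:
        for name in package.namelist():
            try:
                recovered[name] = package.read(name)
            except (zipfile.BadZipFile, zlib.error):
                damaged.append(name)
    return recovered, damaged


if __name__ == "__main__":
    import sys

    # Usage: python recover.py damaged_file.docx
    good, bad = recover_parts(sys.argv[1])
    print("recovered:", sorted(good))
    print("damaged:", bad)
```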

We've been fortunate to get a lot of great support from the public sector for our work. We’ve been working for many years now with governments to understand their needs with XML and they understand what we’ve been doing and our commitment to being open.

Massachusetts is obviously an interesting case and our competitors are having a lot of fun trying to turn this into a bigger story, but from what I've heard, I think some officials at the State were duped. There is no question that this licensing stuff can be really confusing. Just a few months ago, a government official from Massachusetts took a hard look at the Office XML program and publicly stated that his office found it to be "open" and fully consistent with the State's policies. Look here: (https://www.governmentciosummit.ca/GovernmentCIOLeadershipSummit page 23). For the most part, that announcement sort of inspired a yawn around here because our program had already been out for a year and had received a lot of good feedback from other governments. What happened after that? Well, the guy was deluged by lobbyists and influencers who told him that was a bad decision. People told him that the licenses were full of ghosts and scary shadows and bogeymen who would be bad for Massachusetts. It was tough to resist this line of argument because IBM/Lotus and Sun have a big presence in his State. The official himself had also been the CEO of an open source company just before taking office. So, just before he left office (yes, he just took off), he fired off his shotgun with this new policy while running out the door without really thinking through all the implications. Starting to get a picture of what happened? It's actually even a bit uglier than that, but I won't bore you with the details.

Anyway, that's life and we're going to work through the issue. We're already taking an open approach, so we are fundamentally supporting the vision of governments that are interested in open formats. There are also a bunch of smart people in Massachusetts who are trying to do the right thing and we want to work with them in a constructive way. That's our plan.

As usual, I welcome your feedback.

-Brian

Comments

  • Anonymous
    September 29, 2005
    Quote Brian: "There are also a bunch of smart people in Massachusetts who are trying to do the right thing ...".

    It’s simultaneously strange, amazing and utterly incomprehensible that your publications continue to write articles about Microsoft XML that appear to elevate and lend credibility to their intent. It's almost as if I am reading something coming out of the Pyongyang press.
    The process in Massachusetts that resulted in the discussion was done methodically, transparently and with good documentation, indeed also by smart people over two years!

    One issue involved in Massachusetts was the undocumented binary key which I believe binds the presentation aspects of MSXML to the MS Windows-Application.
    In addition your Microsoft scheme is encumbered with patents and vendor centricity. The license language is vague.

    If Truth and Honesty are characterized by simplicity, clear words and clarity, then Microsoft is far from that!
  • Anonymous
    September 29, 2005
    Patrick, I'm not sure what you are referring to about an undocumented binary key. Which key? There is no "binary key".

    The licensing issue is confusing for a number of people, including myself. Someone from Sun left a comment on my last post accusing me of spreading FUD when I asked some questions about the Sun license. I think that if you look at all three formats we've been discussing: MS Office; OpenDocument; and PDF; there are a number of questions.

    There has been FUD spread about the Microsoft Office Open XML formats for the past 3 months. Our license is pretty simple and straightforward, but it is still a legal document and has a lot of "legalese" in it. I haven't read through the PDF stuff in enough detail to get into that. In regards to OpenDocument, based on the comments left by the guy from Sun, it sounds like they don't believe they have IP so they have chosen to not provide a license and instead promise they will provide one if IP turns up. That seems like a weird response to me. If they don't have any IP, why don't they just provide something legally binding to the developer community?

    -Brian
  • Anonymous
    September 29, 2005
    I love this kind of "this is what the world was like then and this is what we were thinking about when we did XYZ." It's like looking over the shoulders of Raymond Chen and Larry Osterman about how things get the way they are and everything that is done not to get in the way of people bringing their work and applications onto a newer system.

    Also, Chris Pratley did a cool thing. When he chimed in on the previous thread, he left a link to a great article he posted 17 months ago before all this hoo-hah about whose XML is opener than thine. Here's the link in case it snuck by anyone: http://blogs.msdn.com/chris_pratley/archive/2004/04/28/122004.aspx
  • Anonymous
    September 29, 2005
    I did a little fact checking on the Fox News article that Dave mentions: http://www.foxnews.com/story/0,2933,170724,00.html

    I don't know what makes it an editorial. But it is by James Prendergast and at the very bottom it says that "Jim Prendergast is executive director of Americans for Technology Leadership."

    On the ATL web site, the "About Us" tab tells you exactly who supports the organization. http://www.techleadership.org/about/ It's an interesting mix, and Microsoft is one of the boosters.

    There is also sponsorship by Citizens Against Government Waste, an organization mentioned in the article too. At their "About Us" page, http://www.cagw.org/site/PageServer?pagename=about_Mission_History, you learn they've been around for a while and that they have origins in the Grace commission that was created precisely for that purpose. Jack Anderson, the columnist (anybody here old enough to remember Drew Pearson?) is the surviving founder of the organization.

    I also checked on the Americans for Competitive Technology, a very large advocacy organization. Microsoft is there too, along with Oracle and a lengthy list of others.

    So whether or not you agree with the positions and assessments by executives of these organizations, I don't think there is anything being hidden.

    I do think you are seeing a legitimate expression of differences in philosophy about government operations and the relationship of that to the marketplace. You can disagree with that but that doesn't mean they don't have that seriously-considered point of view.

    And, in case it matters, I thought it was counterproductive for Microsoft to object so loudly in this instance. If the move by Massachusetts is ill-conceived, events will bear that out. Technical fiats by government (including the declaration to use ASCII in the LBJ administration) tend to be ignored by people who actually have to do the work. Life seems to just go on. That has nothing to do with whether or not the fiat is a good idea, though I think the Massachusetts policy is premature at the very least.
  • Anonymous
    September 29, 2005
    Also, Brian (and orcmid, who seems well versed in these matters), a response to the latest move by Sun in opening up their contribution to OpenDocument would be interesting:

    http://blogs.sun.com/roller/page/webmink/20050930
  • Anonymous
    September 30, 2005
    Brian, it seems you have a hard time understanding why some people seem to be opposed to any file format Microsoft proposes for use as a global document standard (de-facto or otherwise). I'll try to explain my view:

    Like you explained this is the first time we have the possibility of non-binary document formats that could be used over multiple applications and systems. I feel that keeping these formats as free as possible is of utmost importance -- to ensure document availability over long periods of time and over different systems AND to ensure competition in a market that really hasn't seen a lot competition recently.

    With this in mind, which would I rather choose:
    * format A): developed mainly by a company that has a monopoly-like position in the market. The same company has a history of questionable moves regarding standards and interoperability (Kerberos, DR-DOS, SMB, Outlook calendar formats)
    * format B): developed by an international consortium with thousands of members. The main implementations so far have been by open source projects.

    Can you honestly not understand why option B, while not perfect*, just sounds so much better?



    *) I do understand that a standardized format is always slower to get new innovations (we might even have to drop some functionality from current programs), that applications might have to do things in unoptimized ways and that development might be more expensive/slower. These are costs I'm ready to bear.
  • Anonymous
    September 30, 2005
    MS plans to work with Massachusetts?

    Is supporting a "third party, public - minded" organization to go after MA like an attack dog part of "working with?"

    If so, that's a brand new definition.

    Please contact Oxford to make your definition the official one.
  • Anonymous
    September 30, 2005
    Dave, thanks for the link.

    Now, we can leave SunMink to decide whether or not the new declaration by Sun is a license or not and whether the previous declaration was tantamount to a license.

    (I think Brian was on thin ice in asserting that a crafted license statement was required, though it would certainly be nice to have had clarity like that available. In any case, I think that point, which could have been addressed on its simple merits was no grounds for a claim of 100% FUD, an observation that I find shrill, basically ill-mannered, and rather closed-minded.)

    I also think the actual Sun announcement needs to be examined. You'll notice that it still applies solely to implementations of the OpenDocument 1.0 (and successor) formats, and only for those specifications that Sun participates in (and I'm probably overlooking whatever subtlety "OASIS rules" introduce here).

    This is still very close to the Microsoft terms that we've been arguing about here and elsewhere, except Microsoft's defensive qualification does not involve reciprocity. The Sun declaration is also a promise to a future that it might be difficult for Microsoft to make (since they are the sole authors and custodians of the Open Office XML Specifications) in a way that is given as much credibility. (I.e., Microsoft is expected to defect even though Sun can defect more easily -- at less cost in the marketplace -- and either way the parties would have customers and themselves disrupted in unpredictable and negative ways. I actually trust Microsoft more in this regard simply because they have so much more to lose and they are extremely attentive to customer reaction no matter what we say about how they have dealt with OEMs and competitor access to the platform. They live under a microscope and from my view, that is working out.)

    There are other statements made by SunMink about this new Sun statement that I simply can't find evidence for in the actual statement on the OASIS site. Simon might be reflecting Sun's intention, I just can't find it as a direct interpretation of the actual statement.

    Maybe the safety of other people's extensions comes in the fact that the OpenDocument specification has two flavors of schema: strict and not-strict and there appears to be room for lots of private customization (you know, like in Kerberos --oh naughty Microsoft -- or say, Star Office -- if one wants to be an equal-opportunity distruster.)

    You'll also notice that the new Sun statement is a good example of a reciprocity deal concerning patent licenses.

    Now, is this good news and good work? You bet. The original IPR statement was sloppy and this is pretty direct and clear, even with all of those words that lawyers have to put in things.

    I think Sun just did a cool thing and it is great that they'd been working at it all along. I think all of the attention to the license question, and threads like this one in various venues have had an impact on that all the way around. This seems like pretty positive progress to me.
  • Anonymous
    September 30, 2005
    Your move, Monopolysoft:
    http://blogs.sun.com/roller/page/webmink?entry=raising_the_bar_on_patents
  • Anonymous
    September 30, 2005
    Brian,

    You said "Just a few months ago, a government official from Massachusetts took a hard look at the Office XML program and publicly stated that his office found it to be "open" and fully consistent with the State's policies. Look here: (http://www.governmentciosummit.ca/GovernmentCIOLeadershipSummit page 23)."

    I looked at the presentation there and found that the slide on page 23 says:

    Demonstrated sustainability will get people to
    listen:
    Microsoft changes their Office 2003 license:
    Patented XML documents can be opened and read by any reader
    License is now in perpetuity

    Could you please explain how this could possibly mean "a government official from Massachusetts ... publicly stated that [Office XML program] (is) fully consistent with the State's policies"?

    Thanks.
  • Anonymous
    September 30, 2005
    I'm looking forward to the day when all Word versions will generate XML by default. I work for a company that creates medical transcription systems. The hoops that we jump through to mine information from Word documents are mind-numbing. Plus, there's a significant overhead in just driving Word to get the information we need.

    It's great that Microsoft is being open-minded in sharing its XML schemas with the world. Considering the current Word document format is totally closed, you're definitely heading in a direction where outside developers can add value.

    I think the HTML formatting decisions of the past were unfortunate. There are two ways of looking at documents: 1) format centric, or 2) content centric. There have been several occasions where I would have loved to use parts of documents in some web pages. After looking at the HTML that was generated, I had to do my own cleaning. In those cases, I wanted just the content and not the format.

    In our company, we are defining our own document schema and transforming the Word XML docs along with additional system data. It seems to me that people who want a specific XML schema format really should be working on XML transforms and not Microsoft product transforms.
  • Anonymous
    September 30, 2005
    Groklaw on the Massachusetts decision:

    http://www.groklaw.net/article.php?story=20050929134232923
  • Anonymous
    September 30, 2005
    Brian, you are still ducking the issue. Why did Microsoft decide to make the Office XML license a good deal less open than Massachusetts wanted, a good deal less open than the OpenDoc license? What were Microsoft's motivations?

    Maybe you don't know. You are a techy, you don't make these sorts of decisions. But at least address the issue.
  • Anonymous
    September 30, 2005
    ...just who do you think you're fooling? Your name is on the patent, and you don't know how it's licensed? Do you know how the royalties will be distributed?

    You say you get confused about the license and you haven't thoroughly read it, but you are still going to assure me that it's open? Perhaps I can simplify some effects of the license: Suppose I want to write a program that can read a document in the MSXML format without being bound to royalty payments. I can.
    Suppose I want to write a program that can write to the MSXML format without paying Microsoft royalties. I can't.

    As an inventor listed on the patent, you have millions of reasons to dance around the license issues. So dance all you want, you aren't fooling anyone.

  • Anonymous
    September 30, 2005
    Gee, it's nice that Brian left all this chalk lying around while moving on to the kind of topic he really wants to address ...

    1. Pablo, I think you're making things up. It would be quite unusual for an inventor of a patent owned by a large corporate employer to have anything to do with the management of the licensing of that invention. And likewise having anything to do with externally paid royalties. I don't mean that Microsoft employees don't receive awards for successful patents of their inventions, I would assume that they do. But that's a different deal.

    2. Now about reading the Microsoft Open Office XML formats and not being able to write them. Where do you see that? I'm looking at http://www.microsoft.com/mscorp/ip/format/xmlpatentlicense.asp

    2.1 It says, in the first paragraph, "The purpose of this document is to provide a patent license to individuals and organizations interested in implementing software programs that can read and write files that conform to such specifications."

    2.2 A little farther down it says "Except as provided below, Microsoft hereby grants you a royalty-free license under Microsoft's Necessary Claims to make, use, sell, offer to sell, import, and otherwise distribute Licensed Implementations solely for the purpose of reading and writing files that comply with the Microsoft specifications for the Office Schemas."

    2.3 Where does it say no writing?
  • Anonymous
    October 01, 2005
    The comment has been removed
  • Anonymous
    October 01, 2005
    David Berlind at ZDNet has provided a nice column on the new license and tied it to the discussions over here: http://blogs.zdnet.com/BTL/?p=1949

    From my perspective, it is a pretty balanced piece. In particular, David seems careful to avoid suggesting that any confusion in wrestling with and discussing this topic is malicious. I don't think it is; I think it is all of us working to build an understanding of the situation and the unfamiliar territory of licenses and intellectual-property legalities.

    I recommend the piece.
  • Anonymous
    October 01, 2005
    The comment has been removed
  • Anonymous
    October 01, 2005
    Actually, I do have one last thing to add. Orcmid, I think you missed the point on sublicensing. Naturally, nobody would ever give the rights to sublicense patent rights under terms of the licensee's choice. The sublicense rights that have been discussed here are simply the right to grant to another the same licence that was granted to you. For a license that's perpetual anyway, and should presumably be available forever, this seems pretty trivial. It would not even really be necessary if the license simply said it'd be available forever under the same terms.

    It all seems such a minor sticking point (though one that it wouldn't hurt to have resolved, as I've tried to explain earlier).

    I also think you're mistaken about the GPL not getting in the way of the use of this patent license. The sticking point is really section 7. It could be argued that you're not obliged to be able to guarantee that the required patent license always be available - you just can't distribute the program anymore if it becomes unavailable and specific claims are asserted, even though you yourself still have a valid patent license. The rights display clause looks like more of a problem, in that it imposes additional requirements you're not able to impose in turn on your licensees.

    It seems an even sillier sticking point to be held up on, more's the pity. I didn't even choose the license for the software I'm particularly interested in supporting these formats with.

    Oh well. I hope the Office team keeps up with the interesting and really useful-looking work, and I'll continue to hope that these things - that I see as issues - might get sorted out in the future.
  • Anonymous
    October 01, 2005
    Thanks Craig, that clears up a lot.

    I agree that the patent license can be simplified in ways that would make it all that much easier. I think Sun's approach might be a good stimulant to that.

    I know for me as a developer, it makes my life less worrisome, and it gives me a simpler way to share my work with others. (I think I would still include notice, because it reminds people about the limitation to particular use that still lives in these licenses.)

    I don't think the GPL works the way you see it, but it's hardly a showstopper. Neither the OASIS Open Document specs nor the Microsoft Open Office XML Reference Schemas permit derivative works of the specs or the schemas to be created and distributed (although there is a limited exception for writing guides and tutorials about Open Document), so that's sticky regardless of whether GPL section 7 applies here.

    That really doesn't limit the writing of software that uses the format (and it prevents forking the format, which I imagine is part of the reason for the limitations), and I expect there'll be enough for people seriously interested in doing so to use either and both formats.

    But those are details. I agree the direction is positive and there is more experience and practice to have over time.
  • Anonymous
    October 01, 2005
    Man, I looked at that stuff way too quickly.

    1. On the new Sun Patent Statement for the Open Document Format. They took out reciprocity too. Now there is a standard defensive clause, basically identical to the one in the Microsoft Royalty-Free license.

    2. And, as long as I'm here, I need to clear up something else with Craig (I would e-mail Craig, but I'm not sure how to do it). I didn't mean to suggest that Craig wanted Office to use ODF as its default. I was going on about claims I encounter that it should be easy to support ODF and, as some imply, have roundtrip fidelity. There are people who claim Microsoft should simply go native with ODF, and I find that clueless. I am not aware of Craig suggesting anything like that.

    I wasn't thinking of Craig in my comments about any of that.
  • Anonymous
    October 01, 2005
    The comment has been removed
  • Anonymous
    October 01, 2005
    Thanks for your comment orcmid. You're right in that I've never thought MS should natively adopt OpenDocument - I don't see what's in it for them, and I'm not convinced it's mature at this point anyway.

    When it comes to the GPL, I never meant to suggest that the patent license should somehow be convertible to the GPL. My concern is solely with software that wishes to implement the formats. There, I don't think you're right that it's a non-issue - I suspect the notice clause will cause a problem, namely additional restrictions on distribution. I'd personally want to add it anyway - credit where credit's due - but it'd be nice if the license said "should" not "must", or even required the inclusion unless prevented by other licensing requirements. I do think that as expressed above I was mistaken in my earlier concern about sublicensing re GPL compatibility. I still think it's well worth resolving, but I don't think it presents a barrier to implementations under the GPL as I initially thought it did.

    I certainly understand the strong desire to retain control of the format. I'd be worried if it was any other way. PDF and OpenDocument are the same, in that the rights to implement the formats and use any associated patents are open to all comers, but the copyright on the specification itself is much more restrictive. I would be very uncomfortable if I needed to work with a spec where anybody could modify it and release it as an "official" version - "PDF 1.99, Joe Average's release". No, thanks. Trademark protection on the specification or format name would help, but even so I just don't see the point in letting others make their own versions of the specification itself.

    As for mail, if anyone wants to mail me I can be reached at craig@postnewspapers.com.au . The address is, alas, already all over the Internet, so there's hardly any point in hiding it.
  • Anonymous
    October 01, 2005
    Thanks Eduardo, I read the Groklaw material that you recommended.

    I have already presented my analysis about the "no writing" assumption in an earlier response. I believe the lawyer took the passage completely out of context, and even then it doesn't say no writing. In the context of the overall license, I take that supplementary clarification as identifying a "more-reading" case that arises in conjunction with government operations and public records.

    I thought the lawyer's advice about how this works when one is working on licenses between parties was pretty great. When someone provides a nonexclusive blanket license in advance, things don't quite go so nicely.

    I am also not making an argument that one organization is more trustworthy or equally trustworthy to another. I said that there is nothing that Microsoft can do to satisfy you unless there is a basis in trust to satisfy you or anyone else who starts from a position of distrust. I'm also not asking you to change your point of view.

    Finally, comparing Microsoft with a so-called standards organization doesn't do much. They don't need to be trusted for the same things. OASIS doesn't deliver, support, and maintain running code.

    If OASIS did a verification/conformance suite on the format, that would be cool though. We are going to miss there being one of those. Real soon now.
  • Anonymous
    October 01, 2005
    The conversation with Eduardo got me thinking about distrust and how it is a non-starter. Here's a thought-experiment (you can try it on yourself for real, based on your level of distrust of Microsoft or of X).

    Suppose you distrust X.

    Now, state a condition of satisfaction, S, that if X were to promise to satisfy it by some certain point in the future, say T of 12 months, you would be willing to conditionally trust X to perform that and withhold judgment until T arrived. And you have to hold yourself to it. That is, you have to be trustworthy not to change the rules.

    Depending on the outcome, you might enter into a new condition, one involving more or less trust depending on what happened.

    Life will happen and you may have to adjust this in reality as events transpire, but let's assume starting out that both you and X will do your best in that regard. That's part of the deal.

    OK, what is the least, very specific and limited S that you would be willing to commit to being satisfied with, and to conditionally trust X to perform if they accepted your conditions?

    I think this is an instructive exercise. I think that the higher the barrier to X's performance S is, the greater is the distrust you harbor, and your denial of X an opportunity to build trust with you.

    That's my theory. You can write to me about it.
    - dennis.hamilton@acm.org
  • Anonymous
    October 01, 2005
    The comment has been removed
  • Anonymous
    October 04, 2005
    The comment has been removed
  • Anonymous
    October 04, 2005
    To BrianJones:
    I hope you don't think I was being rude. I did not mean to be. But, I certainly am skeptical. If someone says something incredible, then I don't see why they should be surprised at skepticism. I was not getting Word documents in the early 90's by email. But, I and an awful lot of very untechnically oriented people were using BBS's and exchanging documents. That went back to the early 80s for me.

    Thank you for having this dialog. I hope both sides get some sense knocked into them.

    To orcmid:
    Trust is like that. I don't see any special treatment for or against Microsoft here. When Microsoft has lied so often in the past, it should not be surprising that people doubt the veracity of their statements.

    I see a license that is more difficult to read than it should be. It hides terms in places I can't read, and then it spells out that a program to read would clearly be licensed. What about a program to write? If they really wanted to be open, it could just as easily have said read or write. But they did not, and the doubt continues.

    Having read a number of Microsoft licenses (I read EVERY license before I agree to it), I know Microsoft has a history of legalese that seems intended to keep unreasonable rights for themselves. Only their monopoly position has allowed those to stand. If the markets were competitive, an alternate provider could take away substantial market share, just based on licenses. So, I see a Microsoft license that appears to grant me the rights to write a program to read and write their format. But, if I want to distribute my program there is no certitude that the people who get my program will be licensed to use it. And if my program does something Microsoft does not like, I expect them to claim that my program does not properly implement the format, and to coerce me to withdraw it. I envision some specific utilities that users would appreciate, but Microsoft probably would not. I can't even readily read the schemas, because they are not published in a way that any of my operating systems can read. I have 2 Windows 2000 machines, but I can't take them past SP1 due to the crippling effects of later fixes. Adobe publishes the PDF spec quite openly and does not require that I use an Adobe product to read it. OASIS publishes their spec openly. I think I see a pattern here. So, why, based on what I do see, would I believe that the formats are open?
    If Microsoft is being honest here, as they may be, then it will help regain trust. But, just as it took time to lose that trust, it will take time to regain it.
    At times, I wish I had not switched from Microsoft OSes and office products. But Microsoft forced me to leave because of their obnoxious terms and behaviour (it was never the money, for me). Since I can't in good conscience install any current Microsoft products, I use Linux. And I run into more and more people who switch for the same reasons. If you really want to stop driving customers away, then do this right.
  • Anonymous
    October 05, 2005
    Hey Ralph, you should be able to get a ZIP version of the Office "12" schema previews here: http://www.microsoft.com/downloads/info.aspx?na=46&p=2&SrcDisplayLang=en&SrcCategoryId=&SrcFamilyId=15805380-f2c0-4b80-9ad1-2cb0c300aef9&u=http%3a%2f%2fdownload.microsoft.com%2fdownload%2fb%2f5%2fb%2fb5b64679-4d6b-43ec-ba50-5891ca11cf15%2fOffice12XMLSchemaReference.zip

    Hopefully that will work for you.

    I also wanted to apologize to folks for getting a bit aggressive towards the end of this post. I'm really proud of the work we've done over the years to open up the Office formats, and I got a bit frustrated with all the politics. Happens to everyone I guess. :-)

    -Brian
  • Anonymous
    October 05, 2005
    Brian, now really after three pages of people pointing you to the places you clearly disseminate FUD in your own blog, when will you stop and reconsider?
    Will you answer the real questions posed to you in a complete and clear manner? It is relevant for your business and customers because OOo and the OpenDocument Standard are direct competitors to your product. You should stop being in denial about OOo et al. and at least pitch against it clearly.
    If you won't answer clearly and completely, but instead keep FUDing, you are either incompetent or malevolent toward the audience here.
  • Anonymous
    October 06, 2005
    Thanks for the comments "occasional developer and power user", and I hope you don't take offense to this, but it really does disappoint me that some folks here are using the term FUD every time I raise an issue that they don't agree with. I'm not saying that's what you're doing necessarily, it's just that the term FUD is used to such a ridiculous level at this point it's hard to really give it any weight. Is that a common practice now that some folks just dismiss questions they don't like as being FUD? Every question I've asked has been based on a lack of clear information. People have been upset about our licenses, but when I went to find the licensing information behind PDF and OpenDocument there was definitely a lack of clarity (to say the least). Since I asked those questions the OpenDocument folks provided some updated documentation that appears to be much more clear, so that's good. Just because I was trying to better understand the situation doesn't mean I was spreading FUD.

    I've already been really clear on why we can't use OpenDocument for our default format. I think most people out there understand that. Our customers would not stand for it. We need to fully support all their existing documents. Remember also, that when I say customers, I'm not necessarily referring to all of you reading my blog. We have a huge customer base (400 million users), and I would say less than half a percent of those folks probably know anything about OpenDocument. If their documents started behaving oddly because we'd moved to a format that was still undergoing changes and did not fully represent everything in their documents, that would just be unacceptable.

    Some folks are wondering though why we don't provide it as an optional format. That really just comes back to this issue of there not being a lot of customer demand. I know that will upset a lot of folks reading my blog but you need to understand the difference between newsgroup and blog demand, and actual customer demand. We get asked to do new features all the time. Every major customer has collections of features they come to us with, and we can never do everything. We need to weigh all the requests and decide what we can do that meets the broadest set of requests. How many documents are out there right now in the OpenDocument format? Aside from the Commonwealth of Mass. and a handful of blogs I haven't heard from anyone who thinks it's important for us to support OpenDocument. All of the scenarios folks have around XML are more than met with the Office XML formats in combination with customer defined schema support.

    Some people seem to be under the impression that we're somehow doing work to not support OpenDocument. Anyone who works on software would know that it's a lot of work to build in support for a format, and even more work to maintain that. We have this with the WordPerfect converters for instance. There were a lot of WordPerfect documents out there when we built that feature, that's why we built it. It was something that we had to do in order to get people to buy our product (just like OpenOffice has to support the Office binary formats today). If there hadn't been that demand though, we wouldn't have built it. If at some point there is a lot of customer demand and there are a lot of documents out there in the OpenDocument format, then I wouldn't be surprised if we built support for it.

    That gets me to my last point which I've raised a number of times. Office is a very extensible tool. Anyone can come along and build an add-in to support OpenDocument. If there is enough demand, that will happen.

    -Brian
  • Anonymous
    October 09, 2005
    Brian, if I write this, it's because I like that you keep coming back.
    Now, it is understandable that you say your company is doing what its customers demand most. (I wonder if they asked that often for IE in 1995, or for the media player later :-)
    But you see, the question is not whether 99% of your customer base needs OpenDocument; it is whether Microsoft is willing to show the public a document format that can be used freely by everybody (any software vendor too), and thus be acceptable as an educated choice for the public authorities.
    This is how it works - you design a patent encumbered format with an unknown licensing future, and it gets denied by Mass. because it is not what is required for the public good. Instead, OpenDocument is being chosen.

    Your concerns about OpenDocument were clarified in a few days' time. And this is how information sharing works.
    On the other hand, FUD is what you say to explain why it is not so.

    You see what you are missing now in your picture?
  • Anonymous
    March 27, 2006
    Links to blog posts that contain useful technical information for developers.  Open XML is a new standard, but there's some good information already available if you know where to look.
  • Anonymous
    April 25, 2006
    How do I get Publisher, when I already have it on my PC?

  • Anonymous
    April 25, 2006
    Help with getting MS Office 11 on my PC.
  • Anonymous
    June 05, 2006
    As we move forward with the standardization of the Office Open XML formats, it's interesting to look...