Frequently Asked Questions

Andrew Sayers had a great suggestion: set up a page that gives an overview of the blog and answers the questions folks most frequently ask about Open XML. Here is my first attempt, and I'll add new material as things come up. I haven't decided yet whether to put much content directly on this page or to write blog posts and just link to them from here.

Who am I?

I'm Brian Jones, a program manager in Office. I've been working on the XML functionality and file formats in Office for about 6 years now.

What topics are discussed here?

I mainly focus on XML in Office and the Open XML File Formats coming in the 2007 Microsoft Office system.

What topics are not discussed here?

Not sure yet.

Why the name "Office Open XML"?

When we first announced the new formats, we needed a name that encompassed the wordprocessing, spreadsheet, and presentation formats. We had initially called the Word format "WordprocessingML" and the spreadsheet one "SpreadsheetML". When it came down to creating a name for all the formats, we thought the simplest would be to just call them the "Open XML formats." Openness was one of the big differences from the old binary formats, so we thought it would be a good name. Then came the branding side of things, which needs "Microsoft Office" in front of everything, so the final version was "Microsoft Office Open XML formats."

After we decided to hand the formats over to Ecma, we had to remove the word "Microsoft" from the name since the formats no longer belonged to us. This is much the same as the way OpenDocument Format used to be called "OpenOffice XML Format" before it was decided that it wasn't going to be tied to a single product (I think it was even called "StarOffice XML format" at one point).

Why isn't Office Open XML "proper" XML?

This is actually not true, although I have seen people make this mistake before. It is fully valid XML; it just doesn't take the same architectural approach you've seen in the traditional DocBook model of document formats. Since folks do ask this question, though, here are a few themes I think they are getting at when they ask it:

Why doesn't it follow XML's deeper design model?

Some people who've played with other formats like HTML or DocBook are curious why WordprocessingML doesn't use the same model as either of those formats, and there's actually a pretty straightforward reason.
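As a rough illustration of both points (WordprocessingML is ordinary well-formed XML, and it describes a document as paragraphs made of formatted runs rather than as nested semantic sections), here is a minimal Python sketch. The fragment is a hand-written example rather than output from Word, and `describe_runs` is a hypothetical helper name I made up for this page.

```python
import xml.etree.ElementTree as ET

# The WordprocessingML main namespace.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W_NS}

# Hand-written fragment (an assumption for illustration, not real Word output):
# one paragraph (w:p) holding two runs (w:r), the second one marked bold (w:b).
FRAGMENT = f"""<w:document xmlns:w="{W_NS}">
  <w:body>
    <w:p>
      <w:r><w:t xml:space="preserve">Hello, </w:t></w:r>
      <w:r><w:rPr><w:b/></w:rPr><w:t>world</w:t></w:r>
    </w:p>
  </w:body>
</w:document>"""


def describe_runs(xml_text):
    """Return (text, is_bold) for every run, in document order."""
    # ET.fromstring raises ParseError if the input is not well-formed XML.
    root = ET.fromstring(xml_text)
    runs = []
    for paragraph in root.iterfind(".//w:p", NS):
        for run in paragraph.iterfind("w:r", NS):
            text = "".join(t.text or "" for t in run.iterfind("w:t", NS))
            is_bold = run.find("w:rPr/w:b", NS) is not None
            runs.append((text, is_bold))
    return runs


print(describe_runs(FRAGMENT))
```

Any standard XML parser accepts the fragment as is. The formatting lives in run properties instead of in the nesting of the elements, which is the main way the model differs from HTML or DocBook.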

Why does it include non-XML metadata?

I'll need to fill this in. Here's what Andrew had proposed I write, though: "A few weeks ago, we talked about writing a post that outlined the little non-XML formats that go into documents. Personally, I found the comparison with a web page to be a fairly convincing justification, but I'd like to see a post explaining the whole issue with a complete list of mini-formats if you're going to mention it in the FAQ."

Why does Office Open XML look like previous Office formats, wrapped in angle brackets?

It works best this way. Office Open XML needs to be as compatible as possible with older versions of Microsoft Office's file formats, and the best way to do that is to use a format with a similar design model. Since there are several million Office documents for every developer that's ever worked on Microsoft Office, the only practical way of ensuring compatibility with that huge corpus is by making cautious, incremental changes. The move we've made is switching from a binary representation to an XML representation, and developers I've talked with say it has made dealing with the file formats much, much easier than before.
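To give a sense of what "easier" means in practice: an Open XML file is a ZIP package of XML parts, so general-purpose ZIP and XML libraries are enough to read one. Here is a minimal Python sketch. It builds a tiny stand-in package in memory, which is not a complete .docx (it leaves out `[Content_Types].xml` and the relationship parts), and it assumes the main part lives at `word/document.xml`, which is where Word conventionally writes it. A robust reader would follow the package relationships instead. `build_sample_package` and `extract_text` are hypothetical helper names.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W_NS}

# Assumption: the conventional location of the main document part.
MAIN_PART = "word/document.xml"

# Hand-written sample content, not real Word output.
DOCUMENT_XML = (
    f'<w:document xmlns:w="{W_NS}"><w:body>'
    "<w:p><w:r><w:t>First paragraph</w:t></w:r></w:p>"
    "<w:p><w:r><w:t>Second </w:t></w:r><w:r><w:t>paragraph</w:t></w:r></w:p>"
    "</w:body></w:document>"
)


def build_sample_package():
    """Build a stripped-down stand-in for a .docx package, in memory."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
        package.writestr(MAIN_PART, DOCUMENT_XML)
    return buffer.getvalue()


def extract_text(package_bytes):
    """Return the plain text of each paragraph in the main document part."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        root = ET.fromstring(package.read(MAIN_PART))
    return [
        "".join(t.text or "" for t in paragraph.iterfind(".//w:t", NS))
        for paragraph in root.iterfind(".//w:p", NS)
    ]


print(extract_text(build_sample_package()))
```

Nothing here is specific to Office. With the old binary formats, the same task needed code that understood the binary layout.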

Why create a standardized file format?

Note from Andrew: IMHO, there are a whole range of questions that belong in this heading, and they all actually mean "please give me a model I can use for explaining Microsoft's past behaviour and predicting its future behaviour, without resorting to elaborate conspiracy theories". IMHO, this is the most important question to answer in the entire FAQ

Why not use ODF?

This is a topic I've been coming back to since I started this blog back in 2005. I've talked about how the two formats differ, how we needed better support for formulas and for tables in presentations, design requirements that OpenDocument doesn't share (twice), and the political considerations behind why Microsoft couldn't take a seat at OASIS. The ODF folks have done some great work since I wrote those posts, and it looks like OASIS ODF 1.2 will solve a number of these technical problems. We do believe both formats will continue to exist, though, which is why we are investing in projects that enable translation between the two.

Why go to Ecma?

I covered this way back in my original announcement that we were taking the format to Ecma.

Why go to ISO?

Even though Office Open XML was already an Ecma standard, we felt that there was value in taking it on to ISO. This was mainly because various customers (mostly governments) had asked us to. They wanted our formats to be in the domain of the international community.

What is your answer to GrokDoc's Objections to standardisation?

I'll need to create a separate page for this, as it's pretty detailed.

What can you tell me that will help me write better Office Open XML applications?

TBD

What are the issues around licensing?

TBD

What guides are available?

TBD