Getting more information from the Word error box when troubleshooting OpenXML / WordML issues

So, many apologies for dropping off the face of the blogosphere lately.  Fortunately
(or unfortunately, depending on your perspective), I’ve been really busy at work. 
I’ve been working on some really cool things that I hope I’ll be able to talk about
publicly soon.  For now, though, I wanted to pass on something that I haven’t
seen documented in other places that actually helped me quite a bit lately. 

So, for those of you that generate Word documents via the OpenXML SDK (or any of
a variety of other methods), you may have come across something like this when you opened
up a document you just generated:

image

There are some problems with this message, but the big one is that it just says “Line:
1, Column: 0”.  Not exactly a map to the error.  As a result, you may have
stared at this message for a long time and wondered, “How the heck do I fix this? 
What is the real problem?”  Well, let me show you a really quick and easy way
of getting more information than what is initially provided.

Step 1:  Change the extension from docx to zip

As you may or may not know, all OpenXML documents (that is, the default Office document
formats since Office 2007) are actually zip files at their core. That means you can just
crack them open and peer inside.

image

See the difference?  Easy!
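
If you would rather peek inside without renaming anything, here is a minimal sketch in Python (my choice for illustration, the steps in this post are all done by hand). It assumes nothing beyond the standard library, and the function name list_parts is made up for this example:

```python
import zipfile


def list_parts(docx_path):
    """Return the names of the parts stored inside an OpenXML package."""
    # zipfile inspects the file contents, not the extension, so a .docx
    # opens directly without being renamed to .zip first.
    with zipfile.ZipFile(docx_path) as package:
        return package.namelist()
```

Calling list_parts on a generated document should return entries such as “word/document.xml”, which is the same path format the error box uses.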

Step 2:  Extract the zip file to a folder

Once again, pretty straightforward.  Once you extract the zip file above, you
should see a structure like the following:

image

Now, from here – you’ll be able to locate the file referenced in that cryptic error
message above. 
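
The extraction can also be scripted. This is only a sketch, again in Python with the standard library; extract_docx is a hypothetical helper name, and returning the path to “word/document.xml” is just a convenience because that is the file named in the example error:

```python
import zipfile
from pathlib import Path


def extract_docx(docx_path, out_dir):
    """Unpack every part of the package into out_dir.

    Returns the expected location of word/document.xml inside out_dir.
    """
    out_dir = Path(out_dir)
    with zipfile.ZipFile(docx_path) as package:
        package.extractall(out_dir)
    return out_dir / "word" / "document.xml"
```

If your error message names a different part, build the path from that name instead.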

Step 3:  Find the file that’s causing the problem

In the example above, the message states that the problem lies with the file “/word/document.xml”,
so just navigate to the “word” folder and find “document.xml”.

image

Step 4:  Open and format the file in Visual Studio

One of the great features of Visual Studio is that it can format an XML file for you. 
So, in our case, the document.xml file is natively just one big line:

image

Incidentally, this is why the error message always states “Line: 1,…”. 
As far as Word is concerned, the problem IS on the first line.  Fortunately for
us, though, Visual Studio can take that single-line file and format it for us.  Just use
the Edit > Advanced > Format Document option in Visual Studio:

image

That will then format the XML and make it look closer to:

image
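
If you don’t have Visual Studio handy, a rough equivalent of the formatting step can be sketched in Python. This is an assumption on my part rather than what Visual Studio does internally: it uses xml.dom.minidom from the standard library, and format_xml_file is a made-up name. It only works when the XML is well-formed; if it isn’t, the parser raises an error of its own.

```python
from xml.dom import minidom


def format_xml_file(xml_path):
    """Rewrite an XML file in place so that elements sit on separate lines."""
    # minidom keeps the namespace prefixes (w:, mc:, ...) exactly as written.
    document = minidom.parse(str(xml_path))
    # With an encoding given, toprettyxml returns bytes, including the XML declaration.
    pretty = document.toprettyxml(indent="  ", encoding="UTF-8")
    with open(xml_path, "wb") as handle:
        handle.write(pretty)
```

Elements that contain only text stay on a single line, so the text runs in the document are not padded with extra whitespace. Keep a copy of the original file in case the rewritten declaration or layout turns out to matter for your document.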

Step 5:  Recreate the Word doc and get the additional information

Now that you have the file formatted appropriately, you can just re-create the Word
document and re-open it.  For this, just go back to the root of the document,
select all the files/folders and then zip it back up:

image
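
For anyone scripting this step, here is a hedged sketch of the re-zip in Python. The name repack_docx is hypothetical, and it writes straight to whatever file name you give it, so you can pass a name ending in .docx. The one thing that matters is that the files are added relative to the root of the extracted folder, not with the folder itself as a parent:

```python
import zipfile
from pathlib import Path


def repack_docx(source_dir, docx_path):
    """Zip the contents of source_dir (not the folder itself) into docx_path."""
    source_dir = Path(source_dir)
    with zipfile.ZipFile(docx_path, "w", zipfile.ZIP_DEFLATED) as package:
        for path in sorted(source_dir.rglob("*")):
            if path.is_file():
                # Part names are relative to the package root and use forward slashes.
                package.write(path, path.relative_to(source_dir).as_posix())
```

Write the output somewhere outside source_dir, otherwise the new archive may end up trying to include itself.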

Once it’s zipped back up again, just change the extension from zip to docx and re-open
the file.  When you do so, you’ll see the following:

image

Note that the message now says “Line: 5667, Column: 0”,
which points to the exact line causing the problem.  That lets you
go back to the “document.xml” file you already have open in Visual Studio and jump
straight to that line.  In our case:

image
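
If you want to pull up the offending line without scrolling, a small sketch like this prints the neighborhood around a given line number. It is plain Python with a made-up function name (lines_around), and it assumes the formatted file is UTF-8:

```python
def lines_around(xml_path, line_number, radius=3):
    """Return (number, text) pairs for the lines surrounding a 1-based line number."""
    with open(xml_path, encoding="utf-8") as handle:
        lines = [line.rstrip("\n") for line in handle]
    # Clamp the window so it never runs past either end of the file.
    first = max(1, line_number - radius)
    last = min(len(lines), line_number + radius)
    return [(number, lines[number - 1]) for number in range(first, last + 1)]
```

For the example above you would call lines_around with the path to the formatted “document.xml” and 5667 as the line number.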

Note that this won’t magically fix your problem.  You’ll still need to examine
the WordML to figure out the problem – but at least you know where to go.  And
knowing is half the battle! 

That’s all for now and I will be back with some more developer stuff soon. 

Until next time!