The WordprocessingML Class: A refinement of the approach of using LINQ to XML to access Open XML
(July 10, 2008 - I've written a new blog post on a better way to accomplish this.)
This post presents a refinement of the OpenXmlDocument class: a new class, WordprocessingML, that derives from OpenXmlDocument. The WordprocessingML class adds functionality that is specific to WordprocessingML documents, including:
· Some constant strings that contain the DocumentRelationshipType, the StylesRelationshipType, and the CommentsRelationshipType.
· An XNamespace object that contains the main XML namespace for WordprocessingML documents.
· Initialized properties that find the main DocumentRelationship object, the StylesRelationship object, and the CommentsRelationship object. The Relationship class is declared in the code found in the link below, and represents a node in the object graph that contains an entire Open XML document.
· A DefaultStyle method that queries for the default style of the document.
· A Paragraphs method that enumerates all paragraphs in the document. The Paragraphs method returns IEnumerable<Paragraph>. The Paragraph class is a tuple class that contains: the XElement node of the paragraph (for further querying if necessary), the style of the paragraph, the text of the paragraph, and a collection of comments for the paragraph. The comments are held in a collection because a paragraph can have more than one comment.
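To make the last two bullets more concrete, here is a minimal sketch of how a default-style lookup and a paragraph enumeration could be written with LINQ to XML. It is not the code from the linked listing: the ParagraphInfo and ParagraphSketch names are hypothetical, the parts are parsed from inline strings instead of being read from a package, and comments are left out. The namespace URI and the w:style, w:pPr, w:pStyle, w:r, and w:t names are the standard WordprocessingML ones.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical stand-in for the post's Paragraph tuple class (comments omitted).
public class ParagraphInfo
{
    public XElement ParagraphElement { get; set; }
    public string StyleName { get; set; }
    public string Text { get; set; }
}

public static class ParagraphSketch
{
    const string WNs = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Main WordprocessingML namespace.
    public static readonly XNamespace W = WNs;

    // Returns the styleId of the default paragraph style, or null if none is marked.
    // Assumption: the default flag is written as "1" (it may also appear as "true").
    public static string DefaultStyle(XDocument stylesPart)
    {
        return (string)stylesPart.Root
            .Elements(W + "style")
            .Where(s => (string)s.Attribute(W + "type") == "paragraph"
                     && (string)s.Attribute(W + "default") == "1")
            .Select(s => s.Attribute(W + "styleId"))
            .FirstOrDefault();
    }

    // Projects each w:p element into a ParagraphInfo. A paragraph with no
    // explicit w:pStyle falls back to the supplied default style.
    public static IEnumerable<ParagraphInfo> Paragraphs(XDocument mainPart, string defaultStyle)
    {
        return mainPart.Descendants(W + "p").Select(p => new ParagraphInfo
        {
            ParagraphElement = p,
            StyleName = (string)p.Elements(W + "pPr")
                                 .Elements(W + "pStyle")
                                 .Attributes(W + "val")
                                 .FirstOrDefault() ?? defaultStyle,
            Text = string.Concat(p.Elements(W + "r")
                                  .Elements(W + "t")
                                  .Select(t => (string)t))
        });
    }

    public static void Main()
    {
        // Tiny inline stand-ins for the styles part and the main document part.
        XDocument styles = XDocument.Parse(
            "<w:styles xmlns:w='" + WNs + "'>" +
            "<w:style w:type='paragraph' w:default='1' w:styleId='Normal'/>" +
            "<w:style w:type='paragraph' w:styleId='Heading1'/>" +
            "</w:styles>");
        XDocument main = XDocument.Parse(
            "<w:document xmlns:w='" + WNs + "'><w:body>" +
            "<w:p><w:r><w:t>This is a test.</w:t></w:r></w:p>" +
            "<w:p><w:pPr><w:pStyle w:val='Heading1'/></w:pPr>" +
            "<w:r><w:t>This is only a test.</w:t></w:r></w:p>" +
            "</w:body></w:document>");

        string defaultStyle = DefaultStyle(styles);
        foreach (ParagraphInfo p in Paragraphs(main, defaultStyle))
            Console.WriteLine("Style: {0} Text: >{1}<", p.StyleName.PadRight(16), p.Text);
    }
}
```

Keeping the XElement on each result is what allows further querying of a paragraph later without walking the document again.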
You can see the complete listing here: The WordprocessingML Class
Following is a simple example that shows the use of the WordprocessingML class:
string filename = "Test.docx";
using (WordprocessingML doc = new WordprocessingML(filename))
{
    foreach (var p in doc.Paragraphs())
    {
        Console.WriteLine("Style: {0} Text: >{1}<",
            p.StyleName.PadRight(16), p.Text);
        if (p.Comments != null)
            foreach (var c in p.Comments)
            {
                Console.WriteLine(" Comment:");
                Console.WriteLine(" Id: {0}", c.Id);
                Console.WriteLine(" Author: {0}", c.Author);
                Console.WriteLine(" Text: >{0}<", c.Text);
            }
    }
}
When run on a small document, the code produces the following output:
Style: Normal           Text: >This is a test.<
 Comment:
 Id: 0
 Author: Eric White
 Text: >Hello world
test
comment<
Style: Heading1         Text: >This is only a test.<
Style: Normal           Text: >This is another paragraph.<
 Comment:
 Id: 1
 Author: Eric White
 Text: >another<
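The comment entries in the output come from matching comment references inside a paragraph against the comments part. The sketch below shows one way that matching could be done with LINQ to XML. It is an illustration under stated assumptions, not the code from the linked listing: CommentInfo and CommentSketch are hypothetical names, the XML is inlined, and the paragraphs of a multi-paragraph comment are joined with newlines, which is a guess at how the three-line comment text above was produced. The w:commentReference, w:comment, w:id, and w:author names are the standard WordprocessingML ones.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical stand-in for the comment objects exposed by the post's Paragraph class.
public class CommentInfo
{
    public int Id { get; set; }
    public string Author { get; set; }
    public string Text { get; set; }
}

public static class CommentSketch
{
    const string WNs = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";
    public static readonly XNamespace W = WNs;

    // Returns every comment referenced from the given paragraph. A list is
    // returned because one paragraph can reference several comments.
    public static List<CommentInfo> CommentsFor(XElement paragraph, XDocument commentsPart)
    {
        // Ids of all w:commentReference elements inside this paragraph.
        List<string> ids = paragraph.Descendants(W + "commentReference")
            .Select(r => (string)r.Attribute(W + "id"))
            .ToList();

        return commentsPart.Root.Elements(W + "comment")
            .Where(c => ids.Contains((string)c.Attribute(W + "id")))
            .Select(c => new CommentInfo
            {
                Id = (int)c.Attribute(W + "id"),
                Author = (string)c.Attribute(W + "author"),
                // Assumption: one line of text per paragraph of the comment.
                Text = string.Join("\n", c.Descendants(W + "p")
                    .Select(p => string.Concat(
                        p.Descendants(W + "t").Select(t => (string)t))))
            })
            .ToList();
    }

    public static void Main()
    {
        XDocument comments = XDocument.Parse(
            "<w:comments xmlns:w='" + WNs + "'>" +
            "<w:comment w:id='0' w:author='Eric White'>" +
            "<w:p><w:r><w:t>Hello world</w:t></w:r></w:p>" +
            "<w:p><w:r><w:t>test</w:t></w:r></w:p>" +
            "</w:comment>" +
            "<w:comment w:id='1' w:author='Eric White'>" +
            "<w:p><w:r><w:t>another</w:t></w:r></w:p>" +
            "</w:comment>" +
            "</w:comments>");
        XElement paragraph = XElement.Parse(
            "<w:p xmlns:w='" + WNs + "'>" +
            "<w:r><w:t>This is a test.</w:t></w:r>" +
            "<w:r><w:commentReference w:id='0'/></w:r>" +
            "</w:p>");

        foreach (CommentInfo c in CommentsFor(paragraph, comments))
        {
            Console.WriteLine(" Id: {0}", c.Id);
            Console.WriteLine(" Author: {0}", c.Author);
            Console.WriteLine(" Text: >{0}<", c.Text);
        }
    }
}
```

This version returns an empty list for a paragraph without comments, whereas the example above checks p.Comments for null; either convention works as long as callers know which one applies.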
Comments
Anonymous
December 13, 2007
The last week has seen some interesting discussions and useful how-to posts on Open XML blogs ... Three
Anonymous
October 15, 2008
Interesting post, haven't used it yet but after reading this will give it a try. Thanks.
Anonymous
October 15, 2008
Let me know how it goes. BTW, I've updated my approach - see http://blogs.msdn.com/ericwhite/archive/2008/07/09/open-xml-sdk-and-linq-to-xml.aspx for more info. -Eric