WordprocessingML Document with Styles
More complicated WordprocessingML documents have paragraphs that are formatted with styles.
A few notes about the makeup of WordprocessingML documents are helpful. WordprocessingML documents are stored in packages. A package contains multiple parts; the term has an explicit meaning in this context: parts are essentially files that are zipped together to make up the package. If a document contains paragraphs that are formatted with styles, the package has a document part that contains the paragraphs with styles applied to them, and a style part that contains the styles that the document refers to.
When accessing packages, it is important that you do so through the relationships between parts, rather than using an arbitrary path. This issue is beyond the scope of the Manipulating Content in a WordprocessingML Document tutorial, but the example programs that are included in this tutorial demonstrate the correct approach.
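As a rough illustration of following relationships instead of a fixed path, here is a minimal sketch. It uses Python's standard library rather than the LINQ to XML approach of the tutorial's own example programs. The helper names (find_target, locate_parts, build_sample_package) and the part names in the sample are hypothetical, and the in-memory package is a stripped-down stand-in that omits pieces a real package carries, such as the content types part. External relationships are not handled.

```python
import io
import posixpath
import zipfile
import xml.etree.ElementTree as ET

# Namespace of relationship parts, and the two relationship types followed below.
PKG_REL_NS = "http://schemas.openxmlformats.org/package/2006/relationships"
OFFICE_DOC_REL = "http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument"
STYLES_REL = "http://schemas.openxmlformats.org/officeDocument/2006/relationships/styles"


def find_target(package, rels_name, rel_type):
    """Return the Target of the first relationship of rel_type in a .rels part."""
    root = ET.fromstring(package.read(rels_name))
    for rel in root.findall(f"{{{PKG_REL_NS}}}Relationship"):
        if rel.get("Type") == rel_type:
            return rel.get("Target")
    raise KeyError(f"no relationship of type {rel_type} in {rels_name}")


def locate_parts(package):
    """Follow relationships to the main document part and its style part."""
    # The package-level relationships point at the main document part.
    doc_name = find_target(package, "_rels/.rels", OFFICE_DOC_REL).lstrip("/")
    folder, file_name = posixpath.split(doc_name)
    # A part's own relationships live in a _rels folder next to it.
    rels_name = posixpath.join(folder, "_rels", file_name + ".rels")
    style_target = find_target(package, rels_name, STYLES_REL)
    # Relative targets are resolved against the folder of the source part.
    style_name = posixpath.normpath(posixpath.join(folder, style_target)).lstrip("/")
    return doc_name, style_name


def build_sample_package():
    """Build a stripped-down in-memory package (hypothetical part names)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as z:
        z.writestr(
            "_rels/.rels",
            f'<Relationships xmlns="{PKG_REL_NS}">'
            f'<Relationship Id="rId1" Type="{OFFICE_DOC_REL}" Target="content/main.xml"/>'
            "</Relationships>",
        )
        z.writestr(
            "content/_rels/main.xml.rels",
            f'<Relationships xmlns="{PKG_REL_NS}">'
            f'<Relationship Id="rId1" Type="{STYLES_REL}" Target="look.xml"/>'
            "</Relationships>",
        )
        z.writestr("content/main.xml", "<document/>")
        z.writestr("content/look.xml", "<styles/>")
    buffer.seek(0)
    return zipfile.ZipFile(buffer)


print(locate_parts(build_sample_package()))
```

The sample deliberately avoids the part names that Word typically writes, so code that assumed a fixed path would miss both parts, while the relationship lookup still finds them.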
A Document that Uses Styles
The WordprocessingML example presented in the Shape of WordprocessingML Documents topic is a very simple one. The following document is more complicated: it has paragraphs that are formatted with styles. The easiest way to see the XML that makes up an Office Open XML document is to run the Example that Outputs Office Open XML Document Parts.
In the following document, the first paragraph has the style Heading1. There are a number of paragraphs that have the default style. There are also a number of paragraphs that have the style Code. Because of this relative complexity, this is a more interesting document to parse with LINQ to XML.
In paragraphs with non-default styles, the paragraph element has a child element named w:pPr, which in turn has a child element named w:pStyle. That element has an attribute, w:val, which contains the style name. If the paragraph has the default style, the paragraph element does not have a w:pPr child element.
<?xml version="1.0" encoding="utf-8"?>
<w:document
    xmlns:ve="http://schemas.openxmlformats.org/markup-compatibility/2006"
    xmlns:o="urn:schemas-microsoft-com:office:office"
    xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships"
    xmlns:m="http://schemas.openxmlformats.org/officeDocument/2006/math"
    xmlns:v="urn:schemas-microsoft-com:vml"
    xmlns:wp="http://schemas.openxmlformats.org/drawingml/2006/wordprocessingDrawing"
    xmlns:w10="urn:schemas-microsoft-com:office:word"
    xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main"
    xmlns:wne="http://schemas.microsoft.com/office/word/2006/wordml">
  <w:body>
    <w:p w:rsidR="00A75AE0" w:rsidRDefault="00A75AE0" w:rsidP="006027C7">
      <w:pPr>
        <w:pStyle w:val="Heading1" />
      </w:pPr>
      <w:r>
        <w:t>Parsing WordprocessingML with LINQ to XML</w:t>
      </w:r>
    </w:p>
    <w:p w:rsidR="00A75AE0" w:rsidRDefault="00A75AE0" />
    <w:p w:rsidR="00A75AE0" w:rsidRDefault="00A75AE0">
      <w:r>
        <w:t>The following example prints to the console.</w:t>
      </w:r>
    </w:p>
    <w:p w:rsidR="00A75AE0" w:rsidRDefault="00A75AE0" />
    <w:p w:rsidR="00A75AE0" w:rsidRDefault="00A75AE0" w:rsidP="006027C7">
      <w:pPr>
        <w:pStyle w:val="Code" />
      </w:pPr>
      <w:r>
        <w:t>using System;</w:t>
      </w:r>
    </w:p>
    <w:p w:rsidR="00A75AE0" w:rsidRDefault="00A75AE0" w:rsidP="006027C7">
      <w:pPr>
        <w:pStyle w:val="Code" />
      </w:pPr>
    </w:p>
    <w:p w:rsidR="00A75AE0" w:rsidRPr="00876F34" w:rsidRDefault="00A75AE0" w:rsidP="006027C7">
      <w:pPr>
        <w:pStyle w:val="Code" />
      </w:pPr>
      <w:r w:rsidRPr="00876F34">
        <w:t>class Program {</w:t>
      </w:r>
    </w:p>
    <w:p w:rsidR="00A75AE0" w:rsidRPr="00876F34" w:rsidRDefault="00A75AE0" w:rsidP="006027C7">
      <w:pPr>
        <w:pStyle w:val="Code" />
      </w:pPr>
      <w:r w:rsidRPr="00876F34">
        <w:t xml:space="preserve"> public static void </w:t>
      </w:r>
      <w:smartTag w:uri="urn:schemas-microsoft-com:office:smarttags" w:element="place">
        <w:r w:rsidRPr="00876F34">
          <w:t>Main</w:t>
        </w:r>
      </w:smartTag>
      <w:r w:rsidRPr="00876F34">
        <w:t>(string[] args) {</w:t>
      </w:r>
    </w:p>
    <w:p w:rsidR="00A75AE0" w:rsidRPr="00876F34" w:rsidRDefault="00A75AE0" w:rsidP="006027C7">
      <w:pPr>
        <w:pStyle w:val="Code" />
      </w:pPr>
      <w:r w:rsidRPr="00876F34">
        <w:t xml:space="preserve"> Console.WriteLine("Hello World");</w:t>
      </w:r>
    </w:p>
    <w:p w:rsidR="00A75AE0" w:rsidRPr="00876F34" w:rsidRDefault="00A75AE0" w:rsidP="006027C7">
      <w:pPr>
        <w:pStyle w:val="Code" />
      </w:pPr>
      <w:r w:rsidRPr="00876F34">
        <w:t xml:space="preserve"> }</w:t>
      </w:r>
    </w:p>
    <w:p w:rsidR="00A75AE0" w:rsidRPr="00876F34" w:rsidRDefault="00A75AE0" w:rsidP="006027C7">
      <w:pPr>
        <w:pStyle w:val="Code" />
      </w:pPr>
      <w:r w:rsidRPr="00876F34">
        <w:t>}</w:t>
      </w:r>
    </w:p>
    <w:p w:rsidR="00A75AE0" w:rsidRDefault="00A75AE0" />
    <w:p w:rsidR="00A75AE0" w:rsidRDefault="00A75AE0">
      <w:r>
        <w:t>This example produces the following output:</w:t>
      </w:r>
    </w:p>
    <w:p w:rsidR="00A75AE0" w:rsidRDefault="00A75AE0" />
    <w:p w:rsidR="00A75AE0" w:rsidRDefault="00A75AE0" w:rsidP="006027C7">
      <w:pPr>
        <w:pStyle w:val="Code" />
      </w:pPr>
      <w:r>
        <w:t>Hello World</w:t>
      </w:r>
    </w:p>
    <w:sectPr w:rsidR="00A75AE0" w:rsidSect="00A75AE0">
      <w:pgSz w:w="12240" w:h="15840" />
      <w:pgMar w:top="1440" w:right="1800" w:bottom="1440" w:left="1800" w:header="720" w:footer="720" w:gutter="0" />
      <w:cols w:space="720" />
      <w:docGrid w:linePitch="360" />
    </w:sectPr>
  </w:body>
</sectPr_placeholder_removed>
</w:document>
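To make the w:pPr, w:pStyle, and w:val structure described above concrete, the following sketch lists the style and text of each paragraph in a shortened excerpt of the document. It is written with Python's standard library, not with LINQ to XML, so it illustrates the shape of the query rather than the tutorial's own code. The function name paragraph_styles and the "(default)" label are hypothetical, and the sketch assumes that the paragraphs of interest are direct children of w:body.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}

# Shortened excerpt of the document above (revision attributes removed).
SAMPLE = """<w:document xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
  <w:body>
    <w:p>
      <w:pPr><w:pStyle w:val="Heading1" /></w:pPr>
      <w:r><w:t>Parsing WordprocessingML with LINQ to XML</w:t></w:r>
    </w:p>
    <w:p>
      <w:r><w:t>The following example prints to the console.</w:t></w:r>
    </w:p>
    <w:p>
      <w:pPr><w:pStyle w:val="Code" /></w:pPr>
      <w:r><w:t xml:space="preserve"> public static void </w:t></w:r>
      <w:smartTag w:uri="urn:schemas-microsoft-com:office:smarttags" w:element="place">
        <w:r><w:t>Main</w:t></w:r>
      </w:smartTag>
      <w:r><w:t>(string[] args) {</w:t></w:r>
    </w:p>
  </w:body>
</w:document>"""


def paragraph_styles(xml_text, default_name="(default)"):
    """Return (style, text) pairs for the paragraphs directly under w:body."""
    body = ET.fromstring(xml_text).find("w:body", NS)
    result = []
    for para in body.findall("w:p", NS):
        # A paragraph with the default style has no w:pPr/w:pStyle element.
        style = para.find("w:pPr/w:pStyle", NS)
        name = style.get(f"{{{W}}}val") if style is not None else default_name
        # Collect every w:t descendant so runs nested in w:smartTag are included.
        text = "".join(t.text or "" for t in para.iter(f"{{{W}}}t"))
        result.append((name, text))
    return result


for name, text in paragraph_styles(SAMPLE):
    print(f"{name}: {text}")
```

Gathering all w:t descendants, rather than only the runs that are direct children of the paragraph, matters for the paragraph that contains w:smartTag: otherwise the word Main would be dropped from its text.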