Creating Data-Bound Content Controls using the Open XML SDK and LINQ to XML

Data-bound content controls are a powerful and convenient way to separate the semantic business data from the markup of an Open XML document.  After binding content controls to custom XML, you can query the document for the business data by looking in the custom XML part rather than examining the markup, which is much simpler than querying the document body.  Creating data-bound content controls is a little involved (but only a little).  There is a trick, though: take a document that has unbound content controls, generate a custom XML part automatically (inferring the elements of the custom XML from the content controls), and then bind the content controls to the custom XML part.

(Update March 10, 2009: modified code to work with the latest Open XML SDK.)

This approach has two benefits.  First, it is a convenient way to create a document with data-bound content controls.  Second, it demonstrates exactly what you must do to create data-bound content controls.

This example uses the following approach:

  • Using Word 2007, you create a document with any number of content controls in it.
  • When creating each content control, you set the Tag of the content control to the desired XML element name in the custom XML.
  • You then run this example code on the document, which creates the custom XML part, creates the custom XML properties part, and then adds the markup to the body of the document that binds each content control to the custom XML.

This example uses the Open XML SDK V1 and LINQ to XML.

Data-Bound Content Controls

A document that contains properly set-up data-bound content controls has the following characteristics:

  • The main document part has a relation to the custom XML part.
  • The custom XML part has a relation to a custom XML properties part.
  • The custom XML properties part contains a GUID in an attribute (ds:itemID).  This GUID is used to associate the data binding elements in the main document part to the relevant custom XML part.
  • Within the content control markup in the main document part, the data binding element (w:dataBinding) defines the data binding.  This element has an attribute (w:storeItemID) that contains the same GUID as in the custom XML properties part.  In addition, the element has an attribute (w:xpath) that contains the XPath expression to the relevant node in the custom XML.
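
The GUID match described above can be checked mechanically.  The following is a minimal sketch, not part of the original example: the class and method names (BindingCheck, FindUnmatchedBindings) are hypothetical, and it assumes the main document part and the custom XML properties part have already been loaded into XElement objects.  It also assumes that GUIDs may be compared without regard to case.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper, not part of the original example.
public static class BindingCheck
{
    static readonly XNamespace w =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";
    static readonly XNamespace ds =
        "http://schemas.openxmlformats.org/officeDocument/2006/customXml";

    // Returns the w:xpath of every w:dataBinding whose w:storeItemID does
    // not match the ds:itemID on the given ds:datastoreItem element.
    // Assumption: GUID strings are compared case-insensitively.
    public static List<string> FindUnmatchedBindings(
        XElement documentRoot, XElement datastoreItem)
    {
        string itemId = (string)datastoreItem.Attribute(ds + "itemID");
        return documentRoot
            .Descendants(w + "dataBinding")
            .Where(b => !string.Equals(
                (string)b.Attribute(w + "storeItemID"), itemId,
                StringComparison.OrdinalIgnoreCase))
            .Select(b => (string)b.Attribute(w + "xpath"))
            .ToList();
    }
}
```

An empty result means every binding in the document points at the given custom XML part.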

The following screen clipping shows the Word document with content controls in the cells of a table:

To set the properties of the content control, click on the Content Controls Properties button (on the Developer tab of the ribbon):

In this example, the element name in the custom XML part comes from the Tag field in the content control properties window:

The following screen clipping (using the Open XML Package Editor, which comes with Visual Studio Power Tools) shows that there is a relation from the main document part (document.xml) to the custom XML part (../customXml/item1.xml):

The following shows the relation from the custom XML part to the custom XML properties part (itemProps1.xml):

The custom XML for the example included with this post looks like this:

<?xml version="1.0" encoding="utf-8"?>
<Root>
  <Name>Eric White</Name>
  <Company>Microsoft Corporation</Company>
  <Address>One Microsoft Way</Address>
  <City>Redmond</City>
  <State>WA</State>
  <Country>USA</Country>
  <PostalCode>98052</PostalCode>
</Root>

This custom XML is automatically generated by this example.

The custom XML properties part looks like this:

<?xml version="1.0" encoding="utf-8" standalone="no"?>
<ds:datastoreItem
    ds:itemID="{F351E99C-3283-4B75-927A-A56C9FD3BFFC}"
    xmlns:ds="http://schemas.openxmlformats.org/officeDocument/2006/customXml">
  <ds:schemaRefs/>
</ds:datastoreItem>

The GUID in the ds:itemID attribute is generated when the example is run.

The content control with properly set-up data binding looks like this:

<w:sdt>
  <w:sdtPr>
    <w:alias w:val="Name"/>
    <w:tag w:val="Name"/>
    <w:id w:val="13264407"/>
    <w:placeholder>
      <w:docPart w:val="DefaultPlaceholder_22675703"/>
    </w:placeholder>
    <w:dataBinding
        w:xpath="/Root/Name"
        w:storeItemID="{F351E99C-3283-4B75-927A-A56C9FD3BFFC}"/>
    <w:text/>
  </w:sdtPr>
  <w:sdtContent>
    <w:tc>
      <w:tcPr>
        <w:tcW w:w="4410"
               w:type="dxa"/>
      </w:tcPr>
      <w:p w:rsidR="00E850CC"
           w:rsidRDefault="00FF4549"
           w:rsidP="00FF4549">
        <w:r>
          <w:t>Eric White</w:t>
        </w:r>
      </w:p>
    </w:tc>
  </w:sdtContent>
</w:sdt>

The GUID in the w:storeItemID attribute is the same as in the custom XML properties part.  This creates the association between the data-bound content control and its custom XML part.

If you edit the document that has bound content controls, and change the contents in one of them, the custom XML is modified to reflect the changed content.  For instance, if you edit the document and change the name to Tai Yee, then the custom XML will be:

<?xml version="1.0" encoding="utf-8"?>
<Root>
  <Name>Tai Yee</Name>
  <Company>Microsoft Corporation</Company>
  <Address>One Microsoft Way</Address>
  <City>Redmond</City>
  <State>WA</State>
  <Country>USA</Country>
  <PostalCode>98052</PostalCode>
</Root>
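
Reading the business data back out of custom XML like this is a short LINQ to XML query.  The following is a minimal sketch under stated assumptions: the helper names (CustomXmlQuery, GetValue, GetAllValues) are hypothetical, and the custom XML is assumed to have been loaded into an XDocument already, for example from the stream of the custom XML part.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper, not part of the original example.
public static class CustomXmlQuery
{
    // Returns the text of the named child of the root element,
    // or null if there is no such child.
    public static string GetValue(XDocument customXml, string elementName)
    {
        return (string)customXml.Root.Element(elementName);
    }

    // Returns every child of the root as a name/value pair.
    // Assumption: child element names are unique, as in the XML shown
    // above; ToDictionary throws if a name is repeated.
    public static Dictionary<string, string> GetAllValues(XDocument customXml)
    {
        return customXml.Root.Elements()
            .ToDictionary(e => e.Name.LocalName, e => e.Value);
    }
}
```

For the XML above, GetValue(customXml, "Name") would return the edited name without touching the document body markup at all.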

Because the GUID that creates the association is in the custom XML properties part and not in the custom XML itself, the custom XML can have any schema you desire.  You can take XML from any source, with any schema, and place it, unmodified, in a custom XML part, and create the appropriate data-binding to content controls.
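
When the custom XML comes from another source and is nested more deeply than /Root/Name, each binding needs an XPath expression that identifies one specific node.  The following is a minimal sketch of one way to compute such an expression; the helper name (XPathBuilder.GetXPath) is hypothetical, and the sketch assumes the custom XML uses no namespaces.  Namespaced XML would also need prefix mappings on the binding (the w:prefixMappings attribute), which this sketch does not handle.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper, not part of the original example.
public static class XPathBuilder
{
    // Builds an absolute, positional XPath for an element, such as
    // /Order[1]/Item[2]/Name[1].
    // Assumption: no element in the path is in an XML namespace.
    public static string GetXPath(XElement element)
    {
        var steps = element.AncestorsAndSelf()
            .Reverse()   // root first, target element last
            .Select(e =>
            {
                // XPath positions are 1-based and count only siblings
                // that have the same name.
                int index = e.ElementsBeforeSelf(e.Name).Count() + 1;
                return "/" + e.Name.LocalName + "[" + index + "]";
            });
        return string.Concat(steps);
    }
}
```

The positional predicates keep the expression unambiguous when an element name repeats, for example several Item elements under one Order.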

Example using the Open XML SDK V1 and LINQ to XML

The example first copies Template.docx to Test.docx.  It opens Test.docx using the Open XML SDK, creates the custom XML part, creates the custom XML properties part, and then adds the data binding elements to the content controls in the main document part.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;
using System.Xml;
using System.Xml.Linq;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Packaging;

public static class LocalExtensions
{
    // Projects each item to a string and concatenates the results.
    public static string StringConcatenate<T>(this IEnumerable<T> source,
        Func<T, string> func)
    {
        StringBuilder sb = new StringBuilder();
        foreach (T item in source)
            sb.Append(func(item));
        return sb.ToString();
    }

    // Concatenates a sequence of strings.
    public static string StringConcatenate(this IEnumerable<string> source)
    {
        StringBuilder sb = new StringBuilder();
        foreach (string item in source)
            sb.Append(item);
        return sb.ToString();
    }

    // Loads a part into an XDocument, caching it as an annotation on the
    // part so that later calls return the same instance.
    public static XDocument GetXDocument(this OpenXmlPart part)
    {
        XDocument xdoc = part.Annotation<XDocument>();
        if (xdoc != null)
            return xdoc;
        using (Stream str = part.GetStream())
        using (StreamReader streamReader = new StreamReader(str))
        using (XmlReader xr = XmlReader.Create(streamReader))
            xdoc = XDocument.Load(xr);
        part.AddAnnotation(xdoc);
        return xdoc;
    }
}

class Program
{
    // Namespace URIs use the http scheme; with any other URI the queries
    // below would match no elements.
    private static XNamespace w =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";
    private static XName r = w + "r";
    private static XName ins = w + "ins";
    private static XNamespace ds =
        "http://schemas.openxmlformats.org/officeDocument/2006/customXml";

    // Returns the text of a content control, one line per paragraph.
    static string GetTextFromContentControl(XElement contentControlNode)
    {
        return contentControlNode.Descendants(w + "p")
            .Select(
                p => p.Elements()
                    .Where(z => z.Name == r || z.Name == ins)
                    .Descendants(w + "t")
                    .StringConcatenate(element =>
                        (string)element) + Environment.NewLine
            ).StringConcatenate();
    }

    static void Main(string[] args)
    {
        File.Delete("Test.docx");
        File.Copy("Template.docx", "Test.docx");

        // Open the Open XML doc as a word processing doc
        using (WordprocessingDocument document =
            WordprocessingDocument.Open("Test.docx", true))
        {
            // Create the contents of the custom XML part
            XElement customXml = new XElement("Root",
                document
                    .MainDocumentPart
                    .GetXDocument()
                    .Descendants(w + "sdt")
                    .Select(sdt =>
                        new XElement(
                            sdt.Element(w + "sdtPr")
                                .Element(w + "tag")
                                .Attribute(w + "val").Value,
                            GetTextFromContentControl(sdt).Trim())
                    )
            );

            // Create a new custom XML part
            CustomXmlPart customXmlPart =
                document.MainDocumentPart.AddCustomXmlPart(CustomXmlPartType.CustomXml);
            using (Stream str = customXmlPart.GetStream(
                FileMode.Create, FileAccess.ReadWrite))
            using (XmlWriter xw = XmlWriter.Create(str))
                customXml.Save(xw);

            Guid idGuid = Guid.NewGuid();

            // Create the contents of the properties part
            XDocument propertyPartXDoc = new XDocument(
                new XElement(ds + "datastoreItem",
                    new XAttribute(ds + "itemID",
                        "{" + idGuid.ToString().ToUpper() + "}"),
                    new XAttribute(XNamespace.Xmlns + "ds",
                        ds.NamespaceName),
                    new XElement(ds + "schemaRefs")
                )
            );

            // Add the custom XML properties part
            CustomXmlPropertiesPart customXmlPropertyPart =
                customXmlPart.AddNewPart<CustomXmlPropertiesPart>();
            using (Stream str = customXmlPropertyPart.GetStream(
                FileMode.Create, FileAccess.ReadWrite))
            using (XmlWriter xw = XmlWriter.Create(str))
                propertyPartXDoc.Save(xw);

            // Load the main document part into an XDocument
            XDocument mainDocumentXDoc;
            using (Stream str = document.MainDocumentPart.GetStream())
            using (XmlReader xr = XmlReader.Create(str))
                mainDocumentXDoc = XDocument.Load(xr);

            // Add the data binding elements to the main document
            foreach (XElement sdt in mainDocumentXDoc.Descendants(w + "sdt"))
                sdt.Element(w + "sdtPr")
                    .Element(w + "placeholder")
                    .AddAfterSelf(
                        new XElement(w + "dataBinding",
                            new XAttribute(w + "xpath",
                                "/Root/" + sdt.Element(w + "sdtPr")
                                    .Element(w + "tag")
                                    .Attribute(w + "val").Value),
                            new XAttribute(w + "storeItemID",
                                "{" + idGuid.ToString().ToUpper() + "}")
                        )
                    );

            // Serialize the XDocument back into the part
            using (Stream str = document.MainDocumentPart.GetStream(
                FileMode.Create, FileAccess.Write))
            using (XmlWriter xw = XmlWriter.Create(str))
                mainDocumentXDoc.Save(xw);
        }
    }
}
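
The listing dereferences the w:tag element of every content control without a null check, so a content control with no Tag set would cause a NullReferenceException.  The following is a minimal sketch of a guard, not part of the original example: the helper names (TagHelper, GetTag, TaggedControls) are hypothetical, and skipping untagged controls is one possible policy rather than the only reasonable one.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper, not part of the original example.
public static class TagHelper
{
    static readonly XNamespace w =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Returns the w:tag value of a w:sdt element, or null when the
    // content control has no tag or the tag is empty.
    public static string GetTag(XElement sdt)
    {
        string tag = (string)sdt.Elements(w + "sdtPr")
            .Elements(w + "tag")
            .Attributes(w + "val")
            .FirstOrDefault();
        return string.IsNullOrEmpty(tag) ? null : tag;
    }

    // Returns only the content controls that have a usable tag.
    public static IEnumerable<XElement> TaggedControls(XElement root)
    {
        return root.Descendants(w + "sdt")
            .Where(sdt => GetTag(sdt) != null);
    }
}
```

Iterating over TaggedControls(...) instead of Descendants(w + "sdt") in both loops would leave untagged content controls unbound rather than failing.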

Code is attached.

DataBoundContentControls.zip

Comments

  • Anonymous
    December 02, 2008
    Question regarding the GetTextFromContentControl method in your example. It looks for "p" elements, and there are normally (as far as I've seen) no "p" elements within the "sdt" elements that are passed to the method. Looking at some of my own Open XML documents, it looks like the following would be more correct, although it does not support placeholders that allow carriage returns: e.Element(w + "sdtContent").Element(w + "r").Element(w + "t").Value.Trim() Additionally, the code will fail whenever there are placeholders that do not have a tag specified. To avoid this, you can add a check in the foreach loops, something like: if (sdt.Element(w + "sdtPr").Element(w + "tag") != null) Thanks for a great example!

  • Anonymous
    January 19, 2009
    I just read Brian Jones' post "Taking Advantage of Bound Content Controls" (http://blogs.msdn.com/brian_jones/archive/2009/01/05/taking-advantage-of-bound-content-controls.aspx), where he completely swaps out the custom XML part. The code appears much more concise, but does it lack in the area of properly reconstructing the custom XML part properties?

  • Anonymous
    October 28, 2009
    Hi Eric, Can we do the custom binding for content controls that are in header and footer parts?

  • Anonymous
    March 01, 2010
    Is there any way I can toggle the content control bordering and highlighting? I have some content controls that are very close together and they exhibit some really strange behavior.

  • Anonymous
    March 10, 2010
    The comment has been removed

  • Anonymous
    March 10, 2010
    @Engr_Muneer, have you taken a look at "design mode" for content controls?  It can really help with how you interact with them.  Take a look at this post: http://blogs.msdn.com/ericwhite/archive/2010/03/02/using-nested-content-controls-for-data-and-content-extraction.aspx @Darin, I tried creating nested content controls using Word 2007, and it worked just fine for me.  I tried on multiple installs.  Can you try on some other Word 2007 installs, see if it works elsewhere? -Eric

  • Anonymous
    March 10, 2010
    @satchi, yes, you can link content controls in headers/footers to custom XML.  The XPath expression refers to elements/attributes in the custom XML part that is related to the main document part. -Eric

  • Anonymous
    March 11, 2010
    Very strange. I'm running Word 12.0.4518.1000 (i.e. the original shipping version, from what I can tell). It definitely says that the doc has been corrupted once it's saved and reloaded. I went to a colleague's desk; he's running 12.0.6500.5000 (it says it's SP2) and his version works completely differently. No matter what we do at his desk, we can't get it to insert a nested content control at all. The ribbon buttons for controls on the Developer ribbon are greyed out when the cursor is in a content control. Strange.

  • Anonymous
    March 11, 2010
    I'm running Win Update on this image now. Just have to see if maybe the Sp has something to do with it.

  • Anonymous
    March 11, 2010
    The comment has been removed

  • Anonymous
    March 11, 2010
    Thank you Darin for figuring this out.  This was one of those assumptions that was so ingrained in my mind that I forgot to mention.  I'm going to update the nested content control blog post to tell this. -Eric

  • Anonymous
    April 12, 2010
    Since the custom XML part is removed from Word from January 10.... Does anyone know how to achieve content control/custom XML mapping in Word 2010? In other words, how will it be done in Word 2010? We are using the method specified in this article for filling content controls from custom XML (Word 2007, before January 10), but how will we achieve that in Word 2010? I'm thinking about the future...

  • Anonymous
    April 12, 2010
    Hi George, Custom XML parts are not removed from Office 2007 or Office 2010.  Content controls are also not removed.  Binding content controls to custom XML in a custom XML part also is still supported.  The affected feature is 'custom XML markup', also known as 'pink tags'.  See the following blog post for more info. http://blogs.technet.com/gray_knowlton/archive/2009/12/23/what-is-custom-xml-and-the-impact-of-the-i4i-judgment-on-word.aspx -Eric

  • Anonymous
    June 08, 2011
    We had a document generation function based on the custom XML markup in word 2003 - the "pink tags" which I'm trying to work out how to convert so that the template documents can be maintained in word 2010. The big stumbling block is how to deal with nested data involving multiple rows - e.g. a document containing a customer with one or more orders each of which has one or more order items.  I can set up the nested custom controls but seem to be stuck with one order and one order item and I can't find any examples online dealing with this type of data. Can you give any guidance on this?

  • Anonymous
    July 03, 2011
    Hi Eric, I put the document generated from this example through the "Iterating through all Content Controls in an Open XML WordprocessingML Document" you wrote earlier and the contentcontrol tagged with "Name" is not detected. However if I move the "Name" contentcontrol to some other location it gets detected again. Can someone confirm this? Much appreciated.

  • Anonymous
    November 15, 2011
    How would you remove default text "Click here to enter text." for the blank content controls?

  • Anonymous
    August 07, 2012
    Thanks very much for this article. I appreciate it's old now, but I wonder if anyone can help me. It works fine for very simple documents, but once I put in some formatting, the document adds a customXML part with item1.xml and item1Props.xml. When I use this code, I find that it generates the item.xml with the data but fails to create the itemProps.xml, so it doesn't get linked in. Any thoughts, anyone?