Writing Entity References using LINQ to XML

This is one in a series of posts on transforming Open XML WordprocessingML to XHtml.

I need to write out some XHtml, and in several places I want that XHtml to contain entity references.  However, you can't simply write the entity reference in a string like this:

XElement p = new XElement("p", "&nbsp;&nbsp;Hello");
Console.WriteLine(p);

LINQ to XML escapes each ampersand, replacing it with the &amp; entity:

<p>&amp;nbsp;&amp;nbsp;Hello</p>

However, there is a super-easy trick/hack to write out entity references.  You write a new class (named XEntity, of course) that derives from XText and overrides the XText.WriteTo(XmlWriter writer) method so that it calls XmlWriter.WriteEntityRef.  Then, when creating your XML tree, you insert these objects as appropriate:

using System;
using System.Linq;
using System.Xml;
using System.Xml.Linq;

// A text node whose value is written as an entity reference (&value;)
// instead of as escaped text.
class XEntity : XText
{
    public override void WriteTo(XmlWriter writer)
    {
        writer.WriteEntityRef(this.Value);
    }

    public XEntity(string value) : base(value) { }
}

class Program
{
    static void Main(string[] args)
    {
        XElement p = new XElement("Root",
            new XEntity("nbsp"),
            new XEntity("nbsp"),
            new XText("Hello"));
        Console.WriteLine(p);
    }
}

This produces the following output:

<Root>&nbsp;&nbsp;Hello</Root>

There are some caveats about this technique.  LINQ to XML doesn't know about this XEntity class.  It thinks XEntity nodes are just XText nodes.  If you use LINQ to XML to clone an element, then each XEntity node will be a plain XText node in the new tree, and it is serialized as ordinary text:

XElement p = new XElement("Root",
    new XEntity("nbsp"),
    new XEntity("nbsp"),
    new XText("Hello"));
XElement p2 = new XElement(p);
Console.WriteLine(p2);

This produces the following output:

<Root>nbspnbspHello</Root>
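
If you want to see the difference between the two trees directly, here is a minimal sketch that lists the runtime type and the NodeType of each child node.  The NodeInfo class and its Describe method are names I made up for this illustration; they are not part of LINQ to XML.  As far as I can tell, the original tree reports XEntity nodes and the copy reports only XText nodes, while both report a NodeType of Text.

```csharp
using System;
using System.Linq;
using System.Xml;
using System.Xml.Linq;

// Same XEntity class as above, repeated so this listing stands alone.
class XEntity : XText
{
    public XEntity(string value) : base(value) { }

    public override void WriteTo(XmlWriter writer)
    {
        writer.WriteEntityRef(this.Value);
    }
}

// Hypothetical helper for this illustration only.
static class NodeInfo
{
    // Returns "RuntimeType/NodeType" for each child node of the element.
    public static string Describe(XElement element)
    {
        return string.Join(", ",
            element.Nodes().Select(n => n.GetType().Name + "/" + n.NodeType));
    }
}

class Program
{
    static void Main(string[] args)
    {
        XElement p = new XElement("Root",
            new XEntity("nbsp"),
            new XEntity("nbsp"),
            new XText("Hello"));
        XElement p2 = new XElement(p);

        Console.WriteLine(NodeInfo.Describe(p));   // the tree built by hand
        Console.WriteLine(NodeInfo.Describe(p2));  // the copy made by the XElement constructor
    }
}
```

Because both kinds of node report the same NodeType, code that needs to tell them apart has to test the runtime type (node is XEntity), which is what a custom clone method has to do.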

However, you can write your own clone method:

// Recursively copies a node, re-creating XEntity nodes as XEntity
// instead of letting LINQ to XML clone them as plain XText.
static object Clone(XNode node)
{
    XElement element = node as XElement;
    if (element != null)
    {
        return new XElement(element.Name,
            element.Attributes(),
            element.Nodes().Select(n => Clone(n)));
    }
    if (node is XEntity)
        return new XEntity(((XText)node).Value);
    // Any other node is returned as-is; LINQ to XML clones a node that
    // already has a parent when it is added to the new tree.
    return node;
}

static XElement Clone(XElement element)
{
    return (XElement)Clone((XNode)element);
}

static void Main(string[] args)
{
    XElement p = new XElement("Root",
        new XEntity("nbsp"),
        new XEntity("nbsp"),
        new XText("Hello"));
    XElement p2 = Clone(p);
    Console.WriteLine(p2);
}

In my case, I create the XML tree immediately before serializing, so this isn't a problem.

Another gotcha is that LINQ to XML will sometimes merge two XText nodes into one.  If you pass a string instead of explicitly newing up an XText object, then LINQ to XML will append the text of the string to the value of the preceding XEntity object, so the string becomes part of the entity name:

XElement p = new XElement("Root",
    new XEntity("nbsp"),
    new XEntity("nbsp"),
    "Hello");
Console.WriteLine(p);

This will result in output of:

<Root>&nbsp;&nbspHello;</Root>

So when creating XEntity objects, you have to be careful that you explicitly create adjacent XText objects.
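
One way to make that harder to get wrong is to run the content through a small helper that wraps every string in its own XText node before it reaches the XElement constructor.  This is only a sketch: EntityContent and Wrap are hypothetical names I chose for this example, not part of LINQ to XML, and it assumes the same XEntity class shown earlier.

```csharp
using System;
using System.Linq;
using System.Xml;
using System.Xml.Linq;

// Same XEntity class as above, repeated so this listing stands alone.
class XEntity : XText
{
    public XEntity(string value) : base(value) { }

    public override void WriteTo(XmlWriter writer)
    {
        writer.WriteEntityRef(this.Value);
    }
}

// Hypothetical helper for this illustration only.
static class EntityContent
{
    // Wraps each string in its own XText node so that it is added as a
    // separate node instead of being appended to a preceding XEntity.
    // Everything else is passed through unchanged.
    public static object[] Wrap(params object[] content)
    {
        return content
            .Select(c => c is string ? new XText((string)c) : c)
            .ToArray();
    }
}

class Program
{
    static void Main(string[] args)
    {
        XElement p = new XElement("Root",
            EntityContent.Wrap(
                new XEntity("nbsp"),
                new XEntity("nbsp"),
                "Hello"));
        Console.WriteLine(p);
    }
}
```

This builds the same three nodes as the first example (two XEntity nodes followed by an XText node), so it should serialize the same way.  The helper only protects content that passes through it; a string added later with Add can still be appended to a trailing XEntity.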

With those caveats in place, if you simply need to serialize some XML that contains entities, this will do the trick.

Comments

  • Anonymous
    August 02, 2010
    This doesn't work if the entity is in an attribute value, such as <Root space=" ">Hello</Root>.  At least, I haven't been able to figure out how to make it work.  Any ideas?

  • Anonymous
    August 02, 2010
    The value of the space attribute in my previous post is supposed to be the nbsp entity.