System.Xml.XmlReader class
This article provides supplementary remarks to the reference documentation for this API.
XmlReader provides forward-only, read-only access to XML data in a document or stream. This class conforms to the W3C Extensible Markup Language (XML) 1.0 (fourth edition) and the Namespaces in XML 1.0 (third edition) recommendations.
XmlReader methods let you move through XML data and read the contents of a node. The properties of the class reflect the value of the current node, which is where the reader is positioned. The ReadState property value indicates the current state of the XML reader. For example, the property is ReadState.Initial before the first call to the XmlReader.Read method, changes to ReadState.Interactive after a successful Read call, and is set to ReadState.Closed by the XmlReader.Close method. XmlReader also provides data conformance checks and validation against a DTD or schema.
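The following is a minimal sketch of how the ReadState property changes over the lifetime of a reader. The class name ReadStateSketch and the method name Observe are hypothetical and exist only for this illustration.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class ReadStateSketch
{
    // Records ReadState at four points: before reading, while positioned
    // on a node, after the last node, and after the reader is closed.
    public static List<ReadState> Observe(string xml)
    {
        var states = new List<ReadState>();
        XmlReader reader = XmlReader.Create(new StringReader(xml));

        states.Add(reader.ReadState);   // nothing has been read yet
        reader.Read();
        states.Add(reader.ReadState);   // positioned on the first node
        while (reader.Read()) { }       // consume the remaining nodes
        states.Add(reader.ReadState);   // no more nodes to read
        reader.Close();
        states.Add(reader.ReadState);   // reader has been closed

        return states;
    }
}
```

For a well-formed input such as `<root/>`, the recorded states are Initial, Interactive, EndOfFile, and Closed, in that order.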
XmlReader uses a pull model to retrieve data. This model:
- Simplifies state management by a natural, top-down procedural refinement.
- Supports multiple input streams and layering.
- Enables the client to give the parser a buffer into which the string is directly written, and thus avoids the necessity of an extra string copy.
- Supports selective processing. The client can skip items and process those that are of interest to the application. You can also set properties in advance to manage how the XML stream is processed (for example, normalization).
Create an XML reader
Use the Create method to create an XmlReader instance.
Although .NET provides concrete implementations of the XmlReader class, such as the XmlTextReader, XmlNodeReader, and the XmlValidatingReader classes, we recommend that you use the specialized classes only in these scenarios:
- When you want to read an XML DOM subtree from an XmlNode object, use the XmlNodeReader class. (However, this class doesn't support DTD or schema validation.)
- If you must expand entities on request, you don't want your text content normalized, or you don't want default attributes returned, use the XmlTextReader class.
To specify the set of features you want to enable on the XML reader, pass a System.Xml.XmlReaderSettings object to the Create method. You can use a single System.Xml.XmlReaderSettings object to create multiple readers with the same functionality, or modify the System.Xml.XmlReaderSettings object to create a new reader with a different set of features. You can also easily add features to an existing reader.
If you don't use a System.Xml.XmlReaderSettings object, default settings are used. See the Create reference page for details.
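As an illustration of passing settings to Create, the sketch below builds one XmlReaderSettings object that can be shared by several readers. The names SettingsSketch, CreateSettings, and CountNodes are hypothetical, and the choice of IgnoreComments and IgnoreWhitespace is only an example of features you might enable.

```csharp
using System.IO;
using System.Xml;

public static class SettingsSketch
{
    // One settings object can be passed to any number of Create calls.
    public static XmlReaderSettings CreateSettings()
    {
        return new XmlReaderSettings
        {
            IgnoreComments = true,    // do not report comment nodes
            IgnoreWhitespace = true   // do not report insignificant white space
        };
    }

    // Counts every node the reader reports under the given settings.
    public static int CountNodes(string xml, XmlReaderSettings settings)
    {
        int count = 0;
        using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings))
        {
            while (reader.Read())
            {
                count++;
            }
        }
        return count;
    }
}
```

Calling CountNodes with the shared settings reports fewer nodes than calling it with a default `new XmlReaderSettings()` whenever the input contains comments or white space between elements.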
XmlReader throws an XmlException on XML parse errors. After an exception is thrown, the state of the reader is not predictable. For example, the reported node type may be different from the actual node type of the current node. Use the ReadState property to check whether the reader is in an error state.
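One way to handle parse errors is sketched below. The class and method names (ParseErrorSketch, TryReadAll) are hypothetical; the sketch only assumes that XmlException exposes LineNumber and LinePosition, and it stops using the reader once an exception has been thrown.

```csharp
using System.IO;
using System.Xml;

public static class ParseErrorSketch
{
    // Reads the whole input. Returns false and a description of the
    // problem if the XML is not well formed.
    public static bool TryReadAll(string xml, out string error)
    {
        error = string.Empty;
        XmlReader reader = XmlReader.Create(new StringReader(xml));
        try
        {
            while (reader.Read()) { }
            return true;
        }
        catch (XmlException ex)
        {
            // Report where parsing failed and what state the reader is in.
            // The reader should not be used for further reads after this.
            error = $"Line {ex.LineNumber}, position {ex.LinePosition}: " +
                    $"{ex.Message} (ReadState: {reader.ReadState})";
            return false;
        }
        finally
        {
            reader.Dispose();
        }
    }
}
```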
Validate XML data
To define the structure of an XML document and its element relationships, data types, and content constraints, you use a document type definition (DTD) or XML Schema definition language (XSD) schema. An XML document is considered to be well formed if it meets all the syntactical requirements defined by the W3C XML 1.0 Recommendation. It's considered valid if it's well formed and also conforms to the constraints defined by its DTD or schema. (See the W3C XML Schema Part 1: Structures and the W3C XML Schema Part 2: Datatypes recommendations.) Therefore, although all valid XML documents are well formed, not all well-formed XML documents are valid.
You can validate the data against a DTD, an inline XSD Schema, or an XSD Schema stored in an XmlSchemaSet object (a cache); these scenarios are described on the Create reference page. XmlReader doesn't support XML-Data Reduced (XDR) schema validation.
You use the following settings on the XmlReaderSettings class to specify what type of validation, if any, the XmlReader instance supports.
Use this XmlReaderSettings member | To specify |
---|---|
DtdProcessing property | Whether to allow DTD processing. The default is to disallow DTD processing. |
ValidationType property | Whether the reader should validate data, and what type of validation to perform (DTD or schema). The default is no data validation. |
ValidationEventHandler event | An event handler for receiving information about validation events. If an event handler is not provided, an XmlException is thrown on the first validation error. |
ValidationFlags property | Additional validation options through the XmlSchemaValidationFlags enumeration members: - AllowXmlAttributes -- Allow XML attributes (xml:*) in instance documents even when they're not defined in the schema. The attributes are validated based on their data type. See the XmlSchemaValidationFlags reference page for the setting to use in specific scenarios. (Enabled by default.) - ProcessIdentityConstraints -- Process identity constraints (xs:ID, xs:IDREF, xs:key, xs:keyref, xs:unique) encountered during validation. (Enabled by default.) - ProcessSchemaLocation -- Process schemas specified by the xsi:schemaLocation or xsi:noNamespaceSchemaLocation attribute. (Disabled by default.) - ProcessInlineSchema -- Process inline XML Schemas during validation. (Disabled by default.) - ReportValidationWarnings -- Report events if a validation warning occurs. A warning is typically issued when there is no DTD or XML Schema to validate a particular element or attribute against. The ValidationEventHandler is used for notification. (Disabled by default.) |
Schemas | The XmlSchemaSet to use for validation. |
XmlResolver property | The XmlResolver for resolving and accessing external resources. This can include external entities such as DTD and schemas, and any xs:include or xs:import elements contained in the XML Schema. If you don't specify an XmlResolver, the XmlReader uses a default XmlUrlResolver with no user credentials. |
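The settings in the table can be combined as in the following sketch, which validates against a schema held in an XmlSchemaSet and collects validation events instead of letting the first error throw. The schema, the class name ValidationSketch, and the method name Validate are assumptions made for this illustration.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Schema;

public static class ValidationSketch
{
    // Hypothetical schema: the document is a single <item> holding an xs:int.
    private const string Xsd =
        @"<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>
            <xs:element name='item' type='xs:int'/>
          </xs:schema>";

    // Returns one message per validation event raised while reading.
    public static List<string> Validate(string xml)
    {
        var problems = new List<string>();

        var schemas = new XmlSchemaSet();
        using (XmlReader schemaReader = XmlReader.Create(new StringReader(Xsd)))
        {
            // A null namespace means "use the schema's own targetNamespace".
            schemas.Add(null, schemaReader);
        }

        var settings = new XmlReaderSettings
        {
            ValidationType = ValidationType.Schema,
            Schemas = schemas
        };
        settings.ValidationFlags |= XmlSchemaValidationFlags.ReportValidationWarnings;
        settings.ValidationEventHandler +=
            (sender, e) => problems.Add($"{e.Severity}: {e.Message}");

        using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings))
        {
            while (reader.Read()) { }   // validation happens as nodes are read
        }
        return problems;
    }
}
```

With this schema, `<item>42</item>` produces no messages, while `<item>abc</item>` produces at least one error message.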
Data conformance
XML readers that are created by the Create method meet the following compliance requirements by default:
- New lines and attribute values are normalized according to the W3C XML 1.0 Recommendation.
- All entities are automatically expanded.
- Default attributes declared in the document type definition are always added even when the reader doesn't validate.
- Declaration of XML prefix mapped to the correct XML namespace URI is allowed.
- The notation names in a single NotationType attribute declaration and NmTokens in a single Enumeration attribute declaration are distinct.
Use these XmlReaderSettings properties to specify the type of conformance checks you want to enable:
Use this XmlReaderSettings property | To | Default |
---|---|---|
CheckCharacters property | Enable or disable checks for the following: - Characters are within the range of legal XML characters, as defined by the 2.2 Characters section of the W3C XML 1.0 Recommendation. - All XML names are valid, as defined by the 2.3 Common Syntactic Constructs section of the W3C XML 1.0 Recommendation. When this property is set to true (default), an XmlException exception is thrown if the XML file contains illegal characters or invalid XML names (for example, an element name starts with a number). | Character and name checking is enabled. Setting CheckCharacters to false turns off character checking for character entity references. If the reader is processing text data, it always checks that XML names are valid, regardless of this setting. Note: The XML 1.0 recommendation requires document-level conformance when a DTD is present. Therefore, if the reader is configured to support ConformanceLevel.Fragment, but the XML data contains a document type definition (DTD), an XmlException is thrown. |
ConformanceLevel property | Choose the level of conformance to enforce: - Document. Conforms to the rules for a well-formed XML 1.0 document. - Fragment. Conforms to the rules for a well-formed document fragment that can be consumed as an external parsed entity. - Auto. Conforms to the level decided by the reader. If the data isn't in conformance, an XmlException exception is thrown. | Document |
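To illustrate the ConformanceLevel setting, the following sketch reads input that has more than one top-level element. The names ConformanceSketch and CountTopLevelElements are hypothetical.

```csharp
using System.IO;
using System.Xml;

public static class ConformanceSketch
{
    // Counts elements at depth 0 under the requested conformance level.
    // Throws XmlException if the input violates that level.
    public static int CountTopLevelElements(string xml, ConformanceLevel level)
    {
        var settings = new XmlReaderSettings { ConformanceLevel = level };
        int count = 0;
        using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings))
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Depth == 0)
                {
                    count++;
                }
            }
        }
        return count;
    }
}
```

With ConformanceLevel.Fragment, the input `<a/><b/>` is accepted and yields two top-level elements. With ConformanceLevel.Document, the same input causes an XmlException because a document can have only one root element.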
Navigate through nodes
The current node is the XML node on which the XML reader is currently positioned. All XmlReader methods perform operations in relation to this node, and all XmlReader properties reflect the value of the current node.
The following methods make it easy to navigate through nodes and parse data.
Use this XmlReader method | To |
---|---|
Read | Read the first node, and advance through the stream one node at a time. Such calls are typically performed inside a while loop. Use the NodeType property to get the type (for example, attribute, comment, element, and so on) of the current node. |
Skip | Skip the children of the current node and move to the next node. |
MoveToContent and MoveToContentAsync | Skip non-content nodes and move to the next content node or to the end of the file. Non-content nodes include ProcessingInstruction, DocumentType, Comment, Whitespace, and SignificantWhitespace. Content nodes include non-white space text, CDATA, EntityReference, and EndEntity. |
ReadSubtree | Read an element and all its children, and return a new XmlReader instance set to ReadState.Initial. This method is useful for creating boundaries around XML elements; for example, if you want to pass data to another component for processing and you want to limit how much of your data the component can access. |
See the XmlReader.Read reference page for an example of navigating through a text stream one node at a time and displaying the type of each node.
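The navigation methods can be combined as in the sketch below, which uses Skip to pass over elements that aren't of interest and ReadSubtree to confine processing to one element. The names NavigationSketch and NamesInside are hypothetical, and the sketch assumes the elements of interest are direct children of the root element.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class NavigationSketch
{
    // Returns the names of all elements inside each child of the root
    // named `wanted` (including that child itself). Other children are
    // skipped without visiting their descendants.
    public static List<string> NamesInside(string xml, string wanted)
    {
        var names = new List<string>();
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            reader.MoveToContent();   // position on the root element
            reader.Read();            // move to the root's first child
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element && reader.LocalName == wanted)
                {
                    // The subtree reader cannot read past this element.
                    using (XmlReader subtree = reader.ReadSubtree())
                    {
                        while (subtree.Read())
                        {
                            if (subtree.NodeType == XmlNodeType.Element)
                            {
                                names.Add(subtree.LocalName);
                            }
                        }
                    }
                    reader.Read();    // move past the element just processed
                }
                else if (reader.NodeType == XmlNodeType.Element)
                {
                    reader.Skip();    // jump to the node after this element
                }
                else
                {
                    reader.Read();
                }
            }
        }
        return names;
    }
}
```

For `<root><skip><x/></skip><keep><y/><z/></keep></root>` and the name keep, the result contains keep, y, and z; the x element is never visited.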
The following sections describe how you can read specific types of data, such as elements, attributes, and typed data.
Read XML elements
The following table lists the methods and properties that the XmlReader class provides for processing elements. After the XmlReader is positioned on an element, the node properties, such as Name, reflect the element values. In addition to the members described below, any of the general methods and properties of the XmlReader class can also be used to process elements. For example, you can use the ReadInnerXml method to read the contents of an element.
Note
See section 3.1 of the W3C XML 1.0 Recommendation for definitions of start tags, end tags, and empty element tags.
Use this XmlReader member | To |
---|---|
IsStartElement method | Check if the current node is a start tag or an empty element tag. |
ReadStartElement method | Check that the current node is an element and advance the reader to the next node (calls IsStartElement followed by Read). |
ReadEndElement method | Check that the current node is an end tag and advance the reader to the next node. |
ReadElementString method | Read a text-only element. |
ReadToDescendant method | Advance the XML reader to the next descendant (child) element that has the specified name. |
ReadToNextSibling method | Advance the XML reader to the next sibling element that has the specified name. |
IsEmptyElement property | Check if the current element has an end element tag. For example: - <item num="123"/> (IsEmptyElement is true.) - <item num="123"> </item> (IsEmptyElement is false, although the element's content is empty.) |
For an example of reading the text content of elements, see the ReadString method. The following example processes elements by using a while loop.
while (reader.Read()) {
    if (reader.IsStartElement()) {
        if (reader.IsEmptyElement)
        {
            Console.WriteLine("<{0}/>", reader.Name);
        }
        else {
            Console.Write("<{0}> ", reader.Name);
            reader.Read(); // Read the start tag.
            if (reader.IsStartElement())  // Handle nested elements.
                Console.Write("\r\n<{0}>", reader.Name);
            Console.WriteLine(reader.ReadString());  // Read the text content of the element.
        }
    }
}
While reader.Read()
    If reader.IsStartElement() Then
        If reader.IsEmptyElement Then
            Console.WriteLine("<{0}/>", reader.Name)
        Else
            Console.Write("<{0}> ", reader.Name)
            reader.Read() ' Read the start tag.
            If reader.IsStartElement() Then ' Handle nested elements.
                Console.Write(vbCr + vbLf + "<{0}>", reader.Name)
            End If
            Console.WriteLine(reader.ReadString()) ' Read the text content of the element.
        End If
    End If
End While
Read XML attributes
XML attributes are most commonly found on elements, but they're also allowed on XML declaration and document type nodes.
When positioned on an element node, the MoveToAttribute method lets you go through the attribute list of the element. Note that after MoveToAttribute has been called, node properties such as Name, NamespaceURI, and Prefix reflect the properties of that attribute, not the properties of the element the attribute belongs to.
The XmlReader class provides these methods and properties to read and process attributes on elements.
Use this XmlReader member | To |
---|---|
HasAttributes property | Check whether the current node has any attributes. |
AttributeCount property | Get the number of attributes on the current element. |
MoveToFirstAttribute method | Move to the first attribute in an element. |
MoveToNextAttribute method | Move to the next attribute in an element. |
MoveToAttribute method | Move to a specified attribute. |
GetAttribute method or Item[] property | Get the value of a specified attribute. |
IsDefault property | Check whether the current node is an attribute that was generated from the default value defined in the DTD or schema. |
MoveToElement method | Move to the element that owns the current attribute. Use this method to return to an element after navigating through its attributes. |
ReadAttributeValue method | Parse the attribute value into one or more Text, EntityReference, or EndEntity nodes. |
Any of the general XmlReader methods and properties can also be used to process attributes. For example, after the XmlReader is positioned on an attribute, the Name and Value properties reflect the values of the attribute. You can also use any of the content Read methods to get the value of the attribute.
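As a small sketch of how the node properties change once the reader moves to an attribute, the following hypothetical helper (AttributeSketch.Describe) looks up one attribute by name and then returns to the owning element.

```csharp
using System.IO;
using System.Xml;

public static class AttributeSketch
{
    // Describes one attribute of the first element in the input,
    // or returns an empty string if the attribute is absent.
    public static string Describe(string xml, string attributeName)
    {
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            reader.MoveToContent();
            string elementName = reader.Name;   // Name is the element's name here

            if (!reader.MoveToAttribute(attributeName))
            {
                return string.Empty;
            }

            // Name and Value now describe the attribute, not the element.
            string description = $"{elementName}/@{reader.Name}={reader.Value}";

            reader.MoveToElement();             // return to the owning element
            return description;
        }
    }
}
```

For the input `<item num="123"/>` and the attribute name num, the helper returns item/@num=123.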
This example uses the AttributeCount property to navigate through all the attributes on an element.
// Display all attributes.
if (reader.HasAttributes) {
    Console.WriteLine("Attributes of <" + reader.Name + ">");
    for (int i = 0; i < reader.AttributeCount; i++) {
        Console.WriteLine("  {0}", reader[i]);
    }
    // Move the reader back to the element node.
    reader.MoveToElement();
}
' Display all attributes.
If reader.HasAttributes Then
    Console.WriteLine("Attributes of <" + reader.Name + ">")
    Dim i As Integer
    For i = 0 To (reader.AttributeCount - 1)
        Console.WriteLine("  {0}", reader(i))
    Next i
    ' Move the reader back to the element node.
    reader.MoveToElement()
End If
This example uses the MoveToNextAttribute method in a while loop to navigate through the attributes.
if (reader.HasAttributes) {
    Console.WriteLine("Attributes of <" + reader.Name + ">");
    while (reader.MoveToNextAttribute()) {
        Console.WriteLine(" {0}={1}", reader.Name, reader.Value);
    }
    // Move the reader back to the element node.
    reader.MoveToElement();
}
If reader.HasAttributes Then
    Console.WriteLine("Attributes of <" + reader.Name + ">")
    While reader.MoveToNextAttribute()
        Console.WriteLine(" {0}={1}", reader.Name, reader.Value)
    End While
    ' Move the reader back to the element node.
    reader.MoveToElement()
End If
Reading attributes on XML declaration nodes
When the XML reader is positioned on an XML declaration node, the Value property returns the version, standalone, and encoding information as a single string. XmlReader objects created by the Create method, the XmlTextReader class, and the XmlValidatingReader class expose the version, standalone, and encoding items as attributes.
Reading attributes on document type nodes
When the XML reader is positioned on a document type node, the GetAttribute method and Item[] property can be used to return the values for the SYSTEM and PUBLIC literals. For example, calling reader.GetAttribute("PUBLIC") returns the PUBLIC value.
Reading attributes on processing instruction nodes
When the XmlReader is positioned on a processing instruction node, the Value property returns the entire text content. Items in the processing instruction node aren't treated as attributes. They can't be read with the GetAttribute or MoveToAttribute method.
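The difference between XML declaration nodes and processing instruction nodes can be sketched as follows. The class and method names are hypothetical; the sketch assumes a reader created by the Create method.

```csharp
using System.IO;
using System.Xml;

public static class DeclarationSketch
{
    // Returns the version item of the XML declaration, read as an attribute,
    // or an empty string if the input has no XML declaration.
    public static string ReadDeclaredVersion(string xml)
    {
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            reader.Read();
            if (reader.NodeType != XmlNodeType.XmlDeclaration)
            {
                return string.Empty;
            }
            return reader.GetAttribute("version") ?? string.Empty;
        }
    }

    // Returns the full text of the first processing instruction.
    // Its items are not attributes, so only Value gives access to them.
    public static string ReadFirstProcessingInstruction(string xml)
    {
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.ProcessingInstruction)
                {
                    return reader.Value;
                }
            }
        }
        return string.Empty;
    }
}
```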
Read XML content
The XmlReader class includes the following members that read content from an XML file and return the content as string values. (To return CLR types, see Convert to CLR types.)
Use this XmlReader member | To |
---|---|
Value property | Get the text content of the current node. The value returned depends on the node type; see the Value reference page for details. |
ReadString method | Get the content of an element or text node as a string. This method stops on processing instructions and comments. For details on how this method handles specific node types, see the ReadString reference page. |
ReadInnerXml and ReadInnerXmlAsync methods | Get all the content of the current node, including the markup, but excluding start and end tags. For example, for:<node>this<child id="123"/></node> ReadInnerXml returns: this<child id="123"/> |
ReadOuterXml and ReadOuterXmlAsync methods | Get all the content of the current node and its children, including markup and start/end tags. For example, for:<node>this<child id="123"/></node> ReadOuterXml returns: <node>this<child id="123"/></node> |
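A brief sketch of the two markup-reading methods follows. The names MarkupSketch, Inner, and Outer are hypothetical. The exact serialization of the returned markup (for example, spacing inside empty element tags) is an assumption you should verify against your reader implementation rather than rely on.

```csharp
using System.IO;
using System.Xml;

public static class MarkupSketch
{
    // Returns the content of the first element without its own tags.
    public static string Inner(string xml)
    {
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            reader.MoveToContent();
            return reader.ReadInnerXml();
        }
    }

    // Returns the first element including its start and end tags.
    public static string Outer(string xml)
    {
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            reader.MoveToContent();
            return reader.ReadOuterXml();
        }
    }
}
```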
Convert to CLR types
You can use the members of the XmlReader class (listed in the following table) to read XML data and return values as common language runtime (CLR) types instead of strings. These members enable you to get values in the representation that is most appropriate for your coding task without having to manually parse or convert string values.
The ReadElementContentAs methods can only be called on element node types. These methods cannot be used on elements that contain child elements or mixed content. When called, the XmlReader object reads the start tag, reads the element content, and then moves past the end element tag. Processing instructions and comments are ignored and entities are expanded.
The ReadContentAs methods read the text content at the current reader position, and if the XML data doesn't have any schema or data type information associated with it, convert the text content to the requested return type. Text, white space, significant white space and CDATA sections are concatenated. Comments and processing instructions are skipped, and entity references are automatically resolved.
The XmlReader class uses the rules defined by the W3C XML Schema Part 2: Datatypes recommendation.
Use this XmlReader method | To return this CLR type |
---|---|
ReadContentAsBoolean and ReadElementContentAsBoolean | Boolean |
ReadContentAsDateTime and ReadElementContentAsDateTime | DateTime |
ReadContentAsDouble and ReadElementContentAsDouble | Double |
ReadContentAsLong and ReadElementContentAsLong | Int64 |
ReadContentAsInt and ReadElementContentAsInt | Int32 |
ReadContentAsString and ReadElementContentAsString | String |
ReadContentAs and ReadElementContentAs | The type you specify with the returnType parameter |
ReadContentAsObject and ReadElementContentAsObject | The most appropriate type, as specified by the XmlReader.ValueType property. See Type Support in the System.Xml Classes for mapping information. |
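When no schema is involved, the typed methods convert the text content directly. The sketch below reads three sibling elements as an Int32, a Double, and a Boolean. The element names (quantity, price, inStock) and the class name TypedContentSketch are assumptions for this illustration, and the sketch expects the elements to appear in that order.

```csharp
using System.IO;
using System.Xml;

public static class TypedContentSketch
{
    public static (int Quantity, double Price, bool InStock) ReadItem(string xml)
    {
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            reader.ReadToFollowing("quantity");
            // Reads the start tag, the content, and the end tag.
            int quantity = reader.ReadElementContentAsInt();

            // The reader is now past </quantity>. MoveToContent skips any
            // white space and lands on the next element.
            reader.MoveToContent();
            double price = reader.ReadElementContentAsDouble();

            reader.MoveToContent();
            bool inStock = reader.ReadElementContentAsBoolean();

            return (quantity, price, inStock);
        }
    }
}
```

Because each ReadElementContentAs call moves past the end tag, the sketch uses MoveToContent rather than ReadToFollowing for the later elements; ReadToFollowing would start searching after the element the reader is already positioned on.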
If an element can't easily be converted to a CLR type because of its format, you can use a schema mapping to ensure a successful conversion. The following example uses an .xsd file to convert the hire-date element to the xs:date type, and then uses the ReadElementContentAsDateTime method to return the element as a DateTime object.
Input (hireDate.xml):
<employee xmlns="urn:empl-hire">
    <ID>12365</ID>
    <hire-date>2003-01-08</hire-date>
    <title>Accountant</title>
</employee>
Schema (hireDate.xsd):
<?xml version="1.0"?>
<xs:schema attributeFormDefault="unqualified" elementFormDefault="qualified" targetNamespace="urn:empl-hire" xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="employee">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="ID" type="xs:unsignedShort" />
        <xs:element name="hire-date" type="xs:date" />
        <xs:element name="title" type="xs:string" />
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
Code:
// Create a validating XmlReader object. The schema
// provides the necessary type information.
XmlReaderSettings settings = new XmlReaderSettings();
settings.ValidationType = ValidationType.Schema;
settings.Schemas.Add("urn:empl-hire", "hireDate.xsd");
using (XmlReader reader = XmlReader.Create("hireDate.xml", settings)) {
    // Move to the hire-date element.
    reader.MoveToContent();
    reader.ReadToDescendant("hire-date");
    // Return the hire-date as a DateTime object.
    DateTime hireDate = reader.ReadElementContentAsDateTime();
    Console.WriteLine("Six Month Review Date: {0}", hireDate.AddMonths(6));
}
' Create a validating XmlReader object. The schema
' provides the necessary type information.
Dim settings As XmlReaderSettings = New XmlReaderSettings()
settings.ValidationType = ValidationType.Schema
settings.Schemas.Add("urn:empl-hire", "hireDate.xsd")
Using reader As XmlReader = XmlReader.Create("hireDate.xml", settings)
    ' Move to the hire-date element.
    reader.MoveToContent()
    reader.ReadToDescendant("hire-date")
    ' Return the hire-date as a DateTime object.
    Dim hireDate As DateTime = reader.ReadElementContentAsDateTime()
    Console.WriteLine("Six Month Review Date: {0}", hireDate.AddMonths(6))
End Using
Output:
Six Month Review Date: 7/8/2003 12:00:00 AM
Asynchronous programming
Most of the XmlReader methods have asynchronous counterparts that have "Async" at the end of their method names. For example, the asynchronous equivalent of ReadContentAsObject is ReadContentAsObjectAsync.
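A minimal sketch of an asynchronous read loop follows. It assumes that the reader is created with the XmlReaderSettings.Async property set to true, which the asynchronous methods require; the names AsyncSketch and ReadElementNamesAsync are hypothetical.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Threading.Tasks;
using System.Xml;

public static class AsyncSketch
{
    // Collects the names of all elements in the stream without blocking
    // the calling thread while input is being read.
    public static async Task<List<string>> ReadElementNamesAsync(Stream input)
    {
        // Without Async = true, calling ReadAsync throws InvalidOperationException.
        var settings = new XmlReaderSettings { Async = true };
        var names = new List<string>();

        using (XmlReader reader = XmlReader.Create(input, settings))
        {
            while (await reader.ReadAsync())
            {
                if (reader.NodeType == XmlNodeType.Element)
                {
                    names.Add(reader.Name);
                }
            }
        }
        return names;
    }
}
```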
The following methods can be used with asynchronous method calls:
- GetAttribute
- MoveToAttribute
- MoveToFirstAttribute
- MoveToNextAttribute
- MoveToElement
- ReadAttributeValue
- ReadSubtree
- ResolveEntity
The following sections describe asynchronous usage for methods that don't have asynchronous counterparts.
ReadStartElement method
public static async Task ReadStartElementAsync(this XmlReader reader, string localname, string ns)
{
    if (await reader.MoveToContentAsync() != XmlNodeType.Element)
    {
        throw new InvalidOperationException(reader.NodeType.ToString() + " is an invalid XmlNodeType");
    }
    if ((reader.LocalName == localname) && (reader.NamespaceURI == ns))
    {
        await reader.ReadAsync();
    }
    else
    {
        throw new InvalidOperationException("localName or namespace doesn't match");
    }
}
<Extension()>
Public Async Function ReadStartElementAsync(reader As XmlReader, localname As String, ns As String) As Task
    If (Await reader.MoveToContentAsync() <> XmlNodeType.Element) Then
        Throw New InvalidOperationException(reader.NodeType.ToString() + " is an invalid XmlNodeType")
    End If
    If ((reader.LocalName = localname) AndAlso (reader.NamespaceURI = ns)) Then
        Await reader.ReadAsync()
    Else
        Throw New InvalidOperationException("localName or namespace doesn't match")
    End If
End Function
ReadEndElement method
public static async Task ReadEndElementAsync(this XmlReader reader)
{
    if (await reader.MoveToContentAsync() != XmlNodeType.EndElement)
    {
        throw new InvalidOperationException();
    }
    await reader.ReadAsync();
}
<Extension()>
Public Async Function ReadEndElementAsync(reader As XmlReader) As Task
    If (Await reader.MoveToContentAsync() <> XmlNodeType.EndElement) Then
        Throw New InvalidOperationException()
    End If
    Await reader.ReadAsync()
End Function
ReadToNextSibling method
public static async Task<bool> ReadToNextSiblingAsync(this XmlReader reader, string localName, string namespaceURI)
{
    if (localName == null || localName.Length == 0)
    {
        throw new ArgumentException("localName is empty or null");
    }
    if (namespaceURI == null)
    {
        throw new ArgumentNullException("namespaceURI");
    }
    // atomize local name and namespace
    localName = reader.NameTable.Add(localName);
    namespaceURI = reader.NameTable.Add(namespaceURI);
    // find the next sibling
    XmlNodeType nt;
    do
    {
        await reader.SkipAsync();
        if (reader.ReadState != ReadState.Interactive)
            break;
        nt = reader.NodeType;
        // reference comparison is sufficient because the strings are atomized
        if (nt == XmlNodeType.Element &&
            ((object)localName == (object)reader.LocalName) &&
            ((object)namespaceURI == (object)reader.NamespaceURI))
        {
            return true;
        }
    } while (nt != XmlNodeType.EndElement && !reader.EOF);
    return false;
}
<Extension()>
Public Async Function ReadToNextSiblingAsync(reader As XmlReader, localName As String, namespaceURI As String) As Task(Of Boolean)
    If (String.IsNullOrEmpty(localName)) Then
        Throw New ArgumentException("localName is empty or null")
    End If
    If (namespaceURI Is Nothing) Then
        Throw New ArgumentNullException("namespaceURI")
    End If
    ' atomize local name and namespace
    localName = reader.NameTable.Add(localName)
    namespaceURI = reader.NameTable.Add(namespaceURI)
    ' find the next sibling
    Dim nt As XmlNodeType
    Do
        Await reader.SkipAsync()
        If (reader.ReadState <> ReadState.Interactive) Then
            Exit Do
        End If
        nt = reader.NodeType
        ' reference comparison (Is) is sufficient because the strings are atomized
        If ((nt = XmlNodeType.Element) AndAlso
            (localName Is reader.LocalName) AndAlso
            (namespaceURI Is reader.NamespaceURI)) Then
            Return True
        End If
    Loop While (nt <> XmlNodeType.EndElement AndAlso (Not reader.EOF))
    Return False
End Function
ReadToFollowing method
public static async Task<bool> ReadToFollowingAsync(this XmlReader reader, string localName, string namespaceURI)
{
    if (localName == null || localName.Length == 0)
    {
        throw new ArgumentException("localName is empty or null");
    }
    if (namespaceURI == null)
    {
        throw new ArgumentNullException("namespaceURI");
    }
    // atomize local name and namespace
    localName = reader.NameTable.Add(localName);
    namespaceURI = reader.NameTable.Add(namespaceURI);
    // find element with that name
    while (await reader.ReadAsync())
    {
        if (reader.NodeType == XmlNodeType.Element && ((object)localName == (object)reader.LocalName) && ((object)namespaceURI == (object)reader.NamespaceURI))
        {
            return true;
        }
    }
    return false;
}
<Extension()>
Public Async Function ReadToFollowingAsync(reader As XmlReader, localName As String, namespaceURI As String) As Task(Of Boolean)
    If (String.IsNullOrEmpty(localName)) Then
        Throw New ArgumentException("localName is empty or null")
    End If
    If (namespaceURI Is Nothing) Then
        Throw New ArgumentNullException("namespaceURI")
    End If
    ' atomize local name and namespace
    localName = reader.NameTable.Add(localName)
    namespaceURI = reader.NameTable.Add(namespaceURI)
    ' find element with that name
    While (Await reader.ReadAsync())
        If ((reader.NodeType = XmlNodeType.Element) AndAlso
            (localName Is reader.LocalName) AndAlso
            (namespaceURI Is reader.NamespaceURI)) Then
            Return True
        End If
    End While
    Return False
End Function
ReadToDescendant method
public static async Task<bool> ReadToDescendantAsync(this XmlReader reader, string localName, string namespaceURI)
{
    if (localName == null || localName.Length == 0)
    {
        throw new ArgumentException("localName is empty or null");
    }
    if (namespaceURI == null)
    {
        throw new ArgumentNullException("namespaceURI");
    }
    // save the element or root depth
    int parentDepth = reader.Depth;
    if (reader.NodeType != XmlNodeType.Element)
    {
        // adjust the depth if we are on root node
        if (reader.ReadState == ReadState.Initial)
        {
            parentDepth--;
        }
        else
        {
            return false;
        }
    }
    else if (reader.IsEmptyElement)
    {
        return false;
    }
    // atomize local name and namespace
    localName = reader.NameTable.Add(localName);
    namespaceURI = reader.NameTable.Add(namespaceURI);
    // find the descendant
    while (await reader.ReadAsync() && reader.Depth > parentDepth)
    {
        if (reader.NodeType == XmlNodeType.Element && ((object)localName == (object)reader.LocalName) && ((object)namespaceURI == (object)reader.NamespaceURI))
        {
            return true;
        }
    }
    return false;
}
<Extension()>
Public Async Function ReadToDescendantAsync(reader As XmlReader, localName As String, namespaceURI As String) As Task(Of Boolean)
    If (String.IsNullOrEmpty(localName)) Then
        Throw New ArgumentException("localName is empty or null")
    End If
    If (namespaceURI Is Nothing) Then
        Throw New ArgumentNullException("namespaceURI")
    End If
    ' save the element or root depth
    Dim parentDepth As Integer = reader.Depth
    If (reader.NodeType <> XmlNodeType.Element) Then
        ' adjust the depth if we are on root node
        If (reader.ReadState = ReadState.Initial) Then
            parentDepth -= 1
        Else
            Return False
        End If
    ElseIf (reader.IsEmptyElement) Then
        Return False
    End If
    ' atomize local name and namespace
    localName = reader.NameTable.Add(localName)
    namespaceURI = reader.NameTable.Add(namespaceURI)
    ' find the descendant
    While (Await reader.ReadAsync() AndAlso reader.Depth > parentDepth)
        If (reader.NodeType = XmlNodeType.Element AndAlso
            (localName Is reader.LocalName) AndAlso
            (namespaceURI Is reader.NamespaceURI)) Then
            Return True
        End If
    End While
    Return False
End Function
Security considerations
Consider the following when working with the XmlReader class:
- Exceptions thrown from the XmlReader can disclose path information that you might not want bubbled up to your app. Your app must catch exceptions and process them appropriately.
- Do not enable DTD processing if you're concerned about denial of service issues or if you're dealing with untrusted sources. DTD processing is disabled by default for XmlReader objects created by the Create method.
- If you have DTD processing enabled, you can use the XmlSecureResolver to restrict the resources that the XmlReader can access. You can also design your app so that the XML processing is memory and time constrained. For example, you can configure time-out limits in your ASP.NET app.
- XML data can include references to external resources such as a schema file. By default, external resources are resolved by using an XmlUrlResolver object with no user credentials. You can secure this further by doing one of the following:
  - Restrict the resources that the XmlReader can access by setting the XmlReaderSettings.XmlResolver property to an XmlSecureResolver object.
  - Do not allow the XmlReader to open any external resources by setting the XmlReaderSettings.XmlResolver property to null.
- The ProcessInlineSchema and ProcessSchemaLocation validation flags of an XmlReaderSettings object aren't set by default. This helps to protect the XmlReader against schema-based attacks when it is processing XML data from an untrusted source. When these flags are set, the XmlResolver of the XmlReaderSettings object is used to resolve schema locations encountered in the instance document in the XmlReader. If the XmlResolver property is set to null, schema locations aren't resolved even if the ProcessInlineSchema and ProcessSchemaLocation validation flags are set. Schemas added during validation add new types and can change the validation outcome of the document being validated. As a result, external schemas should only be resolved from trusted sources.
- We recommend disabling the ProcessIdentityConstraints flag when validating untrusted, large XML documents in high availability scenarios against a schema that has identity constraints over a large part of the document. This flag is enabled by default.
- XML data can contain a large number of attributes, namespace declarations, nested elements and so on that require a substantial amount of time to process. To limit the size of the input that is sent to the XmlReader, you can:
  - Limit the size of the document by setting the MaxCharactersInDocument property.
  - Limit the number of characters that result from expanding entities by setting the MaxCharactersFromEntities property.
  - Create a custom IStream implementation for the XmlReader.
- The ReadValueChunk method can be used to handle large streams of data. This method reads a small number of characters at a time instead of allocating a single string for the whole value.
- When reading an XML document with a large number of unique local names, namespaces, or prefixes, a problem can occur. If you are using a class that derives from XmlReader, and you call the LocalName, Prefix, or NamespaceURI property for each item, the returned string is added to a NameTable. The collection held by the NameTable never decreases in size, creating a virtual memory leak of the string handles. One mitigation for this is to derive from the NameTable class and enforce a maximum size quota. (There is no way to prevent the use of a NameTable, or to switch the NameTable when it is full). Another mitigation is to avoid using the properties mentioned and instead use the MoveToAttribute method with the IsStartElement method where possible; those methods don't return strings and thus avoid the problem of overfilling the NameTable collection.
- XmlReaderSettings objects can contain sensitive information such as user credentials. An untrusted component could use the XmlReaderSettings object and its user credentials to create XmlReader objects to read data. Be careful when caching XmlReaderSettings objects, or when passing the XmlReaderSettings object from one component to another.
- Do not accept supporting components, such as NameTable, XmlNamespaceManager, and XmlResolver objects, from an untrusted source.
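Several of the considerations above can be expressed as reader settings. The following is a sketch only: the class name HardenedReaderSketch is hypothetical, and the two size limits are arbitrary example values that you would choose based on the documents your app expects.

```csharp
using System.IO;
using System.Xml;

public static class HardenedReaderSketch
{
    // Settings intended for XML that comes from an untrusted source.
    public static XmlReaderSettings CreateSettings()
    {
        return new XmlReaderSettings
        {
            DtdProcessing = DtdProcessing.Prohibit,   // reject any DTD
            XmlResolver = null,                       // never open external resources
            MaxCharactersInDocument = 1000000,        // example limit, not a recommendation
            MaxCharactersFromEntities = 10000         // example limit, not a recommendation
        };
    }

    // Reads the whole input and reports whether it was accepted.
    public static bool TryReadAll(string xml)
    {
        try
        {
            using (XmlReader reader = XmlReader.Create(new StringReader(xml), CreateSettings()))
            {
                while (reader.Read()) { }
            }
            return true;
        }
        catch (XmlException)
        {
            // Catch the exception here so that details such as paths
            // are not passed on to callers.
            return false;
        }
    }
}
```

With these settings, an input that contains a DOCTYPE declaration is rejected with an XmlException, and so is any document that exceeds the configured character limits.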