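Special Character Conversion When Writing XML Content
Updated: November 2007
XmlWriter includes the WriteRaw method, which lets you write out raw markup manually. This method prevents special characters from being escaped. In contrast, the WriteString method escapes some strings to their equivalent entity references. The characters that are escaped are specified in section 2.4, Character Data and Markup, of the XML 1.0 Recommendation, and in section 3.3.3, Attribute-Value Normalization, of the Extensible Markup Language (XML) 1.0 (Second Edition) Recommendation. If WriteString is called while an attribute value is being written, it escapes ' and ". Character values 0x0-0x1F are encoded as numeric character entities &#x0; through &#x1F;, except for the white space characters 0x9, 0xA, and 0xD.
The choice between WriteString and WriteRaw therefore comes down to this: use WriteString when every character must be checked and escaped to an entity where needed, and use WriteRaw when the content must be written exactly as specified.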
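The escaping described above can be sketched as follows. This is a minimal illustration, not part of the original sample: the class name EscapeDemo, the element name item, and the attribute name note are hypothetical. It assumes an XmlTextWriter with its default double-quote attribute delimiter.

```csharp
using System;
using System.IO;
using System.Xml;

static class EscapeDemo
{
    // Writes <item note="...">...</item>, passing both the attribute value
    // and the element text through WriteString so that they are escaped.
    public static string Write(string attributeValue, string text)
    {
        var output = new StringWriter();
        var w = new XmlTextWriter(output);
        w.WriteStartElement("item");
        w.WriteStartAttribute("note");
        w.WriteString(attributeValue);   // escaped for an attribute value
        w.WriteEndAttribute();
        w.WriteString(text);             // escaped for element content
        w.WriteEndElement();
        w.Flush();
        return output.ToString();
    }

    static void Main()
    {
        Console.WriteLine(Write("say \"hi\"", "a < b & c"));
        // prints: <item note="say &quot;hi&quot;">a &lt; b &amp; c</item>
    }
}
```

Replacing either WriteString call with WriteRaw would copy the text through unchanged, so the caller would be responsible for producing valid markup.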
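The WriteNode method copies everything from the node on which the reader is currently positioned to the writer. The reader then advances to the next sibling node for further processing. The WriteNode method lets you quickly extract information from one document into another.
The following table lists the NodeTypes supported by the WriteNode method.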
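| Node type | Description |
|---|---|
| Element | Writes out the element node and any attribute nodes. |
| Attribute | No operation. Use WriteStartAttribute or WriteAttributeString to write the attribute. |
| Text | Writes out the text node. |
| CDATA | Writes out the CDATA section node. |
| EntityReference | Writes out the entity reference node. |
| ProcessingInstruction | Writes out the processing instruction node. |
| Comment | Writes out the comment node. |
| DocumentType | Writes out the document type node. |
| Whitespace | Writes out the white space node. |
| SignificantWhitespace | Writes out the significant white space node. |
| EndElement | No operation. |
| EndEntity | No operation. |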
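The WriteNode behavior described above, copying the current node and then advancing the reader to the next sibling, can be sketched as follows. The class name WriteNodeDemo and the sample markup are hypothetical and chosen only for illustration.

```csharp
using System;
using System.IO;
using System.Xml;

static class WriteNodeDemo
{
    // Copies the first child of the document element to a writer and reports
    // the name of the node on which the reader is positioned afterwards.
    public static string CopyFirstChild(string xml, out string nextNodeName)
    {
        var reader = new XmlTextReader(new StringReader(xml));
        var output = new StringWriter();
        var writer = new XmlTextWriter(output);

        reader.MoveToContent();            // positions the reader on the document element
        reader.Read();                     // moves to its first child
        writer.WriteNode(reader, false);   // copies that element, its attributes, and its content
        nextNodeName = reader.Name;        // the reader is now on the next sibling

        writer.Flush();
        reader.Close();
        return output.ToString();
    }

    static void Main()
    {
        string next;
        string copied = CopyFirstChild("<root><a id=\"1\">x</a><b/></root>", out next);
        Console.WriteLine(copied);   // <a id="1">x</a>
        Console.WriteLine(next);     // b
    }
}
```

Because WriteNode already advances the reader, a loop that also calls Read after every WriteNode call moves past the node the reader landed on.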
The following examples show the difference between the WriteString and WriteRaw methods when they are given the "<" character. This code example uses WriteString.
' w is an XmlTextWriter that has already been created, for example:
' Dim w As New XmlTextWriter(Console.Out)
w.WriteStartElement("myRoot")
w.WriteString("<")
w.WriteEndElement()
// w is an XmlTextWriter that has already been created, for example:
// XmlTextWriter w = new XmlTextWriter(Console.Out);
w.WriteStartElement("myRoot");
w.WriteString("<");
w.WriteEndElement();
Output
<myRoot>&lt;</myRoot>
This code example uses WriteRaw, and the output contains an illegal character as element content.
w.WriteStartElement("myRoot")
w.WriteRaw("<")
w.WriteEndElement()
w.WriteStartElement("myRoot");
w.WriteRaw("<");
w.WriteEndElement();
Output
<myRoot><</myRoot>
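One way to see the practical difference between the two outputs is to feed each back into a parser. The following sketch assumes a hypothetical helper named IsWellFormed, which is not part of the original sample. It reads a string with XmlReader and reports whether parsing succeeds.

```csharp
using System;
using System.IO;
using System.Xml;

static class WellFormedDemo
{
    // Returns true if the string can be read to the end as XML,
    // false if the parser raises an XmlException.
    public static bool IsWellFormed(string xml)
    {
        try
        {
            using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
            {
                while (reader.Read()) { }
            }
            return true;
        }
        catch (XmlException)
        {
            return false;
        }
    }

    static void Main()
    {
        Console.WriteLine(IsWellFormed("<myRoot>&lt;</myRoot>"));  // True  (WriteString output)
        Console.WriteLine(IsWellFormed("<myRoot><</myRoot>"));     // False (WriteRaw output)
    }
}
```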
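The following example shows how to convert an XML document from an element-centric document to an attribute-centric document. You can also convert an attribute-centric document back to an element-centric one. An element-centric design means that the XML document has many elements and few attributes. An attribute-centric design has fewer elements, because what would be child elements in the element-centric design become attributes of their parent element. As a result, each element has fewer child elements and more attributes.
This example is useful if you have already designed your XML data in one of these modes, because it lets you convert the data to the other mode.
The following XML is an element-centric document. The elements contain no attributes.
Input - centric.xml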
<?xml version='1.0' encoding='UTF-8'?>
<root>
<Customer>
<firstname>Jerry</firstname>
<lastname>Larson</lastname>
<Order>
<OrderID>Ord-12345</OrderID>
<OrderDetail>
<Quantity>1301</Quantity>
<UnitPrice>$3000</UnitPrice>
<ProductName>Computer</ProductName>
</OrderDetail>
</Order>
</Customer>
</root>
The following sample application performs this conversion.
' The program converts an element-centric document to an
' attribute-centric document, or attribute-centric to element-centric.
Imports System
Imports System.Xml
Imports System.IO
Imports System.Text
Imports System.Collections
Class ModeConverter
Private bufferSize As Integer = 2048
Friend Class ElementNode
Private _name As [String]
Private _prefix As [String]
Private _namespace As [String]
Private _startElement As Boolean
Friend Sub New()
Me._name = Nothing
Me._prefix = Nothing
Me._namespace = Nothing
Me._startElement = False
End Sub 'New
Friend Sub New(prefix As [String], name As [String], [nameSpace] As [String])
Me._name = name
Me._prefix = prefix
Me._namespace = [nameSpace]
End Sub 'New
Public ReadOnly Property name() As [String]
Get
Return _name
End Get
End Property
Public ReadOnly Property prefix() As [String]
Get
Return _prefix
End Get
End Property
Public ReadOnly Property [nameSpace]() As [String]
Get
Return _namespace
End Get
End Property
Public Property startElement() As Boolean
Get
Return _startElement
End Get
Set
_startElement = value
End Set
End Property
End Class 'ElementNode
' Entry point. args excludes the program name: args(0) is the mode,
' args(1) is the source file, and args(2) is the target file.
Public Shared Sub Main(args() As [String])
Dim modeConverter As New ModeConverter()
' Check the argument count first so that args(0) is never read from an empty array.
If args.Length < 3 OrElse args(0) = "?" Then
modeConverter.Usage()
Return
End If
Dim sourceFile As New FileStream(args(1), FileMode.Open, FileAccess.Read, FileShare.Read)
Dim targetFile As New FileStream(args(2), FileMode.Create, FileAccess.ReadWrite, FileShare.ReadWrite)
If args(0) = "-a" Then
modeConverter.ConertToAttributeCentric(sourceFile, targetFile)
Else
modeConverter.ConertToElementCentric(sourceFile, targetFile)
End If
Return
End Sub 'Main
Public Sub Usage()
Console.WriteLine("? This help message " + ControlChars.Lf)
Console.WriteLine("Convert -mode sourceFile, targetFile " + ControlChars.Lf)
Console.WriteLine(ControlChars.Tab + " mode: e element centric" + ControlChars.Lf)
Console.WriteLine(ControlChars.Tab + " mode: a attribute centric" + ControlChars.Lf)
End Sub 'Usage
Public Sub ConertToAttributeCentric(sourceFile As FileStream, targetFile As FileStream)
' The stack tracks the elements that have been read but not yet closed.
Dim stack As New Stack()
Dim reader As New XmlTextReader(sourceFile)
reader.Read()
Dim writer As New XmlTextWriter(targetFile, reader.Encoding)
writer.Formatting = Formatting.Indented
Do
Select Case reader.NodeType
Case XmlNodeType.XmlDeclaration
writer.WriteStartDocument((Nothing = reader.GetAttribute("standalone") Or "yes" = reader.GetAttribute("standalone")))
Case XmlNodeType.Element
Dim element As New ElementNode(reader.Prefix, reader.LocalName, reader.NamespaceURI)
If 0 = stack.Count Then
writer.WriteStartElement(element.prefix, element.name, element.nameSpace)
element.startElement = True
End If
stack.Push(element)
Case XmlNodeType.Attribute
Throw New Exception("We should never be here!")
Case XmlNodeType.Text
' The element on top of the stack holds this text; it becomes an attribute of its parent.
Dim attribute As ElementNode = CType(stack.Pop(), ElementNode)
Dim element As ElementNode = CType(stack.Peek(), ElementNode)
If Not element.startElement Then
writer.WriteStartElement(element.prefix, element.name, element.nameSpace)
element.startElement = True
End If
writer.WriteStartAttribute(attribute.prefix, attribute.name, attribute.nameSpace)
' WriteString escapes special characters in the attribute value; WriteRaw would not.
writer.WriteString(reader.Value)
reader.Read() 'jump over the EndElement
Case XmlNodeType.EndElement
writer.WriteEndElement()
stack.Pop()
Case XmlNodeType.CDATA
writer.WriteCData(reader.Value)
Case XmlNodeType.Comment
writer.WriteComment(reader.Value)
Case XmlNodeType.ProcessingInstruction
writer.WriteProcessingInstruction(reader.Name, reader.Value)
Case XmlNodeType.EntityReference
writer.WriteEntityRef(reader.Name)
Case XmlNodeType.Whitespace
writer.WriteWhitespace(reader.Value)
Case XmlNodeType.None
writer.WriteRaw(reader.Value)
Case XmlNodeType.SignificantWhitespace
writer.WriteWhitespace(reader.Value)
Case XmlNodeType.DocumentType
writer.WriteDocType(reader.Name, reader.GetAttribute("PUBLIC"), reader.GetAttribute("SYSTEM"), reader.Value)
Case XmlNodeType.EndEntity
Case Else
Console.WriteLine("UNKNOWN Node Type = " & CInt(reader.NodeType).ToString())
End Select
Loop While reader.Read()
writer.WriteEndDocument()
reader.Close()
writer.Flush()
writer.Close()
End Sub 'ConertToAttributeCentric
' Use the WriteNode to simplify the process.
Public Sub ConertToElementCentric(sourceFile As FileStream, targetFile As FileStream)
Dim reader As New XmlTextReader(sourceFile)
reader.Read()
Dim writer As New XmlTextWriter(targetFile, reader.Encoding)
writer.Formatting = Formatting.Indented
Do
Select Case reader.NodeType
Case XmlNodeType.Element
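writer.WriteStartElement(reader.Prefix, reader.LocalName, reader.NamespaceURI)
If reader.MoveToFirstAttribute() Then
Do
' Each attribute becomes a child element; WriteString escapes its value.
writer.WriteStartElement(reader.Prefix, reader.LocalName, reader.NamespaceURI)
writer.WriteString(reader.Value)
writer.WriteEndElement()
Loop While reader.MoveToNextAttribute()
reader.MoveToElement()
End If
' An empty element such as <e a="1" /> has no EndElement node, so close it here.
If reader.IsEmptyElement Then
writer.WriteEndElement()
End If
Case XmlNodeType.Attribute
Throw New Exception("We should never be here!")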
Case XmlNodeType.Whitespace
writer.WriteWhitespace(reader.Value)
Case XmlNodeType.EndElement
writer.WriteEndElement()
Case XmlNodeType.Text
Throw New Exception("The input document is not an attribute-centric document" + ControlChars.Lf)
Case Else
Console.WriteLine(reader.NodeType)
writer.WriteNode(reader, False)
End Select
Loop While reader.Read()
reader.Close()
writer.Flush()
writer.Close()
End Sub 'ConertToElementCentric
End Class 'ModeConverter
// The program converts an element-centric document to an
// attribute-centric document, or attribute-centric to element-centric.
using System;
using System.Xml;
using System.IO;
using System.Text;
using System.Collections;
class ModeConverter {
private const int bufferSize=2048;
internal class ElementNode {
String _name;
String _prefix;
String _namespace;
bool _startElement;
internal ElementNode() {
this._name = null;
this._prefix = null;
this._namespace = null;
this._startElement = false;
}
internal ElementNode(String prefix, String name, String nameSpace) {
this._name = name;
this._prefix = prefix;
this._namespace = nameSpace;
}
public String name{
get { return _name; }
}
public String prefix{
get { return _prefix; }
}
public String nameSpace{
get { return _namespace; }
}
public bool startElement{
get { return _startElement; }
set { _startElement = value;}
}
}
public static void Main(String[] args) {
ModeConverter modeConverter = new ModeConverter();
// Check the argument count first: mode, source file, and target file are all required.
if (args.Length < 3 || args[0] == "?") {
modeConverter.Usage();
return;
}
FileStream sourceFile = new FileStream(args[1], FileMode.Open, FileAccess.Read, FileShare.Read);
FileStream targetFile = new FileStream(args[2], FileMode.Create, FileAccess.ReadWrite, FileShare.ReadWrite);
if (args[0] == "-a") {
modeConverter.ConertToAttributeCentric(sourceFile, targetFile);
} else {
modeConverter.ConertToElementCentric(sourceFile, targetFile);
}
return;
}
public void Usage() {
Console.WriteLine("? This help message \n");
Console.WriteLine("Convert -mode sourceFile, targetFile \n");
Console.WriteLine("\t mode: e element centric\n");
Console.WriteLine("\t mode: a attribute centric\n");
}
public void ConertToAttributeCentric(FileStream sourceFile, FileStream targetFile) {
// The stack tracks the elements that have been read but not yet closed.
Stack stack = new Stack();
XmlTextReader reader = new XmlTextReader(sourceFile);
reader.Read();
XmlTextWriter writer = new XmlTextWriter(targetFile, reader.Encoding);
writer.Formatting = Formatting.Indented;
do {
switch (reader.NodeType) {
case XmlNodeType.XmlDeclaration:
writer.WriteStartDocument(null == reader.GetAttribute("standalone") || "yes" == reader.GetAttribute("standalone"));
break;
case XmlNodeType.Element:
ElementNode element = new ElementNode(reader.Prefix, reader.LocalName, reader.NamespaceURI);
if (0 == stack.Count) {
writer.WriteStartElement(element.prefix, element.name, element.nameSpace);
element.startElement=true;
}
stack.Push(element);
break;
case XmlNodeType.Attribute:
throw new Exception("We should never be here!");
case XmlNodeType.Text:
// The element on top of the stack holds this text; it becomes an attribute of its parent.
ElementNode attribute = (ElementNode)stack.Pop();
element = (ElementNode)stack.Peek();
if (!element.startElement) {
writer.WriteStartElement(element.prefix, element.name, element.nameSpace);
element.startElement=true;
}
writer.WriteStartAttribute(attribute.prefix, attribute.name, attribute.nameSpace);
// WriteString escapes special characters in the attribute value; WriteRaw would not.
writer.WriteString(reader.Value);
reader.Read(); //jump over the EndElement
break;
case XmlNodeType.EndElement:
writer.WriteEndElement();
stack.Pop();
break;
case XmlNodeType.CDATA:
writer.WriteCData(reader.Value);
break;
case XmlNodeType.Comment:
writer.WriteComment(reader.Value);
break;
case XmlNodeType.ProcessingInstruction:
writer.WriteProcessingInstruction(reader.Name, reader.Value);
break;
case XmlNodeType.EntityReference:
writer.WriteEntityRef( reader.Name);
break;
case XmlNodeType.Whitespace:
writer.WriteWhitespace(reader.Value);
break;
case XmlNodeType.None:
writer.WriteRaw(reader.Value);
break;
case XmlNodeType.SignificantWhitespace:
writer.WriteWhitespace(reader.Value);
break;
case XmlNodeType.DocumentType:
writer.WriteDocType(reader.Name, reader.GetAttribute("PUBLIC"), reader.GetAttribute("SYSTEM"), reader.Value);
break;
case XmlNodeType.EndEntity:
break;
default:
Console.WriteLine("UNKNOWN Node Type = " + ((int)reader.NodeType));
break;
}
} while (reader.Read());
writer.WriteEndDocument();
reader.Close();
writer.Flush();
writer.Close();
}
// Use the WriteNode to simplify the process.
public void ConertToElementCentric(FileStream sourceFile, FileStream targetFile) {
XmlTextReader reader = new XmlTextReader(sourceFile);
reader.Read();
XmlTextWriter writer = new XmlTextWriter(targetFile, reader.Encoding);
writer.Formatting = Formatting.Indented;
do {
switch (reader.NodeType) {
case XmlNodeType.Element:
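writer.WriteStartElement(reader.Prefix, reader.LocalName, reader.NamespaceURI);
if (reader.MoveToFirstAttribute()) {
do {
// Each attribute becomes a child element; WriteString escapes its value.
writer.WriteStartElement(reader.Prefix, reader.LocalName, reader.NamespaceURI);
writer.WriteString(reader.Value);
writer.WriteEndElement();
} while(reader.MoveToNextAttribute());
reader.MoveToElement();
}
// An empty element such as <e a="1" /> has no EndElement node, so close it here.
if (reader.IsEmptyElement) {
writer.WriteEndElement();
}
break;
case XmlNodeType.Attribute:
throw new Exception("We should never be here!");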
case XmlNodeType.Whitespace:
writer.WriteWhitespace(reader.Value);
break;
case XmlNodeType.EndElement:
writer.WriteEndElement();
break;
case XmlNodeType.Text:
throw new Exception("The input document is not an attribute-centric document\n");
default:
Console.WriteLine(reader.NodeType);
writer.WriteNode(reader, false);
break;
}
} while (reader.Read());
reader.Close();
writer.Flush();
writer.Close();
}
}
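After compiling the code, run it from the command line by typing <compiled name> -a centric.xml <output file name>. The output file does not need to exist beforehand, because the program opens it with FileMode.Create, which creates the file or overwrites an existing one.
The following output assumes that the C# program is compiled as centric_cs and that the command line is C:\centric_cs -a centric.xml centric_out.xml.
The -a mode tells the application to convert the input XML to attribute-centric form, and the -e mode converts it to element-centric form. The following is the attribute-centric output produced with the -a mode. The elements now contain attributes instead of nested elements.
Output: centric_out.xml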
<?xml version="1.0" encoding="utf-8" standalone="yes" ?>
<root>
<Customer firstname="Jerry" lastname="Larson">
<Order OrderID="Ord-12345">
<OrderDetail Quantity="1301" UnitPrice="$3000" ProductName="Computer" />
</Order>
</Customer>
</root>