Streaming From Text Files to XML
Quite some time ago, I wrote a blog post on how you can stream text files as input into LINQ queries by writing an extension method that yields lines using the yield return statement.
You can then write a LINQ query that processes the text file in a lazy, deferred fashion. If you also use XStreamingElement (in the System.Xml.Linq namespace) to stream the output, you can create a transform from the text file to XML that uses a minimal amount of memory, regardless of the size of the source text file. You can transform a million records, and your working set will remain very small.
The following text file, People.txt, is the source for this example.
#This is a comment
1,Tai,Yee,Writer
2,Nikolay,Grachev,Programmer
3,David,Wright,Inventor
The following code contains an extension method that streams the lines of the text file in a deferred fashion.
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Xml.Linq;

public static class StreamReaderExtension
{
    // Yields the lines of the reader one at a time. Because this is an iterator,
    // nothing (including the null check) runs until enumeration begins.
    public static IEnumerable<string> Lines(this StreamReader source)
    {
        if (source == null)
            throw new ArgumentNullException("source");
        string line;
        while ((line = source.ReadLine()) != null)
            yield return line;
    }
}

class Program
{
    static void Main(string[] args)
    {
        using (StreamReader sr = new StreamReader("People.txt"))
        {
            // The query is not evaluated here; XStreamingElement holds on to it
            // and enumerates it only when the tree is serialized.
            XStreamingElement xmlTree = new XStreamingElement("Root",
                from line in sr.Lines()
                where !line.StartsWith("#")   // skip comment lines before splitting
                let items = line.Split(',')
                select new XElement("Person",
                    new XAttribute("ID", items[0]),
                    new XElement("First", items[1]),
                    new XElement("Last", items[2]),
                    new XElement("Occupation", items[3])
                )
            );
            Console.WriteLine(xmlTree);
        }
    }
}
This example produces the following output:
<Root>
  <Person ID="1">
    <First>Tai</First>
    <Last>Yee</Last>
    <Occupation>Writer</Occupation>
  </Person>
  <Person ID="2">
    <First>Nikolay</First>
    <Last>Grachev</Last>
    <Occupation>Programmer</Occupation>
  </Person>
  <Person ID="3">
    <First>David</First>
    <Last>Wright</Last>
    <Occupation>Inventor</Occupation>
  </Person>
</Root>
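One caveat with the example above: Console.WriteLine(xmlTree) first converts the whole tree to a single string, so for a very large input it may be better to write straight to a file. The following is a minimal sketch of that idea, not a definitive implementation. The class name StreamingTransform, the method name TransformToFile, and the output file name are made up for illustration. To keep the block compilable on its own, it swaps in File.ReadLines (added in .NET Framework 4, and also lazily enumerated) for the Lines extension method; the query is otherwise the same.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Xml.Linq;

public static class StreamingTransform
{
    // Hypothetical helper: streams a comma-separated text file to an XML file.
    // Each Person element is created, written, and released as the query is
    // enumerated, so neither the full input nor the full output is held in memory.
    public static void TransformToFile(string inputPath, string outputPath)
    {
        // File.ReadLines reads lazily, one line per iteration (assumed stand-in
        // for the StreamReader.Lines extension method from the post).
        IEnumerable<string> lines = File.ReadLines(inputPath);

        XStreamingElement xmlTree = new XStreamingElement("Root",
            from line in lines
            where !line.StartsWith("#")   // skip comment lines
            let items = line.Split(',')
            select new XElement("Person",
                new XAttribute("ID", items[0]),
                new XElement("First", items[1]),
                new XElement("Last", items[2]),
                new XElement("Occupation", items[3])
            )
        );

        // Save enumerates the query and writes elements as they are produced.
        xmlTree.Save(outputPath);
    }
}
```

Calling StreamingTransform.TransformToFile("People.txt", "People.xml") should write the same Root element as shown above to People.xml, normally preceded by an XML declaration.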
Comments
Anonymous
December 13, 2007
Can you translate this code into VB? This is useless unless you're a C# coder.
Anonymous
December 13, 2007
It is not very easy to translate this code into VB. It doesn't translate directly, as there is no yield return statement in VB. Instead, you have to write your own iterator, implementing the Current property and the Reset and MoveNext methods. For more information on using the yield return keyword, and an example of an iterator that is implemented without the yield return keyword, see: http://blogs.msdn.com/ericwhite/pages/The-Yield-Contextual-Keyword.aspx