TextReader based off C# Yield

Let's say you want to implement a text stream (via TextReader) over an arbitrary data store. For simplicity, suppose we want to instantiate the reader with an integer range and then get back a string that represents that range, like this:

<range>
<int>1</int>
<int>2</int>
<int>3</int>
</range>

The first hurdle is that a useful TextReader has to implement both Peek() and Read(), and Read() returns a single character. That's inconvenient because a single character does not map to the reader's overall state very well. It would be easier if the reader could just implement ReadLine() instead of Read(), because in this case it's more natural to give back an entire line than a single character.

I previously showed a helper class, ReadLineTextReader, that creates a complete TextReader around a single ReadLine() implementation instead of a Read() implementation.
Here's a text reader built on that helper. The pertinent part is the ReadLine() override:

    public class CounterXmlTextReader : ReadLineTextReader
    {
        int m_start;
        int m_current;
        int m_end;
        public CounterXmlTextReader(int start, int end)
        {
            m_start     = start;
            m_current   = start - 1;
            m_end       = end;            
        }

        // Override TextReader.ReadLine. Our base class, ReadLineTextReader, implements TextReader.Read()
        // using this ReadLine() function too.
        public override string ReadLine()
        {            
            int x = m_current++;
            if (x == m_start-1) return "<range>";
            else if (x <= m_end) return "<int>" + x + "</int>";
            else if (x == m_end+1) return "</range>";
            else return null; // eof            
        }
    }

(m_start, m_end) make up the range we're traversing, which is (1, 3) in this case. The m_current field holds the reader's current position, and ReadLine() has to work out from that one value which line comes next.
However, that implementation is kind of ugly and clearly would not scale well if you wanted to traverse a more complex data structure (like a tree). C#'s yield keyword can really help here. Imagine just saying this instead:

    // This is just like CounterXmlTextReader, but it is implemented with the "yield" keyword.
    public class RangeXmlTextReader : EnumeratorTextReader
    {
        int m_start, m_end;
        public RangeXmlTextReader(int start, int end)
        {
            m_start = start;
            m_end   = end;
        }
        protected override IEnumerator<string> GetEnumerator()
        {
            yield return "<range>";
            for (int i = m_start; i <= m_end; i++)            
                yield return "<int>" + i + "</int>";
            yield return "</range>";            
        }
    }

Now that's a nice improvement! And it's obvious how it would scale to much more complex structures.
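
To make the scaling claim concrete, here is a minimal sketch of the same idea applied to a tree. The Node type, the TreeXmlLines class, and the <node> element format are hypothetical names invented for this illustration; the sketch only produces the lines, which a GetEnumerator() override like the one above could then hand back one at a time.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical tree node, invented for this illustration.
public class Node
{
    public string Name;
    public List<Node> Children = new List<Node>();
    public Node(string name, params Node[] children)
    {
        Name = name;
        Children.AddRange(children);
    }
}

public static class TreeXmlLines
{
    // Yields one line per opening tag, closing tag, or leaf, indented by depth.
    // The recursion keeps the traversal position inside the iterators, so no
    // hand-written "current position" fields are needed.
    // Note: names are not XML-escaped; this is a sketch, not a serializer.
    public static IEnumerable<string> Lines(Node node, int depth)
    {
        string indent = new string(' ', depth * 2);
        if (node.Children.Count == 0)
        {
            yield return indent + "<node name=\"" + node.Name + "\" />";
            yield break;
        }
        yield return indent + "<node name=\"" + node.Name + "\">";
        foreach (Node child in node.Children)
            foreach (string line in Lines(child, depth + 1))
                yield return line;
        yield return indent + "</node>";
    }
}

public static class TreeDemo
{
    public static void Main()
    {
        Node root = new Node("root", new Node("a", new Node("a1")), new Node("b"));
        foreach (string line in TreeXmlLines.Lines(root, 0))
            Console.WriteLine(line);
    }
}
```

Writing the same traversal as a hand-rolled ReadLine() would mean keeping an explicit stack of nodes and child indexes. The nested foreach does re-yield each line once per level of depth, which is fine for a sketch but worth knowing about for very deep trees.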

Here's the sample code for the EnumeratorTextReader base class used above:

    // Helper class to build a TextReader around an IEnumerator<string>.
    // The derived class supplies a sequence of strings (via IEnumerator<string>), and this class
    // exposes each item in that sequence as one line through ReadLine().
    public abstract class EnumeratorTextReader : ReadLineTextReader
    {
        IEnumerator<string> m_source;
        public EnumeratorTextReader()
        {
            m_source = this.GetEnumerator();
        }
        // Derived class implements a string collection, which this class exposes as a TextReader.
        protected abstract IEnumerator<string> GetEnumerator();

        // Override TextReader.ReadLine. Our base class, ReadLineTextReader, implements TextReader.Read()
        // using this ReadLine() function too.
        public override string ReadLine()
        {
            if (!m_source.MoveNext()) return null;
            return (string)m_source.Current;            
        }
    }
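
One detail worth noting: the EnumeratorTextReader constructor calls GetEnumerator() before the derived constructor has assigned m_start and m_end. That is fine because a C# iterator method runs none of its body until the first MoveNext() call. Here is a minimal sketch of that behavior; the names DeferredDemo, Numbers, and Log are hypothetical and are not part of the classes above.

```csharp
using System;
using System.Collections.Generic;

public class DeferredDemo
{
    // Records the order of events so the deferral is observable.
    public static List<string> Log = new List<string>();

    int m_limit;   // assigned only after the enumerator has been created

    IEnumerator<string> Numbers()
    {
        Log.Add("body started");
        for (int i = 1; i <= m_limit; i++)
            yield return i.ToString();
    }

    public static List<string> Run()
    {
        Log.Clear();
        DeferredDemo demo = new DeferredDemo();
        IEnumerator<string> e = demo.Numbers();   // no body code runs here
        Log.Add("enumerator created");
        demo.m_limit = 2;                         // set after creation, before the first MoveNext()
        while (e.MoveNext())
            Log.Add("got " + e.Current);
        return Log;
    }

    public static void Main()
    {
        foreach (string entry in Run())
            Console.WriteLine(entry);
    }
}
```

The log shows "enumerator created" before "body started", and the loop sees the m_limit value that was assigned after the enumerator was obtained. The same ordering is what lets RangeXmlTextReader read m_start and m_end correctly.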

As a summary, here's a refresher on what the class hierarchy looks like now, and what each layer contributes:
    RangeXmlTextReader - most derived class. Provides functionality via enumerator and yield.
    EnumeratorTextReader - connects the derived enumerator to ReadLine()
    ReadLineTextReader - connects the derived ReadLine() to the base Read()
    TextReader   - base class that represents the interface we're implementing. Requires a Read() implementation.
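
The ReadLineTextReader layer itself is not reproduced in this post. The following is only a minimal sketch of how such a helper might work, under the assumption that it buffers the current line and appends a newline after each one. The names SketchReadLineTextReader and TwoLineReader and all of their fields are hypothetical, and the original helper may well differ.

```csharp
using System;
using System.IO;

// Hypothetical stand-in for ReadLineTextReader; not the original helper.
public abstract class SketchReadLineTextReader : TextReader
{
    string m_line;   // current line plus a trailing newline, or null if nothing is buffered
    int m_pos;       // index of the next character to hand out from m_line

    // Derived classes supply lines; null means end of stream.
    public abstract override string ReadLine();

    // Ensures m_line holds at least one unread character. Returns false at eof.
    bool Fill()
    {
        while (m_line == null || m_pos >= m_line.Length)
        {
            string next = ReadLine();
            if (next == null) return false;
            m_line = next + "\n";   // assumption: lines are joined with "\n"
            m_pos = 0;
        }
        return true;
    }

    public override int Peek()
    {
        return Fill() ? m_line[m_pos] : -1;
    }

    public override int Read()
    {
        return Fill() ? m_line[m_pos++] : -1;
    }
}

// Tiny concrete reader used only to exercise the sketch.
public class TwoLineReader : SketchReadLineTextReader
{
    int m_count;
    public override string ReadLine()
    {
        m_count++;
        return m_count == 1 ? "ab" : m_count == 2 ? "c" : null;
    }
}

public static class ReadLineSketchDemo
{
    public static void Main()
    {
        TextReader reader = new TwoLineReader();
        Console.Write(reader.ReadToEnd());
    }
}
```

One limitation of this sketch: a caller that reads part of a line with Read() and then calls ReadLine() directly gets the next whole line, and the unread remainder of the buffered line is only seen by later Read() calls. A fuller helper would need to reconcile the two.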

Comments

  • Anonymous
    August 08, 2005
    The yield statement in C# V2 has a lot of similarities with the yield keyword in Cw (streams in Comega). It wouldn't be surprising if the yield keyword in C# V2 and Cw were built on the same technical substrate (i.e. closure classes).
    In fact, I was able to verify this to a certain extent when studying streams in Cw (Comega).
    The writeup is available on www.omegaengine.com
  • Anonymous
    August 09, 2005
    Implementing an XmlReader is very difficult because there are over 25 abstract methods. Here's a simple way to change the problemspace to implement XmlReader with only 1 real method.