The Yield Contextual Keyword

Yield return is a more elegant way to implement the plumbing for iteration.  The yield keyword was introduced in C# 2.0, but my informal polling indicates that many developers don't yet understand it.  It's not hard, but it deserves some explanation.

Use of this construct is vital for LINQ.  Yield return allows one enumerable function to be implemented in terms of another, and it lets us write functions that return collections that exhibit lazy behavior.  This allows LINQ and LINQ to XML to delay execution of queries until the latest possible moment.  It also allows queries to be implemented in such a way that LINQ and LINQ to XML do not need to assemble massive intermediate results.  Without this avoidance of intermediate results, the system would rapidly become unwieldy and unworkable.
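To make the lazy behavior concrete, here is a minimal sketch.  The names LazyDemo, Numbers, and Squares are made up for illustration; they are not part of LINQ.  Squares is implemented in terms of Numbers, and even though Numbers describes an endless sequence, no item is produced until the caller asks for it.

```csharp
using System;
using System.Collections.Generic;

public static class LazyDemo
{
    // Hypothetical source: yields 1, 2, 3, ... without end.
    public static IEnumerable<int> Numbers()
    {
        int n = 1;
        while (true)
        {
            Console.WriteLine("producing {0}", n);
            yield return n++;
        }
    }

    // One enumerable function implemented in terms of another.
    public static IEnumerable<int> Squares(IEnumerable<int> source)
    {
        foreach (int n in source)
            yield return n * n;
    }

    public static void Main()
    {
        // Nothing is produced here; the sequence is only described.
        IEnumerable<int> squares = Squares(Numbers());

        foreach (int sq in squares)
        {
            Console.WriteLine("consumed {0}", sq);
            if (sq >= 9)
                break; // stop after 1, 4, 9; later items are never computed
        }
    }
}
```

Each "producing" line appears immediately before the matching "consumed" line, which shows that items flow through one at a time and that no intermediate collection of squares is ever built.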

The following two small programs demonstrate the difference in implementing a collection via the IEnumerable interface, and using yield return in an iterator block.

With this first example, you can see that there is a lot of plumbing that you have to write.  You have to write one class that implements IEnumerable, and another class that implements IEnumerator.  The GetEnumerator() method in MyListOfStrings returns an instance of the class that implements IEnumerator.  But the end result is that you can iterate through the collection using foreach.

using System;
using System.Collections;

public class MyListOfStrings : IEnumerable
{
    private string[] _strings;

    public MyListOfStrings(string[] sArray)
    {
        // Copy the caller's array so later changes to it don't affect this collection.
        _strings = new string[sArray.Length];

        for (int i = 0; i < sArray.Length; i++)
        {
            _strings[i] = sArray[i];
        }
    }

    public IEnumerator GetEnumerator()
    {
        return new StringEnum(_strings);
    }
}

public class StringEnum : IEnumerator
{
    private string[] _strings;

    // Enumerators are positioned before the first element
    // until the first MoveNext() call.
    int position = -1;

    public StringEnum(string[] list)
    {
        _strings = list;
    }

    public bool MoveNext()
    {
        position++;
        return (position < _strings.Length);
    }

    public void Reset()
    {
        position = -1;
    }

    public object Current
    {
        get
        {
            try
            {
                Console.WriteLine("about to return {0}", _strings[position]);
                return _strings[position];
            }
            catch (IndexOutOfRangeException)
            {
                // Current was read before the first or after the last element.
                throw new InvalidOperationException();
            }
        }
    }
}

class Program
{
    static void Main(string[] args)
    {
        string[] sa = new[] {
            "aaa",
            "bbb",
            "ccc"
        };

        MyListOfStrings p = new MyListOfStrings(sa);

        foreach (string s in p)
            Console.WriteLine(s);
    }
}

Using yield return, the functionally equivalent program is as follows.  This code is attached to this page:

using System;
using System.Collections.Generic;

class Program
{
    public static IEnumerable<string> MyListOfStrings(string[] sa)
    {
        foreach (var s in sa)
        {
            Console.WriteLine("about to yield return");
            yield return s;
        }
    }

    static void Main(string[] args)
    {
        string[] sa = new[] {
            "aaa",
            "bbb",
            "ccc"
        };

        foreach (string s in MyListOfStrings(sa))
            Console.WriteLine(s);
    }
}

As you can see, this is significantly easier.

This isn't as magic as it looks.  When you use the yield contextual keyword, the compiler automatically generates an enumerator class that keeps the current state of the iteration.  This class has four potential states: before, running, suspended, and after.  It has MoveNext and Reset methods and a Current property, although the generated Reset method simply throws NotSupportedException.  When you iterate through a collection that is implemented using yield return, you are moving from item to item in the enumerator using the MoveNext method.  The implementation of iterator blocks is fairly involved; a technical discussion can be found in the C# language specification.
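You can watch the generated enumerator at work by driving it by hand instead of using foreach.  The following is only a sketch; StateDemo and Letters are names invented for this example, and the comments map each step to the states described above.

```csharp
using System;
using System.Collections.Generic;

public static class StateDemo
{
    // Hypothetical iterator with two items and some tracing output.
    public static IEnumerable<string> Letters()
    {
        Console.WriteLine("body started");
        yield return "a";
        yield return "b";
        Console.WriteLine("body finished");
    }

    public static void Main()
    {
        // "before": creating the enumerator runs none of the iterator body.
        using (IEnumerator<string> e = Letters().GetEnumerator())
        {
            // Each MoveNext call runs the body up to the next yield return
            // ("running"), then pauses there ("suspended").
            while (e.MoveNext())
                Console.WriteLine(e.Current);

            // "after": the body has completed, so MoveNext keeps returning false.
            Console.WriteLine(e.MoveNext());

            try
            {
                e.Reset();
            }
            catch (NotSupportedException)
            {
                Console.WriteLine("Reset is not supported");
            }
        }
    }
}
```

Note that "body started" is not printed when Letters() is called, only when MoveNext is called for the first time.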

Yield return is very important when implementing our own query operators (which we will want to do sometimes).
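As an illustration, here is a sketch of a home-made query operator written as an extension method.  EveryNth is a hypothetical operator invented for this example, not something LINQ provides, and the sketch assumes n is at least 1 and skips argument validation.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MyOperators
{
    // Hypothetical operator: yields the 1st item, the (n+1)th item, and so on.
    // Assumes n >= 1; a real operator would validate its arguments.
    public static IEnumerable<T> EveryNth<T>(this IEnumerable<T> source, int n)
    {
        int index = 0;
        foreach (T item in source)
        {
            if (index % n == 0)
                yield return item;
            index++;
        }
    }
}

public static class OperatorDemo
{
    public static void Main()
    {
        string[] sa = { "aaa", "bbb", "ccc", "ddd", "eee" };

        // The custom operator composes with the standard ones and stays lazy.
        IEnumerable<string> query = sa.EveryNth(2).Select(s => s.ToUpper());

        foreach (string s in query)
            Console.WriteLine(s); // AAA, CCC, EEE
    }
}
```

Because EveryNth is an iterator block, it slots into a query just like Where or Select and pulls items from its source only as the consumer asks for them.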

There is no counterpart to the yield keyword in Visual Basic 9.0, so if you are implementing a query operator in Visual Basic 9.0, you must use the approach where you implement IEnumerable and IEnumerator.

One of the important design principles behind LINQ and LINQ to XML is that they should not break existing programs.  Adding a new keyword breaks any existing program that happens to use that word as an identifier.  Therefore, some keywords are added to the language as contextual keywords: when the word is encountered at specific places in the program it is interpreted as a keyword, and elsewhere it may be interpreted as an identifier.  Yield is one of these keywords.  When it appears immediately before a return or break keyword, the compiler treats it as a keyword and applies the new semantics.  If a program was written in C# 1.0 or 1.1 and contains an identifier named yield, that identifier continues to be parsed correctly, and the program is not made invalid by the language extensions.
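A small sketch shows both readings of the word side by side.  ContextualDemo, Total, and UpTo are names made up for this example.

```csharp
using System;
using System.Collections.Generic;

public static class ContextualDemo
{
    // Here "yield" is an ordinary local variable name,
    // just as it could have been in C# 1.x code.
    public static int Total(int perItem, int count)
    {
        int yield = perItem * count;
        return yield;
    }

    // Here "yield" comes immediately before return and break,
    // so the compiler treats it as a keyword.
    public static IEnumerable<int> UpTo(int limit)
    {
        for (int i = 1; ; i++)
        {
            if (i > limit)
                yield break;   // ends the sequence
            yield return i;    // produces the next item
        }
    }

    public static void Main()
    {
        Console.WriteLine(Total(3, 4));

        foreach (int i in UpTo(3))
            Console.WriteLine(i);
    }
}
```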

Yield.cs

Comments

  • Anonymous
    November 22, 2007
    Maybe I'm missing something, but I can do this without the yield keyword or any other IEnumerable code. For instance, the following code iterates, using a foreach, over an arbitrary class named banana. I am using Visual Studio Express 2008.

        banana[] b = new banana[3];
        b[0] = new banana(true);
        b[1] = new banana(false);
        b[2] = new banana(true);
        foreach (banana h in b)
            Console.WriteLine(h.isPeeled.ToString());

  • Anonymous
    December 10, 2007
    Brian, yield return is the means by which you create iterators. You can certainly write the code in an imperative style, but the point is, a declarative, FP style is faster to write and easier to maintain. But to write such code, you have to understand iterators, and moreover, how to write a function that is an iterator.

  • Anonymous
    August 10, 2008
    In the paragraph that starts with "This isn't as magic as it looks.", you state that the yield creates a new class that contains the Reset and MoveNext methods and a Current property. This is correct in the sense that the Reset method will throw an exception when called on an enumerator created by the compiler through the yield syntax. I just thought I'd mention this. Regards, Meile
