Iterator Blocks, Part One

There is a constant tension in language design between solving general problems and solving specific problems; finding the right spot on the general-to-specific spectrum can be quite tricky.

The design of iterator blocks yields (ha ha) a germane example. At almost every step along the way, there are opportunities for making choices that determine whether a more general or a more specific problem is being solved.

Let’s start with the high-level design of the feature. Iterator blocks are, as the name implies, all about making it easier to write code that iterates over some collection of items in a natural way. This is a pretty darn general scenario; there are lots of potential collections (stacks, queues, lists, dozens of different kinds of trees…) containing arbitrarily many different types of items, and lots of ways to iterate over them (in order, post order, pre order, filtered, projected, grouped, sorted…)

We could have made it much more general. Our iterator blocks can be seen as a weak kind of coroutine. We could have chosen to implement full coroutines and just made iterator blocks a special case of coroutines. And of course, coroutines are in turn less general than first-class continuations; we could have implemented continuations, implemented coroutines in terms of continuations, and iterators in terms of coroutines.

But we didn’t. We decided that the sweet spot for the high-level scenario-driven design of the feature was iterators over collections, and so we concentrated on that. Some adventurous people use the fact that iterators are “poor-man’s coroutines” as a shortcut to building coroutine-like systems that have only a tenuous connection to the semantics of iterating over the items in a collection, but those are the exception, not the rule. Their scenarios certainly did not drive the design of the feature.

We want to balance the generality of the feature against the high cost of that generality. Premature optimization is often cited as the root of all evil, but I don’t think that’s quite true. Premature generality is responsible for a lot of evil too! As we’ll see, a lot of the time when faced with a design decision we take the YAGNI position; we choose to implement a little bit less generality to get a large cost savings, with the assumption that hardly anyone would benefit from that spending.

Over our next few fabulous adventures in coding I’ll discuss some of the reasons for the seemingly odd restrictions on the generality of iterator blocks – things like, why can there be no unsafe code in iterator blocks? Why can an iterator block not take a ref or out parameter? Why can’t you put a yield in a finally? And so on.

Comments

  • Anonymous
    July 09, 2009
    It's exactly that kind of attitude that has made for a programming language that feels light years more productive for building a very large class of applications than so many of its ancestor languages.  People don't realize that there's so much software to be built and so much hardware to throw around that the weak link in the chain is human comprehension. It strikes me that a similar analysis could be made of C# generics vs C++ templates.

  • Anonymous
    July 09, 2009
    Let me guess :-) Iterator blocks cannot take ref or out parameters because those cannot be warped into fields inside a helper class which implements the iterator interface. You cannot put a yield in a finally because you cannot goto into finally. Or maybe because that would break the invariant of finally clause: statements after yield might never be executed. You cannot put unsafe code in the iterator block because then you could run into a situation where one part of code that e. g. allocates non-GC-ed memory is executed, and the other part that frees it is not executed, leading to memory leaks and other scary unsafe and unmanaged stuff. (sorry for my English, I am not quite native speaker)

  • Anonymous
    July 09, 2009
    A similar argument could have been made about IDisposable and using blocks, The big problem I see with this is that .Net now has 2 specific structures (foreach and using) which both could have been implemented as continuations. The restrictive nature of the implementation, kind of hurts you, you can not intercept and track exceptions in using blocks, you can not pass around foreach blocks (or track exceptions)

  • Anonymous
    July 09, 2009
    The comment has been removed

  • Anonymous
    July 10, 2009
    What's the likelihood of being able to yield inside the try of a try/catch block in a future version? Or is there some basic reason why that can't make sense. Aside from that I think iterator methods are fine as they are - close enough to general coroutines, and lacking the "spooky" aspects of general continuations. As for yield return in a finally block, that would be odd because finally blocks run if the iterator is abandoned via Dispose. Yielding more items wouldn't be the sanest response to that.

  • Anonymous
    July 11, 2009
    What's the likelihood of getting working coroutines in a future version of C# or CLR?  the problem with using iterator syntax has been demonstrated by you before:    IEnumerator InOrderTraverseTree() {        if (this.left != null)        foreach(object o in this.left.InOrderTraverseTree())  yield return o;        yield return this.value;        if (this.right != null)        foreach(object o in this.right.InOrderTraverseTree()) yield return o;    } The problem being that you zip up and down the co-routine's call stack on every iteration step.   Don't get me wrong, the iteration syntax captures a nice pattern.  But people have noticed that it doesn't get you as far as 'something else', though maybe they don't know the name of that 'something else' yet, having never worked in a language that has coroutines or first class continuations. I'm curious if you'd agree, but I've always thought of the thing missing from iterator syntax is the method call abstraction.  In a hypothetical co-routine environment, instead of 'yield return x', you would 'push(x)', where 'push' is some sort of callable thing -- delegate in C# parlance -- where you could pass that delegate to recursive calls.    // in hypothetical C#++    [Coroutine(IEnumerable)]    void InOrderTraverseTree(Action<object> push) {        if (this.left != null)            this.left.InOrderTraverseTree(push);        yield return this.value;        if (this.right != null)            this.right.InOrderTraverseTree(push);    } Slightly improved readibility, drastically improved asymptotic performance.   One alternate approach I've heard suggested is chainable iterators -- that 'yield return' could be passed an IEnumerable to fold in.  I'm curious which you'd prefer.

  • Anonymous
    July 11, 2009
    The comment has been removed

  • Anonymous
    August 25, 2009
    The comment has been removed