Iterator Blocks, Part Three: Why no yield in finally?

There are three scenarios in which code could be executing in a finally in an iterator block. In none of them is it a good idea to yield a value from inside the finally, so this is illegal across the board. The three scenarios are (1) normal cleanup, (2) exception cleanup, and (3) iterator disposal.

For the first scenario, suppose we have something like

try
{
    Setup();
    yield return M();
}
finally
{
    yield return N();
    Cleanup();
}

How should we transform this into an iterator state machine? Naively, we want to do something like:

switch (this.state)
{
    case 0: goto LABEL0;
    case 1: goto LABEL1;
    case 2: goto LABEL2;
    case 3: goto LABEL3;
}
LABEL0:
try
{
    Setup();
    this.current = M();
    this.state = 1;
    return true; // BUT DON'T RUN THE FINALLY!
    LABEL1:
}
finally
{
    this.current = N();
    this.state = 2;
    return true;
    LABEL2:
    Cleanup();
}
LABEL3:
this.state = 3;
return false;

There's an immediate problem with this: we both "goto" into a finally block and "return" out of one. Neither is legal.

Leaving a finally block via "return" is illegal because it is weird. Suppose the try block returns, causing the finally to execute. What happens when the finally block returns? The original return value would be lost and some other return value would be substituted for it. This seems bizarre and bug-prone, so it's not legal; you cannot exit a finally via return. And of course, you don't want the finally to run after the return in the try in this case!
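To make the ordering concrete, here is a minimal sketch of an ordinary (non-iterator) method that returns from inside a try. The class name ReturnThroughFinally and its Log list are hypothetical, introduced only for illustration. The return value is computed first, then the finally runs, and nothing the finally does to the local changes what the caller receives. Uncommenting the return inside the finally should be rejected by the C# compiler with error CS0157.

```csharp
using System;
using System.Collections.Generic;

public static class ReturnThroughFinally
{
    // Hypothetical trace so the order of events can be inspected.
    public static readonly List<string> Log = new List<string>();

    public static int Compute()
    {
        int x = 1;
        try
        {
            Log.Add("try: returning x");
            return x;            // the value 1 is captured here
        }
        finally
        {
            x = 2;               // too late to affect the value being returned
            Log.Add("finally: cleanup");
            // return x;         // would not compile: error CS0157
        }
    }

    public static void Main()
    {
        Console.WriteLine(Compute());               // prints 1
        Console.WriteLine(string.Join(", ", Log));  // prints try: returning x, finally: cleanup
    }
}
```

In a normal method this ordering is exactly what we want. In the naive state machine it is the opposite of what we want, because the "return true" in the try is a suspension, not the end of the protected region.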

Furthermore, in the CLR model of exception handling it is illegal to branch via a goto into a try block or its "handler" (that is, the catch or finally clause). Nor can you branch out of the handler. These special regions have special code that needs to be run when the region is entered and exited; you cannot skip either with a goto.

So we have no immediately obvious way to generate verifiable code for this scenario. Right off the bat, we have a huge number of points against this feature; we'd have to either convince the CLR guys to allow spaghetti code involving protected regions, or come up with some clever technique for generating this code.
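For contrast, here is a hand-written sketch of how the legal form, with the yield only in the try and nothing but Cleanup() in the finally, can be expressed without jumping into or returning out of a finally. This is an illustration under stated assumptions, not the code the C# compiler actually generates: the class name HandWrittenIterator, the Log list, the "suspended" flag, and the bodies of Setup, M and Cleanup are all hypothetical.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Stand-in for: try { Setup(); yield return M(); } finally { Cleanup(); }
public sealed class HandWrittenIterator : IEnumerator<int>
{
    public readonly List<string> Log = new List<string>(); // hypothetical trace
    private int state;    // 0 = not started, 1 = suspended inside the try, -1 = finished
    private int current;

    private void Setup() { Log.Add("Setup"); }
    private int M() { Log.Add("M"); return 42; }
    private void Cleanup() { Log.Add("Cleanup"); }

    public int Current { get { return current; } }
    object IEnumerator.Current { get { return current; } }

    public bool MoveNext()
    {
        switch (state)
        {
            case 0:
                bool suspended = false;
                try
                {
                    Setup();
                    current = M();
                    state = 1;
                    suspended = true;
                    return true;   // returning out of a try is legal
                }
                finally
                {
                    // Clean up only when an exception escaped,
                    // not when we are merely suspending at the yield.
                    if (!suspended) { state = -1; Cleanup(); }
                }
            case 1:
                // Resumed after the yield: the try body is done, so clean up.
                state = -1;
                Cleanup();
                return false;
            default:
                return false;
        }
    }

    public void Dispose()
    {
        // A finally is pending only while suspended inside the try.
        if (state == 1) { state = -1; Cleanup(); }
    }

    public void Reset() { throw new NotSupportedException(); }

    public static void Main()
    {
        HandWrittenIterator it = new HandWrittenIterator();
        it.MoveNext();
        Console.WriteLine(it.Current);                 // prints 42
        it.Dispose();
        Console.WriteLine(string.Join(", ", it.Log));  // prints Setup, M, Cleanup
    }
}
```

The sketch works only because the finally body is plain cleanup that can be run from more than one place. Once the finally itself contains a yield, there is no comparable spot to suspend and resume.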

And that's just the scenario where nothing is going wrong yet! Suppose a miracle happens and we manage to successfully generate code for scenario one. Now consider our second scenario, where M throws exception X. Remember, the finally block "catches" the exception and processes the cleanup code. If the cleanup code throws, then the original exception is discarded and processing of the new exception takes over. If the cleanup code succeeds, then the original exception continues to be "thrown" up the call stack, looking for more finally blocks or catch blocks.
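Both outcomes can be observed in ordinary code. The following is a small sketch; the class FinallyExceptions, its methods, and the exception messages "X" and "Y" are hypothetical names chosen for illustration. An outer catch reports which exception actually reaches the caller.

```csharp
using System;

public static class FinallyExceptions
{
    // Cleanup succeeds: the original exception keeps propagating.
    public static void CleanupSucceeds()
    {
        try { throw new InvalidOperationException("X"); }
        finally { /* cleanup that does not throw */ }
    }

    // Cleanup throws: the original exception is discarded.
    public static void CleanupThrows()
    {
        try { throw new InvalidOperationException("X"); }
        finally { throw new ArgumentException("Y"); }
    }

    // Runs the action and describes the exception the caller sees.
    public static string Observe(Action action)
    {
        try
        {
            action();
            return "no exception";
        }
        catch (Exception e)
        {
            return e.GetType().Name + ": " + e.Message;
        }
    }

    public static void Main()
    {
        Console.WriteLine(Observe(CleanupSucceeds)); // prints InvalidOperationException: X
        Console.WriteLine(Observe(CleanupThrows));   // prints ArgumentException: Y
    }
}
```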

Suppose the cleanup code does not throw. What should the control flow look like? The caller calls MoveNext for the first time. M() throws X. The finally block takes over. The finally block calls N(), and returns control to the caller with the results of N()!? What happened to X, the exception? Is it just waiting there, in limbo? When the second call to MoveNext happens, should the cleanup code get run then, and X suddenly pops back into existence and continues to be thrown up the stack? That makes no sense at all; the two calls to MoveNext could have had completely different stacks! This is crazy, plus we have no mechanism at all in the CLR for this kind of shenanigans.

Third, suppose we manage to solve all these problems. Now consider what happens when the caller calls MoveNext, M() succeeds, control returns to the caller, and the caller calls an early Dispose() on the enumerator. (I guess they wanted only one item.) We generate a Dispose method that checks the current state and executes any pending finally blocks. What the heck is the Dispose method supposed to do when it encounters the yield return in the pending finally block? We're not even in a call to MoveNext anymore! Should we call N() and ignore the result? Should we return from the Dispose() after the call to N(), or do the cleanup and then return? What exactly should the control flow do here? We're in a context where we might not even be iterating anymore, and yet we're still yielding.
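The early-disposal behaviour for a legal iterator block is easy to observe. Here is a minimal sketch; the class DisposeDemo, the Items iterator, and the Log list are hypothetical names used only for this illustration. The caller takes one item and then disposes the enumerator, and the pending finally runs during Dispose rather than during a call to MoveNext.

```csharp
using System;
using System.Collections.Generic;

public static class DisposeDemo
{
    // Hypothetical trace of what ran and when.
    public static readonly List<string> Log = new List<string>();

    public static IEnumerable<int> Items()
    {
        try
        {
            Log.Add("setup");
            yield return 1;
            yield return 2;   // never reached when the caller stops early
        }
        finally
        {
            Log.Add("cleanup");
        }
    }

    public static void TakeOne()
    {
        // The using statement calls Dispose when the block exits.
        using (IEnumerator<int> e = Items().GetEnumerator())
        {
            e.MoveNext();
            Log.Add("got " + e.Current);
        }
        Log.Add("after dispose");
    }

    public static void Main()
    {
        TakeOne();
        Console.WriteLine(string.Join(", ", Log));
        // prints setup, got 1, cleanup, after dispose
    }
}
```

Note that "cleanup" is logged with no MoveNext call in progress, which is precisely the context in which a yield return would have nowhere to deliver its value.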

It really doesn't make any logical sense to do a yield return in this scenario; we're possibly not in a position where the caller is expecting things to continue to be yielded to it.

So in short, we'd have to do at least two impossible things in order to enable a scenario that makes no sense in the first place. If ever a feature called out to be cut at the design phase, this is it. Therefore: no yields inside finally blocks. Thank goodness for that.

Next time: now that you know all that, figuring out "why no yields inside catch blocks" is pretty straightforward.

Comments

  • Anonymous
    July 16, 2009
    Whoa, what an amazing post! This kind of stuff really opens your eyes to the kind of problems you guys encounter when designing features. The scenarios you describe in your post sound like a real nightmare. I'd love to be a fly on the wall when you guys discuss these things in your design meetings. A podcast would be cool ;-)

    There's a point in the post where you mention having to convince the CLR guys to allow spaghetti code. Do you run into situations where you have new requirements of the CLR often? Or is this one of the many signs that a feature should be cut?

    We try to not drive features into the CLR, though it does happen occasionally. For example, the "Silverlight" CLR and the "desktop" CLR have slightly different security models for generated code, which makes dynamic code spitting via compilation of expression trees tricky in Silverlight; that language feature drove a change into the Silverlight CLR that was not anticipated in the design. Or, as we expose existing CLR features like generic variance, features that have not been heavily tested historically are suddenly in more widespread usage, which can trigger CLR changes to fix design or implementation warts. And sometimes a feature clearly requires CLR changes, like the "no PIA" feature. And it's a two-way street. Sometimes the CLR guys come up with a great new CLR feature and then ask us if we'd like to provide a "surface" for that feature in our language. -- Eric

    Thanks for the excellent series of posts Eric, I look forward to the rest.

  • Anonymous
    July 16, 2009
    Wow, I never really paid attention to the "why not" for yield returns in iterator blocks; typically all you find in books is a straight copy/paste from the language spec/MSDN about where a construct cannot be used legally.

An iterator member's signature cannot contain any out or ref parameters.

An iterator cannot contain a return statement.

An iterator may not contain any unsafe code.

A finally block may not contain a yield return or a yield break statement.

A yield return statement may not appear in a try block that has a catch block.

A yield return statement may not appear in a catch block (yield break is OK).

Thanks for providing a very meaningful explanation about the "why" and "why not". Just a side note: most of the material in your blog is something one can never find in books; it's always insightful and stimulating reading your stuff. I always end up walking away with an "aha". :)

  • Anonymous
    July 16, 2009
    Only "A yield return statement may not appear in a try block that has a catch block" left unexplained.

    That's part five. Be patient. -- Eric

  • Anonymous
    July 16, 2009
    The comment has been removed

  • Anonymous
    July 16, 2009
    // BUG?
    //
    // While examining yield in a try block that has a catch block,
    // I found that the generated iterator code leaks an implementation detail, namely:
    //   1) If the iterator has a finally block, Dispose() resets the current pointer.
    //   2) If the iterator has no finally block, Dispose() does not reset the current pointer.
    // I.e., the call site can reflect "Does the iterator block use a finally block?".
    //
    // Method TestIterator() intentionally does not use foreach
    // and does not use break or continue, but it is still perfectly legal.
    //
    // My expectation was that Dispose() would reset the current pointer in any case.
    //
    // Tested with 3.5.30729.1 and 4.0.20506.1 C# compilers.
    // csc -target:exe -out:TestDisposeIterator.exe -codepage:1257 Program.cs
    using System;
    using System.Collections.Generic;

    namespace TestDisposeIterator
    {
        public class Program
        {
            public static void Main(string[] args)
            {
                int sentry = 2;
                Program.TestIterator(Program.IteratorWithFinally, sentry);
                Program.TestIterator(Program.IteratorWithoutFinally, sentry);
            }

            private static void TestIterator<T>(IEnumerable<T> iterator, T sentry)
                where T : struct, IEquatable<T>
            {
                Console.WriteLine("TestIterator: {0}", iterator);
                using (IEnumerator<T> mover = iterator.GetEnumerator())
                {
                    while (mover.MoveNext())
                    {
                        Console.WriteLine("Current: {0}", mover.Current);
                        if (mover.Current.Equals(sentry))
                        {
                            mover.Dispose();  // Instead of break.
                        }
                    }
                }
            }

            private static IEnumerable<int> IteratorWithFinally
            {
                get
                {
                    try
                    {
                        yield return 1;
                        yield return 2;
                        yield return 3;
                    }
                    finally
                    {
                    }
                }
            }

            private static IEnumerable<int> IteratorWithoutFinally
            {
                get
                {
                    yield return 1;
                    yield return 2;
                    yield return 3;
                }
            }
        }
    }
    // TEST RESULTS:
    //
    // TestIterator: TestDisposeIterator.Program+<get_IteratorWithFinally>d__0
    // Current: 1
    // Current: 2
    // TestIterator: TestDisposeIterator.Program+<get_IteratorWithoutFinally>d__4
    // Current: 1
    // Current: 2
    // Current: 3

  • Anonymous
    July 17, 2009
    I think we should all yield to this explanation, and finally return to other aspects of the patterns involving iterators.... <duck & run>

  • Anonymous
    July 19, 2009
    "Well, hold on a minute before you dismiss it." - I didn't dismiss it! You did, after much forethought. I just gave my first-minute impression of this restriction. I never really understood continuations. I mean, I know exactly how they work. I've worked with and understood CPS style programming for a while. But I never mastered it, I haven't (yet, hopefully) had that "aha" moment. Maybe for that reason, even if yield were done using continuations, I'd still think of them as a return from a function. So it would still seem obvious to me that it shouldn't work - even if it were an implementation detail of an implementation that doesn't exist! That said, I'm still having a hard time coming up with a real-world enough example where someone would want to yield in a finally, and the code would still make sense.

  • Anonymous
    July 20, 2009
    It's information like this that leads me back to your blog every week. It's one thing to understand a language and a whole other thing to understand why the language is designed and works the way it does. Reading and understanding the information this series provides gives us users of C# a depth of knowledge that improves our coding skills. Thank you!

  • Anonymous
    August 05, 2009
    The comment has been removed

  • Anonymous
    October 26, 2009
    Eric, since you've (in all the various Iterator Blocks posts, this is just the most relevant) explained how a lot of features have messy interactions with yield return and these interactions have been purposefully forbidden by the C# spec, why is yield return allowed inside a lock block?  Could you do a post on lock-yield interactions? See also http://msmvps.com/blogs/jon_skeet/archive/2009/10/23/iterating-atomically.aspx and the comments.