Yield and usings - your Dispose may not be called!

We ran into an interesting bug recently where a resource was being leaked because we weren't disposing of an IDisposable in some cases. Before I go any further I should state that the root bug is that the IDisposable hanging onto the resource should have implemented a finalizer, or even better, used SafeHandle. However it wasn't our code.

We tracked down the bug to some code that looked like:

     static IEnumerable<Foo> MyEnumerable()
    {
        using (new MyDisposable())
        {
            // code that calls yield
        }
    }

Then, someone called MyEnumerable().GetEnumerator().MoveNext() without disposing it or wrapping it in a foreach. The result was that Dispose was never called on the disposable. I usually think of using guaranteeing that Dispose() gets called, except in extreme cases like thread abort exceptions. It sparked my curiosity to dig and figure out how this all works.

First, a couple of words about Enumerables, Enumerators and yield. An IEnumerator provides a simple way to iterate through a collection, using MoveNext() to advance to the next element, and a property Current to get the value at the current position. MoveNext() returns false at the end of the collection. An IEnumerable is simply an object you can call GetEnumerator() on to get an IEnumerator you can use to enumerate it (for example, collections are IEnumerables).

The yield keyword is a handy way to implement an IEnumerable that exposes the result of a series of calculations or operations in such a way that you can write serial code. I could spend a whole post on how cool yield is, but you can find that elsewhere. For a cool example, you can see the Fibonacci series implemented using yield. Note: yield is not a feature of the CLR, it's magical syntactic sugar provided by C# 2.0.

Let's take a simple example:

     class MyDisposable : IDisposable
    {
        public void Dispose()
        {
            Console.WriteLine("Disposed");
        }
    }

and the function:

     static IEnumerable<int> MyEnumerable()
    {
        using (new MyDisposable())
        {
            yield return 1;
            yield return 2;
        }
        Console.WriteLine("complete");
    }

If you walk the collection with the following code:

     IEnumerator<int> enumerator = MyEnumerable().GetEnumerator();
    while (enumerator.MoveNext())
    {
        Console.WriteLine(enumerator.Current);
    }

It will print "1", "2", "Disposed", "complete". But, how does it work? If you were to implement this without yield, you'd probably have to create a gnarly state machine, with some special cases for cleanup. Well, that's exactly what the C# compiler does for you!

Using .NET Reflector, we can see what code is generated. The MyEnumerable() function returns a new instance of a compiler generated class. Leaving out some of the details, the core of MoveNext() looks like:

     switch (this.<>1__state)
    {
        case 0:
            this.<>1__state = -1;
            this.<>7__wrap1 = new MyDisposable();
            this.<>1__state = 1;
            this.<>2__current = 1;
            this.<>1__state = 2;
            return true;

        case 2:
            this.<>1__state = 1;
            this.<>2__current = 2;
            this.<>1__state = 3;
            return true;

        case 3:
            this.<>1__state = 1;
            this.<>m__Finally2();
            Console.WriteLine("complete");
            break;
    }
    return false;

With:

     private void <>m__Finally2()
    {
        this.<>1__state = -1;
        if (this.<>7__wrap1 != null)
        {
            this.<>7__wrap1.Dispose();
        }
    }

So, if you walk through the states, you can see in case 0, it creates the MyDisposable, sets current to 1 and returns true. When it's called again in state 2, it sets current to 2 and returns true. When it's called a third time, it calls the finally function that disposes of the MyDisposable and then writes out "complete".

But, what happens in a partial enumeration? If you use foreach, like:

     foreach (int i in MyEnumerable())
    {
        Console.WriteLine(i);
        break;
    }

You might not expect it to call dispose. But, what it outputs is "1", "Disposed". The reason that this works is that the generated IEnumerator class also implements IDisposable where Dispose() calls <>m__Finally2(). And, foreach calls Dispose on its enumerator if it is an IDisposable.

However, in the case of our bug, we weren't using a foreach, we were just using MoveNext/Current directly without enumerating through the entire list. In this case, the MyDisposable is never disposed, even when GC takes place. Is this a bug? I've gone back in forth in my mind. On the one hand, the Enumerable code has a using, and that's supposed to clean things up! The compiler could have implemented a finalizer for the temporary class to clean up. On the other hand, the IEnumerator it generated was an IDisposable, which we failed to dispose. I tend to land on the side that this isn't a bug (and I bet the C# team agrees). If you have something that's holding onto a critical resource, you should implement a finalizer or use a SafeHandle and not rely on Dispose being called. However, I did find it a bit surprising. Now I know!

Comments

  • Anonymous
    March 14, 2008
    Let's say that the temporary class did implement a finalizer, what would it do?

  • Anonymous
    March 14, 2008
    PingBack from http://msdnrss.thecoderblogs.com/2008/03/15/yield-and-usings-your-dispose-may-not-be-called/

  • Anonymous
    March 15, 2008
    Hi Dan, Can you explain what the syntax "this.<>1__state"? I am not sure I saw the "<>" syntax before (apart from generics, of course). Greetings, Laurent

  • Anonymous
    March 15, 2008
    well its goes under category of unexpected behavior. Which is always a bad case for API. At least it should be well documented.

  • Anonymous
    March 15, 2008
    David - Good point. I was originally thinking finalize would call dispose, but finalize should only free unmanaged resources, and there aren't any here. Laurent - It's just the naming convention used by the compiler for the generated code. It's just the name of the field and doesn't mean anything special.

  • Anonymous
    March 26, 2008
    I think this is a bug. While it is always better to free IDisposable as early as possible, this shouldn't server as an excuse to not prevent a resource leak. According to MSDN, this is also the way IDisposable should be used: http://msdn2.microsoft.com/en-us/library/system.idisposable.aspx

  • Anonymous
    March 26, 2008
    Gregor, what exactly do you think is the bug and what would the fix be? If you are saying the bug was in the code that calls GetEnumerator and MoveNext without disposing the enumerator, I agree. If you are saying it's a bug in the way the Enumerator is generated, I don't think there's a bug there.

  • Anonymous
    April 02, 2008
    Great post, Dan! I agree that it's not a bug.  If the compiler-generated enumerator implements IDisposable, it's the developer's job to ensure that Dispose is invoked.  Whether that is accomplished by more compiler magic (i.e. foreach) or manually is up to the dev. Thanks, Josh

  • Anonymous
    April 02, 2008
    I recently had the same problem, we are using CF which doesn't support Linq to SQL (yes 3.5 does support Linq but no Linq to SQL :( ) So I implemented a simple one myself. The code went something like this: using (IDataReader reader = ...) { while (read.Read()) { yield return new ... } } The code was called and I used the easy to use Take(5) extension method on it (no expression trees either so can't generate simple SQL from it) to simulate a TOP 5. Exactly the same problem here, if there are more then 5 records the reader will never be disposed. So in the end we had to make sure the reader is always completely read :(

  • Anonymous
    April 03, 2008
    Interesting. That feels like a LINQ bug if they called GetEnumerator, but didn't dispose of the result.