Поделиться через


Closing over the loop variable considered harmful

(This is part one of a two-part series on the loop-variable-closure problem. Part two is here.)


UPDATE: We are taking the breaking change. In C# 5, the loop variable of a foreach will be logically inside the loop, and therefore closures will close over a fresh copy of the variable each time. The "for" loop will not be changed. We return you now to our original article.


I don't know why I haven't blogged about this one before; this is the single most common incorrect bug report we get. That is, someone thinks they have found a bug in the compiler, but in fact the compiler is correct and their code is wrong. That's a terrible situation for everyone; we very much wish to design a language which does not have "gotcha" features like this.

But I'm getting ahead of myself. What's the output of this fragment?

var values = new List<int>() { 100, 110, 120 };
var funcs = new List<Func<int>>();
foreach(var v in values)
funcs.Add( ()=>v );
foreach(var f in funcs)
Console.WriteLine(f());

Most people expect it to be 100 / 110 / 120.  It is in fact 120 / 120 / 120. Why?

Because ()=>v means "return the current value of variable v", not "return the value v was back when the delegate was created". Closures close over variables, not over values. And when the methods run, clearly the last value that was assigned to v was 120, so it still has that value.

This is very confusing. The correct way to write the code is:

foreach(var v in values)
{
var v2 = v;
funcs.Add( ()=>v2 );
}

Now what happens? Every time we re-start the loop body, we logically create a fresh new variable v2. Each closure is closed over a different v2, which is only assigned to once, so it always keeps the correct value.

Basically, the problem arises because we specify that the foreach loop is a syntactic sugar for

{
IEnumerator<int> e = ((IEnumerable<int>)values).GetEnumerator();
try
{
int m; // OUTSIDE THE ACTUAL LOOP
while(e.MoveNext())
{
m = (int)(int)e.Current;
funcs.Add(()=>m);
}
}
finally
{
if (e != null) ((IDisposable)e).Dispose();
}
}

If we specified that the expansion was

try
{
while(e.MoveNext())
{
int m; // INSIDE
m = (int)(int)e.Current;
funcs.Add(()=>m);
}

then the code would behave as expected.

It's compelling to consider fixing this for a hypothetical future version of C#, and I'd like to hear your feedback on whether we should do so or not. The reasons FOR making the change are clear; this is a big confusing "gotcha" that real people constantly run into, and LINQ, unfortunately, only makes it worse, because it is likely to increase the number of times a customer is going to use a closure in a loop. Also, it seems reasonable that the user of the foreach loop might think of there being a "fresh" loop variable every time, not just a fresh value in the same old variable. Since the foreach loop variable is not mutable by user code, this reinforces the idea that it is a succession of values, one per loop iteration, and not "really" the same variable over and over again. And finally, the change has no effect whatsoever on non-closure semantics. (In fact, in C# 1 the spec was not clear about whether the loop variable went inside or outside, since in a world without closures, it makes no difference.)

But that said, there are some very good reasons for not making this change.

The first reason is that obviously this would be a breaking change, and we hates them, my precious. Any developers who depend on this feature, who require the closed-over variable to contain the last value of the loop variable, would be broken. I can only hope that the number of such people is vanishingly small; this is a strange thing to depend on. Most of the time, people do not expect or depend on this behaviour.

Second, it makes the foreach syntax lexically inconsistent. Consider foreach(int x in M()) The header of the loop has two parts, a declaration int x and a collection expression, M(). The int x is to the left of the M(). Clearly the M() is not inside the body of the loop; that thing only executes once, before the loop starts. So why should something to the collection expression's left be inside the loop? This seems inconsistent with our general rule that stuff to the left logically "happens before" stuff to the right. The declaration is lexically NOT in the body of the loop, so why should we treat it as though it were?

Third, it would make the "foreach" semantics inconsistent with "for" semantics. We have this same problem in "for" blocks, but "for" blocks are much looser about what "the loop variable" is; there can be more than one variable declared in the for loop header, it can be incremented in odd ways, and it seems implausible that people would consider each iteration of the "for" loop to contain a fresh crop of variables. When you say for(int i; i < 10; i += 1) it seems dead obvious that the "i += 1" means "increment the loop variable" and that there is one loop variable for the whole loop, not a new fresh variable "i" every time through! We certainly would not make this proposed change apply to "for" loops.

And fourth, though this is a nasty gotcha, there is an easy workaround, and tools like ReSharper detect this pattern and suggest how to fix it. We could take a page from that playbook and simply issue a compiler warning on this pattern. (Though adding new warnings brings up a whole raft of issues of its own, which I might get into in another post.) Though this is vexing, it really doesn't bite that many people that hard, and it's not a big deal to fix, so why go to the trouble and expense of taking a breaking change for something with an easy fix?

Design is, of course, the art of compromise in the face of many competing principles. "Eliminate gotchas" in this case directly opposes other principles like "no breaking changes", and "be consistent with other language features". Any thoughts you have on pros or cons of us taking this breaking change in a hypothetical future version of C# would be greatly appreciated.

(This is part one of a two-part series on the loop-variable-closure problem. Part two is here.)

Comments

  • Anonymous
    November 11, 2009
    If you have a good understanding of lambdas and closures, this code does exactly what you would expect it to do (nothing like your foreach (var m in from m in data orderby m select m) from a few days ago). It seems like changing it would do far more harm than good.  I think you adequately summed up the arguments in favor of not changing it. In particular, if you use re-sharper (and pay attention to it) you'll catch this right away.  Perhaps you should just buy JetBrains and integrate their product?  :P

  • Anonymous
    November 11, 2009
    Please do NOT change this, it is 100% correct. If people do not understand "Closures close over variables, not over values" then it is a matter of training [I actually use a similar sample in my assessment screening for midlevel and senior developers] Remember the imortal words of Scott Meyers from nearly 20 years ago (has it really been that long?).."Do not Mollycoddle your users (programmers in this case)"

  • Anonymous
    November 12, 2009
    You should definitely do something about this.  I see it on StackOverflow all the time. I like the "Add a compiler warning" idea. I like it a lot, because it side-steps the breaking change and still throws something in front of the user to help avoid the mistake. As for the "lexical" argument, I think the stronger argument is that the declaration of the variable occurs outside of the { } block.  Consider: " { int x; } " versus " int x; { } ".  This code feels much more like the latter.  Since you declare the variable outside of the scope block, obviously it should be the same variable.  Changing it to something else would be deeply weird.

  • Anonymous
    November 12, 2009
    Initially I was all for making this change. I haven't ever seen a legitimate use of closing over the iteration variable (except as a demonstration of why you shouldn't). However, I hadn't thought about the consistency issue with for loops. Considering that, I would argue that consistency wins. Reducing the cognitive load of understanding a language is highly valuable, and having these constructs behave differently in this regard seems like a recipe for confusion. Definitely do make a warning for this though. Surely in upwards of 99% of cases closing over the iteration variable is not going to produce the result the programmer expects.

  • Anonymous
    November 12, 2009
    Please do not change this.  The behavior is consistant with the rest of the language.  I can't think of another case where declaring a variable actually declares a bunch of variables that appear to point to the same data but don't.

  • Anonymous
    November 12, 2009
    The comment has been removed

  • Anonymous
    November 12, 2009
    I'd say change. The strongest argument I can think of is that it is inconsistent with seemingly equivalent LINQ:   var funcs = new List<Func<int>>();   funcs.AddRange(from i in Enumerable.Range(0, 10) select(Func<int>)(() => i));   funcs.ForEach(f => Console.WriteLine(f())); This prints 0 to 9, as expected. It's also a counter-argument to this claim: > So why should something to the collection expression's left be inside the loop? This seems inconsistent with our general rule that stuff to the left logically "happens before" stuff to the right. The declaration is lexically NOT in the body of the loop, so why should we treat it as though it were? Because you already do just that in at least one place in the language. The other strong argument is that the existing behavior is deeply unintuitive. Vast majority of C# developers get it wrong the first time they use lambda inside a foreach (and many get it wrong repeatedly even after they already know how it actually works). This is a clear sign that it's not a good approach. Finally, the existing behavior is inconsistent with regard to pseudo-readonly status of the range variable. For normal readonly fields, I can expect that not only I cannot write to them, but noone else can, either (except during initialization). Foreach range variable is kinda weird - it's readonly to my code, but there's some other magic code there that can change it. IMO, a much more straightforward definition is as a true readonly local, which it can be if it's "auto-declared" inside the loop body. > Please do NOT change this, it is 100% correct. If people do not understand "Closures close over variables, not over values" then it is a matter of training People do understand "close over variables" just fine. They don't understand the lifetime of the range variable. > I can't think of another case where declaring a variable actually declares a bunch of variables that appear to point to the same data but don't. Uh, how about any time you explicitly declare a local inside the body of the loop?

  • Anonymous
    November 12, 2009
    The comment has been removed

  • Anonymous
    November 12, 2009
    I agree with Mike Van Til, if you understand how it all works, it only makes sense that it should work that way. I ran into this problem about a month ago, and once I discovered the source of the bug, I immediately realized what was going on and thought it made perfect sense. Although I use resharper with visual studio, at the time I was using monodevelop. If anything is done about this at all, I like the idea of a compiler warning.

  • Anonymous
    November 12, 2009
    The comment has been removed

  • Anonymous
    November 12, 2009
    @Eric, I think I poorly phrased a portion of my response.  That is what I get for hurriedly writing responses while at work! Anyway, I really intended to convey that if the foreach behaved as proposed, it would "feel" as if the foreach were closing over the values instead of the variables, which could lead one to expect similar behavior out of the sample program.  I certainly did not intend to suggest that you were proposing that lambdas would close over values.

  • Anonymous
    November 12, 2009
    @Pavel - [IMHO] programmers should realize that:      foreach(var x in y) should be "read" as:      for(var x assign each element from y)

  • Anonymous
    November 12, 2009
    The comment has been removed

  • Anonymous
    November 12, 2009
    > "programmers should realize that" Why should they? I understand what you probably mean - that someone coding in C# should know this particular thing about the language. But we aren't talking about knowing what's there, but rather about the "perfect" state of affairs, and then I think that it's best to stick to most intuitive interpretation of the keyword. It doesn't say "assign" anywhere, or "=", or anything else to imply assignment. In any case, the strongest argument is still the existing mismatch in lifetime of range variable between loop "foreach (var x in xs)" and LINQ "from x in xs". Everything else is just the lack of icing on the cake that isn't there.

  • Anonymous
    November 12, 2009
    In Javascript, I generally resort to a 'double closure' - this is particularly needed in JS as nested functions are the only way to create a new scope block. funcs.Add((function(v2){return function(){return v2};})(v)); translated to mostly equivalent [except i'd have to know the type of v, using int here] C# funcs.Add(((Func<int,Func<int>>)((v2) => () => v2))(v));

  • Anonymous
    November 12, 2009
    I have to agree with Pavel on all his responses, especially the one about the plain meaning of foreach.  It would be fantistic and best if foreach actually meant "for each item in the collection."  I thought Eric's second reason was particularly weak as the foreach statement already has the problem of the right side being evaluated before the left as pointed out in the last post about "foreach (var m in from m in data orderby m select m)". That being said, I believe Eric's reason number 1 "breaking change" out ways everything else.  The train has already left the stop, it's too late to get on board.

  • Anonymous
    November 12, 2009
    If the lifetime of the range variable in foreach was to be changed, then it would not only conflict with a simple "for" it would conflict with the meaning for foreach in other environments. IF consistancy was to be the prime driver [I do NOT believe it should be] then LINQ as the new-comer to the block would be the one to change [I DONT think it should be changed] Consider code:    for (var x=C.First(); x<=C.Last(); x=C.Next) {} and    foreach (var x in C) {} If these behaved differently many more [IMHO] people would be confused. ps: I DO expect people to study and learn a language over a period of time [months to years], I am not a believer in the idea of picking up a programming language in "21 days" [even though I have been the Technical Editor of such books <grin>]. In general I would expect a person to have about 3 months [ about 225 hours] of study/usage of C# before they ever see LINQ or Lambdas....

  • Anonymous
    November 12, 2009
    The comment has been removed

  • Anonymous
    November 12, 2009
    i don't believe it really does matter, i think in both of situations one should  (i mean not intuitively, but  by knowledge) know what he is using, and how it behaves. but... In the 'change' case you'll have allowed someone (most people by most of your sayings) code and it works fine, but they don't know what is under the hood...  and that's what i'll never appreciate. if you say how 'foreach' reads naturally,  (i dont want to be arguing readibility) than i remind its a formal language . if you stick on with "i suppose this work by this way"ers to change this,,, uueh , that s another thing.

  • Anonymous
    November 12, 2009
    IMHO the current behavior is fine, although a bit confusing the first time you realize it. However a compiler warning would be nice...

  • Anonymous
    November 12, 2009
    The foreach variable and the range variable in a from clause are conceptually nearly identical, so I think they should be as similar in behaviour as possible. Declaring the foreach variable inside the generated while loop would be a great step in that direction.

  • Anonymous
    November 12, 2009
    Have you considered the possibility that the people who submitted these bug reports: A) don't understand lambda expressions instead of B) don't understand foreach No, you're right. Usually its a failure to understand that lambdas close over variables, not values. But who cares? The point is that people are writing broken code because the language semantics do not match people's intuition in some way. My goal is not to write a language which punishes people for failing to understand the nuances of closure semantics; my goal is to write a language that naturally encourages writing correct code, even when the developer has a slightly incorrect mental model. Believe me, plenty of developers have incorrect mental models of how C# works in almost every respect. A language that can only be correctly used by experts isn't what we're after here. -- Eric

  • Anonymous
    November 12, 2009
    > Have you considered the possibility that the people who submitted these bug reports: > A) don't understand lambda expressions > instead of > B) don't understand foreach If people were simply misunderstanding lambdas, we'd see a similar amount of confusion when capturing for-loop counters, but it doesn't seem to be the case (at least judging by the number of SO questions). In fact, I've yet to see anyone being confused about for-loops in a similar case; it's invariably foreach only.

  • Anonymous
    November 12, 2009
    Pavel - I believe that is because most people who are using for-loop are not using lambdas and that this accounts for the skew. If you don't mind please contact me privately to discuss further... My contact info is in my MVP Profiile... https://mvp.support.microsoft.com/profile=9CAAC837-1258-4B5F-B6D0-7D9CD800EBBB

  • Anonymous
    November 12, 2009
    I'm strongly in favor of making this change. The main reason is not so much that the current behavior is unintuitive - which it is, even when you understand the reasons for it - it's that it's unUSEFUL. If there were two possible things you might want to do - even if one of them was more common than the other - the argument against breaking changes would clearly be a reason not to ever consider making the change, because you never want to break working code. But as it stands, the current behavior, while "consistent" in some senses (and I really don't buy most of the consistency arguments; the variable is readonly so it's inconsistent that it changes, too), is useful for nothing at all. C# had the good sense to not keep the C behavior where loop variables are still in scope after the loop is finished and holding their last value around - because it's confusing and not useful and leads to bugs. This is similar. I suggest that if you aren't going to actually change the scope of the loop variable in this case, I'd go one stronger than a warning and make attempts to close over a foreach loop variable actually an error. Someone who wants the weird current behavior for some reason can still get it: int sharedValue; foreach (var value in values) {  sharedValue = value;  doSomething(() => sharedValue); } Basically, making it an error forces the programmer to disambiguate. Making it work the "expected" way would be preferable, as far as I'm concerned, but if there's too much resistance to that, at least require the programmer to spell out what they want if they want something so deeply counterintuitive.

  • Anonymous
    November 12, 2009
    I would have to agree with not changing the current behavior. The current behavior also matches what happens when you attempt the same thing in javascript: var a = [1, 2,3,4] var funcs = [] for (var i in a) {     funcs.push(function() { return i; }); } for (var f in funcs) {   console.log( f() ); } will print out "4 4 4 4". Its helpful to think as lambdas and closures working on a semantic scope;  A method is a natural definition for semantic scope (I'm just kind of winging that definition; I am not a compiler guy), so in JS the preferred solution is to define another lambda that closes around what you need like this: for (var i in a) {     (function(x) {          funcs.push(function() { return x; });     })(i); } --OR-- for (var i in a) {    funcs.push(function(x) {       return function() { return x; };   }(i)); } Fiirst class functions are a beautiful thing. :-)

  • Anonymous
    November 12, 2009
    > The current behavior also matches what happens when you attempt the same thing in javascript ... a method is a natural definition for semantic scope Not in any C-derived language with the notable exception of JavaScript. Natural definition of scope was always block - {} - and never the entire function. In fact, it seems to be one of the most hated things about JS:    var x = 1;    if (...) {      var x = 2; // no error    }    // x is now 2 It's bad enough that they've added "let(x=1) {...}" specifically to support better scoping in JS1.7 (https://developer.mozilla.org/en/New_in_javascript_1.7#let_statement).

  • Anonymous
    November 12, 2009
    Avoid writing code like this at all, regardless of how C# handles it. But if you must write code like this to impress your friends, C# appears to be handling things correctly as is.

  • Anonymous
    November 12, 2009
    Parallel.ForEach in c# 4.0 behaves like the proposed change to foreach, not like the current implementation.  It also somewhat breaks the scoping for the loop variable. In fact, implementing my own Parallel.ForEach (because I'm impatient) was the first time I ran across this unexpected behavior.  The behavior so surprised my coworker that he bet me that wasn't it.  This seems like an argument for changing the behavior, however it's also the kind of thing that if you've seen it once, you'll learn your lesson and not repeat the mistake - and get a better understanding of closure semantics in the process - so I'd favor a warning over any kind of breaking behavior change.

  • Anonymous
    November 12, 2009
    And I think we should paint the bike shed cornflower blue!

  • Anonymous
    November 12, 2009
    Initially I wanted the warning, but now I think .. just change it. My reason has nothing to do with consistency or the backward change. Apparently iterating through a collection and making a bunch of lambdas is common.  (Otherwise the bug would not come up.)  Everyone seems to agree that what is wanted is not what happens.  If the syntax exists and the desired semantic is clear, just do what I want and be over it.  Why should I have to clutter up my code with an extra local variable just to show you that I want what you already know I wanted?   I like the semantic idea that a foreach loop runs once on each element of a collection, not on a variable that changes with each iteration.  This is very different from the semantic of a for loop in which you explicitly create and increment a single variable.  Consistency is not always a virtue -- why should dissimilar ideas be forced to be consistent? I have another question about backwards compatability.  I suspect that most of the code using this pattern right now is buggy code that the bugs haven't been found yet.  It is clear that backwards compatability requires correct code to continue tp work correctly -- does it require buggy code that happened to work by chance to continue to do so?

  • Anonymous
    November 12, 2009
    I vote for a compiler warning at least, or a breaking change to the language at most. This bug has biten some of our devs, who are by no means C# rookies, but at least mid-level devs. A compiler warnings seems like good fit to this problem.

  • Anonymous
    November 12, 2009
    @Stuart - FYI: C++ for loop variables existing past the scope of the loop has NEVER been part of the specification. It was a common bug in many compilers [including VC++ upto and including 6.0].

  • Anonymous
    November 12, 2009
    After reading the additional responses, I like Stuart Ballard's answer the best.  Instead of making a WARNING message or "fixing" the problem, "make attempts to close over a foreach loop variable actually an error."  This would make people fix buggy code and make a program "breaking change" into a compile time breaking change.  If the error is well documented it should get rid of all the "incorrect bug reports", force the fixing of a potential bug, and educate the developer at the same time.  Sounds like a WIN-WIN-WIN to me. By the way, great post and fantastic responses all around.

  • Anonymous
    November 12, 2009
    I'm with Pavel: the "each" is stronger. The natural expectation, to someone actually reading the code (as opposed to a commenter trying to be pedantic about the expansion), is that the item variable should be immutable and scoped to that loop iteration. I was shocked the first time ReSharper warned me about this (and it was right, I had exactly the bug it was warning me about). And when I tell ReSharper to use a different color to highlight mutable variables, every foreach variable gets highlighted. This is not reasonable behavior, especially as C# borrows more and more from functional languages where immutability is the norm. As to your arguments about the syntax suggesting that the variable declaration happens first... that just means the syntax is wrong. To steal from another language I'm familiar with, Ruby chose a syntax that doesn't suffer from that ambiguity: collection.each do |item|  ... end I.e., call the collection's "each" method and pass it a lambda. I know it's a little late to rework the syntax of C#'s foreach keyword, but if LINQ shipped an .Each or .ForEach(Action<T>) method, then C# could use the exact same pattern, with the same clarity: collection.ForEach(item => {  ... }); The extra parentheses around the lambda are yucky, but this would have the desired scoping; and the scoping would be clear at a glance, even to someone unfamiliar with the esoteric details of the expansion. If the performance overhead of the extra method call was a concern, the compiler could always look for this pattern and inline the same codegen as foreach does today (but with the right variable lifetime). I'm told that's what most Smalltalks do with their "if" syntax -- syntactically it looks a method call that takes two closures, but in real life it's optimized to use native branch instructions just like you'd expect in any other language. My vote is to add an .Each or .ForEach method to LINQ, and then add a compiler warning for this pattern. (If it were me, I'd make sure the text of the warning recommends that you use .Each instead for clarity. But that's just me.)

  • Anonymous
    November 12, 2009
    +1 on adding a compiler warning, but absolutely not changing the current behaviour.

  • Anonymous
    November 12, 2009
    I'm in favor of making the change. Here: foreach (int x in GetNumbers()) { ... } Is x assigned before GetNumbers is called? Of course not. So something on the left doesn't necessarily happen before something on the right. Now, if I were to exapnd it manually (forget error checking and dispose for now) I'd do: var iterator = GetNumbers().GetEnumerator(); while (iterator.MoveNext()) {    int x = iterator.Current;    ... } Most people I believe would do the same thing. I've commented asking why the compiler does this weirdly on your previous post, and your response was (I'm paraphrasing): It only makes a difference in closures, and even than it's for the worse. I say make the change. But maybe someone is using that oddity on purpose? I have thought of a theoretical reason to do that, although I'd do this completely differently: bool first = true; foreach (int x in GetNumbers()) { // returns numbers 1..Max    if (first) {        UpdateProgress(x);        first = false;    }    DoLongOperation(x); } void UpdateProgress(int x) {    while (x < Max) {        UpdateProgressBarInUI(x);        Thread.Sleep(500);    } }

  • Anonymous
    November 12, 2009
    The comment has been removed

  • Anonymous
    November 12, 2009
    There are situations were closing over a loop variable does not create 'unexpected' results in eighter the current or the proposed situation, like in: foreach(var item in list) {    Console.WriteLine(otherlist.where(i => i.name == item).Count()); } As long as the lamba is not executed outside of the current loop itteration there is not really a problem. I think this is a pretty common and valid scenarion that should not yield a warning. problem is ofcourse that for the compiler it will be harder to detect if the lamda 'surives' the loop body.

  • Anonymous
    November 12, 2009
    Can you just ask Anders? The answer seems clear, just wondering if C# losing its momentum in conversations abstracting out into account its core metaphors and principles. Just come back to English: 'for each apple x in the basket'. Each x, come on! I think the designers of C# should feel more contact with the ground, do not leak simplicity and original C# spirit in favour of obscure synthetic ideas.

  • Anonymous
    November 12, 2009
    Since both sides (language purism vs. practical application) have compelling arguments, why not allow for both scenarios? SomeType x; foreach(x in collection) { ... } foreach(SomeType y in collection) { ... } If you want the behavior that the iteration variable should be declared exactly once, then you have to pay the price of stating that intent explicitly and live with the fact that your variable is declared in the containing scope. If you want to iterate through a collection and have it "just work," then you use the second syntax, which would declare the variable in the loop body since you didn't take the extra step of declaring it yourself elsewhere.  This aligns well with the code you would have otherwise written by hand: for(int index = 0; index < upperBound; index++) {  SomeType y = collection.ItemAt(index);  // do stuff with y and never touch index or upperBound } using(IEnumerator e = collection.GetEnumerator()) {  while(e.MoveNext())  {    SomeType y = (SomeType)e.Current;    // do stuff with y and never touch e  } }

  • Anonymous
    November 12, 2009
    This might be opening an entirely new can of worms... but what if we added some sort of syntax to the foreach construct that specified whether the range variable is declared outside (like it is currently) or inside (like the proposed change) the body of the loop. Consider: var values = new List<int>() { 100, 110, 120 };
    var outerFuncs = new List<Func<int>>();
    var innerFuncs = new List<Func<int>>();
    foreach(var v in values)
    {
       // default behavior (variable declared outside loop body)
       outerFuncs.Add( ()=>v );
    } foreach(var v in values) inner
    {
       // new behavior (variable declared inside loop body)
       innerFuncs.Add( ()=>v );
    } foreach(var f in outerFuncs)
    {
       Console.WriteLine(f());
    } // 120 120 120 foreach(var f in innerFuncs)
    {
       Console.WriteLine(f());
    } // 100 110 120

  • Anonymous
    November 12, 2009
    Also note, that the current implementation DEMANDS use of variable names such as 'v1'. I am sure there's a lot of people who will argue 'v1' is an excellent and natural thing to have.

  • Anonymous
    November 12, 2009
    My immediate thought was to leave it as it, and avoid the breaking change, despite the fact that I view the variable declared in the foreach as immutable. However, I tested the following. var funcs = new List<int>() { 100, 110, 120 }.Select(v => () => v).ToList();
    foreach (var f in funcs)
    {
       Console.WriteLine(f());
    } And found that it prints 100, 110, 120. I don't know much about the internals of the Select extension method, but I would have thought it used something similar to foreach. I guess it uses enumerators somewhat more explicitly with a new variable declared for each item. Your code is the moral equivalent of: foreach(int x in intList) funcs.Add(MakeMeAFunc(x));
    ...
    Func<int> MakeMeAFunc(int y) { return ()=>y; } See why that works? Every time through the loop you create a new variable y by calling MakeMeAFunc. The closure is closed over the parameter, and the parameter is a fresh new variable on each invocation. (Imagine if it wasn't; recursive functions would be nigh-impossible!) -- Eric To be honest, knowing what happens with foreach, I really didn't know what to expect from the above code. I'm glad it did what I thought it should do though.

  • Anonymous
    November 12, 2009
    My preference would be to allow 'val' as an alternative to 'var' for an implicitly typed local value. When used in a foreach loop this would cause the delegates to capture the value of the iteration variable. Admittedly this is a much larger change.

  • Anonymous
    November 12, 2009
    "Closures close over variables, not over values".  It's not a matter of variables vs. values, it's a matter of scope.  Placing this text in bold will not help. IIRC, C# forbids the programmer from modifying this variable.  It's no wonder programmers are surprised by this behavior -- one moment the variable is immutable, and the next moment the language pulls the rug out from beneath you and rebinds the unrebindable variable. Perhaps you should put something in bold to that effect.

  • Anonymous
    November 12, 2009
    The comment has been removed

  • Anonymous
    November 12, 2009
    The comment has been removed

  • Anonymous
    November 12, 2009
    The comment has been removed

  • Anonymous
    November 12, 2009
    Duh, I’t like to clarify a point which was incorrect from my previous posting: Of course the lambda shouldn’t close over the value if a loop variable i because that’s precisely what lambdas don’t do. Rather, my point should have been that classical for loops are just a big problem when it comes to lambdas and their inevitable functional mindset. A variable which is “recycled” in this manner just doesn’t sit well with functional programming. It would be much cleaner if the conventional for loop actually did create a new variable and initialize it with the value of the old one – plus one. So, all things considered, the behaviour of the classical for loop probably shouldn’t be changed after all. Much better would be to deprecate the old-style for loop (which of course won’t happen) – Python demonstrates that one can live very well without one, even in an imperative language (foreach (var i in Enumerable.Range(1, 10) …).

  • Anonymous
    November 12, 2009
    My vote would be to make the change and to break consistency between for and foreach in this matter. In any case, I don't think for is really all that similar; for iterates "over" an indexer, and logically increments it all the time; whereas foreach hides the indexer (i.e., the enumerator) and shows you only the indexed value. In short, the index in a typicial for loop is not at all the same as the value in a typical foreach loop. Another reason to make the change is that the "correct" version currently looks like duplicated code - and correct code should look correct if at all possible.

  • Anonymous
    November 12, 2009
    Issue a warning in C#4, and change it in C#5.

  • Anonymous
    November 12, 2009
    ++ to compiler warning. I understand the logic fully but I've still accidentally done it myself.

  • Anonymous
    November 12, 2009
    Please don't change it! This forms the basis of an interview question I give to developers. If you change it I'd have to come up with a new question and I'm very lazy... Do you find that this is a good interview question? It seems like an odd corner case; I wouldn't expect most developers to know about this off the top of their heads, though I would expect them to be able to figure out what was going on eventually. -- Eric A warning along the lines of ReSharper's "access to modified closure" would be good though.

  • Anonymous
    November 12, 2009
    What about nonbreaking change: foreach (new var iten in sequence) { /* ... */ }


Excerpt from MSDN "from clause (C# Reference)" ... The range variable is like an iteration variable in a foreach statement except for one very important difference: a range variable never actually stores data from the source. It just a syntactic convenience that enables the query to describe what will occur when the query is executed. ... Therefore closures close not over range variables but over items in the sequence.

  • Anonymous
    November 12, 2009
    I would avoid a breaking change like this, because the behavior change would be so subtle that developers will spend days before understanding why their code suddenly stopped to work. Think about this, you have 3 types of developers who are using this scenario:
  1. I wrote my code, I rely on the fact that var v is always the same, I am happy
  2. I wrote my code, it works, by pure luck, because var v is always the same, I am happy
  3. I wrote my code, it doesn't work because var v is not new, I research/fix my code, I am happy (could be happier if foreach worked like I want, but still...) The breaking change would make developer 2 really, really, really unhappy :-) Have you considered adding a new syntax? Something like this, that would be translated in the alternative code with the declaration inside the loop: foreachnew(var v in M()) foreach(new v in M()) foreach(new var v in M())
  • Anonymous
    November 12, 2009
    The comment has been removed

  • Anonymous
    November 12, 2009
    I only disagree with the second statement about the lexical inconsistency that the change would cause in the “foreach” command. The terms of the foreach command are stronger that the “left to right precedence” concept. I agree with some of the comments above that says that the "each" and the "in" terms in the command implies that the scope of the variable should occur within the enumeration loop. The code below makes more sense if you consider that each m is in the enumeration: …      while(e.MoveNext())      {        int m; // INSIDE        m = (int)(int)e.Current;        …      } … It’s not a single variable. It’s several, each one occurring inside the loop. So I’m in favor of the change. For those who need to maintain the previous behavior I propose a new command: iterate({variable} through {collection}) that command would make clear that a single variable would iterate through the whole collection receiving it’s value.

  • Anonymous
    November 12, 2009
    Delegates Are Easy, Closure Requires Craftsmanship. If this is true, and I think it is, a compiler or tool warning leaves full control in the hands of your customers. I completely sympathize with the reality that the overwhelming bugged bug report is based on misunderstanding of closure. Warnings seem to be the best you can do in this situation. Perhaps mentioned earlier (no time to read every single comment), this "breaking change" would have to be propagated to other "foreach" iterators, no? For example rewriting the code above using List<int>.ForEach would have to work the new way as well. I have no fear that you guys would introduce the change properly - and I would learn the special case - but it seems like a lesser option than the warning approach.

  • Anonymous
    November 12, 2009
    this should be fixed but just cannot be. i guess 1000s of people depend on this without knowing.

  • Anonymous
    November 12, 2009
    I agree with Mike Van Til and the others stating the same thing.  Once you read it and think about what is going on, it makes perfect sense.  I would really prefer not to change it as that would break previous code, and make inconsistencies in how things function.  

  • Anonymous
    November 12, 2009
    I love the fact that you're looking at pain points and attempting to eliminate them.  I have found this gotcha in many other languages and usually people give the standard answer, "That's the way it is."  I think new language constructs can deal with this rather well.  For instance, I think the ForEach method is much clearer and does what the user expects.  For instance: values.ForEach(v => funcs.add( () => v)); funcs.ForEach(f => Console.WriteLine(f()));

  • Anonymous
    November 12, 2009
    The comment has been removed

  • Anonymous
    November 12, 2009
    This is a weak point, but I distinctly remember my thinking before I web searched to understand why this was acting the way it does. IMO, the debugger helps people believe (like me when I ran into this the first time) that there variable inside the foreach loop is new each time because it highlights the type on each iteration.  In foreach(var value in values)  { doSomething(value) } it highlights the "in" first, and the "var value" next helping me think that there was a new value variable declared on each iteration.  Had it just highlighted "value in values" instead I might have realized there was only one variable being assigned..

  • Anonymous
    November 12, 2009
    The comment has been removed

  • Anonymous
    November 12, 2009
    My vote: make the change. Which change: the concept of foreach from <changing a read-only variable> to <multiple variables with the same name> I seriously doubt there is any (working) code that depends on lambdas binding to a foreach variable. The only possible usage I can think of is to (repeatedly) bind to the variable and then have some break condition. Initially, I was opposed to the change ("closures close over variables, not values" is just one of the learning points of C#, and every language has them no matter how well-designed). However, I have to agree with the "each" wording argument. It just seems more natural to work the other way.

  • Anonymous
    November 12, 2009
    Make the compiler spit out a warning.  I see this way too often during code reviews.  It's too easy to miss.  Then anyone can compile with warnings-as-errors and easily catch them... though I like the idea of MS buying Jetbrains too. :)  ReSharper is a great product!

  • Anonymous
    November 12, 2009
    This more like a programmer issue than a compiler issue. I would rather vote for the warning from a compiler than changing it. Good experienced programmer would catch it, but again the warnings should be enough for someone to pay attention to and fix it. The only danger is that, most ignore the warnings and expect the result as it seems from the code. Still i think compiler warnings is the way to go for now. Please don't change the way it works. Thanks,

  • Anonymous
    November 12, 2009
    Don't change this. Javascript handles its closures the exact same way, and that is the proper way. Again, just to be clear, I am not proposing that we change how closures are handled. Closures close over variables, great. What I'm proposing is that we change the loop variable of the foreach loop to go logically insidethe body of the loop. -- Eric

  • Anonymous
    November 13, 2009
    After reading even more great comments, my opinion has changed yet again.  I definitely don't like the idea of a warning message.  After Frank Bakker's comment, I realized there would be too many valid examples of "closure over a foreach loop variable" where the closure never leaves the foreach loop.  As some others have suggested I think new C# context keywords are the best answer. I suggest 3 versions of foreach: foreach (var item in collection) -- error on closure of item (this is now deprecated old-style) foreach (var item inside collection) -- item is inside body and is instantiated once per loop (closure allowed) foreach (var item outside collection) -- item is outside body and is instantiated once (closure allowed) There would need to be an easy way to convert older "in" versions of foreach to a new (inside/outside) version with the default being inside.  However the older "in" version of foreach would still be accepted except when using closure.

  • Anonymous
    November 13, 2009
    I find it interesting that some people claim that "good understanding of lambdas and closures, this code does exactly what you would expect it to do." The folks who wrote the Common Lisp standard, who presumably understood both those things, still saw fit to include in the documentation of DOTIMES and related looping constructs: "It is implementation-dependent whether dotimes establishes a new binding of var on each iteration or whether it establishes a binding for var once at the beginning and then assigns it on any subsequent iterations." Which is of course the question you face. They chose not to choose because they wanted to leave implementations some latitude in how they compiled these constructs. But if you expect people to frequently close over the loop variable you might choose to choose: say whether it's one binding or one per iteration. But it doesn't seem to me that the choice is a logical consequence of the nature of closures.

  • Anonymous
    November 13, 2009
    Make it an error in C# 4.0 (and provide an explanation how to achieve both behaviors) Make it behave as expected in C# 5.0 Silently going from one behavior to another is bad. Doing it this way gives a hope that there wouldn't be any single person in existence who not only depends on weird behavior, but also  would manage to jump from exactly C# 3.0 over 4.0 to 5.0 or further. By the way, "int item; foreach (item in data) ..." is an error already. So much for for compatibility.

  • Anonymous
    November 13, 2009
    +1 for warning message (which could always be turned off with #pragma warning of course). it is clearly a problem but a breaking change is a more of a problem.

  • Anonymous
    November 13, 2009
    >No, you're right. Usually its a failure to understand that lambdas close over variables, not values. But who cares I would say that I care because you're proposing to fix a symptom instead of a problem which leaves you with a sort of patchwork of fixes that aren't worth the 100 points to get you back into positive territory and certainly not enough points to cancel out whatever the penalty is for introducing a breaking change.  There's also a "give a mouse a cookie" argument in here somewhere too.  Once you fix this, then something else will be the thing that causes confusion with lambdas and you'll need to fix THAT too because now things are even more inconsistant than they were when you started. That said, I cannot think of a scenario where you would rely on the "old" behavior so my backwards compatibly demagoguery may not apply.

  • Anonymous
    November 13, 2009
    Some food for thought: Imports System.Collections.Generic Module Program    Sub Main        Dim fs As New List(Of Func(Of Integer))        For Each x In Enumerable.Range(0, 10)            fs.Add(Function() x)        Next        For Each f In fs            Console.WriteLine(f())        Next    End Sub End Module C:Temptest.vb(8) : warning BC42324: Using the iteration variable in a lambda expression may have unexpected results.  Instead, create a local variable within the loop and assign it the value of the iteration variable.            fs.Add(Function() x)

  • Anonymous
    November 13, 2009
    Here's my broad suggestion: Leave the existing syntax as is, but add a warning in the same way VB does (yes, even though the lambda doesn't always outlive the iteration). Introduce a new syntax which unifies foreach and LINQ:    foreach (from x in xs) { ... } The expression inside () can be any LINQ comprehension that is allowed today, but without the final "select". All range variables that would be visible in "select" are visible in the body of foreach, and bound for every iteration (consistent with plain LINQ). So, for example, in:    foreach (from x in xs group x by x%3 into xg where xg.Count() > 1) { ... } the body of foreach would be able to reference xg, but not x. This has the added beneficial effect of simplifying the existing pattern of iterating directly over simple queries. You'd think you can at least write:   foreach (var x in from x in xs where x > 0 select x) { ... } but this won't work because two "x" will clash, so you need to come up with "x1" or other similarly descriptive name for one of them. Getting back to the original problem, it's also why I also propose to make a warning for lambda capturing a variable in old-style foreach(var) loop - any existing code that does it for legitimate reasons (e.g. because lambda doesn't escape) can be trivially rewritten to use foreach(from). Thoughts?

  • Anonymous
    November 13, 2009
    Yes, please change it. This problem caught me out. The 'each' in the keyword 'foreach' led me to believe I'd get a new variable for each element in the collection. It turns out that the name of that keyword is currently a lie. When I fixed the bug this problem caused I felt I had to put lots of comments next to the code to prevent any of my colleagues from trying to helpfully clean up the code by removing the unnecessary looking variable I had to declare to fix the problem. The only reason not to change is if a significant number of people can cite existing code bases that would break if the change was made. If people do have code that would break, then we should at least have a warning. Keep up the good work. Regards,   Scott

  • Anonymous
    November 13, 2009
    I would like a syntax that forces the evaluation of a variable at the time of the lamda's declaration.  It would save the necessity of creating dummy variables (as in your example) and have the secondary effect of allowing the compiler to produce better code.

  • Anonymous
    November 13, 2009
    The comment has been removed

  • Anonymous
    November 13, 2009
    The bugs caused by this common misunderstanding of closures are very easy to catch. I do agree with most of Pavel Minaev's points, but I just don't care strongly whether the foreach expansion is changed. I do feel strongly that something should be done to remedy the situation. For me, the biggest win that closures provide in C# is making my code clearer by eliminating noise like the v2 local var. If you do not change the foreach expansion, please add the warning and add a mechanism to avoid the inner variable declaration. It could be something like this: foreach(var v in values)
    {
       funcs.Add( ()=>valueof(v) );
    } In this case you could use:    values.ForEach( v => funcs.Add( () => v ) ); Why isn't .ForEach() part of IEnumerable?   See http://blogs.msdn.com/ericlippert/archive/2009/05/18/foreach-vs-foreach.aspx. -- Eric

  • Anonymous
    November 13, 2009
    I want to emphasize: it should be an error rather than a warning. Because for someone who made this error mistakenly, it's an error. And someone who wrote that code deliberately (for whatever unimaginable reason) better rewrite it as    int item_global;    foreach (int item in items) {        item_global = item;        /* do whatever you want now */    } Really, making it a warning makes no sense at all. Even people porting legacy code are better served with an error.

  • Anonymous
    November 13, 2009
    Change it so new developers don't get bitten. Or at least start giving warnings like VB seems to do according to Pavel Minaev [MSFT]

  • Anonymous
    November 13, 2009
    @TheCPUWizard -  In C++ the for loop variable did exist past the end of the block in most implementations up until just before the original C++ standard was approved in 1998 - several months after VC6 was released.  VC7.1 introduced the standard-compliant behavior as an option, with VC8 changing to the standard-compliant behavior as the default and the pre-standard behavior as the option.

  • Anonymous
    November 13, 2009
    For the record, after reading many of these reponses, I think the right thing to do would be to issue a warning in the nearest release possible and change the behavior in the release after that.   The current behavior, while consistent (and intuitive if you understand enough of the details) is UNuseful - it's likely that any code today that's relying on the current behavior is already broken.  

  • Anonymous
    November 13, 2009
    Perhaps because I come from a FP background, the behavior of the foreach variable surprised me. I have made this mistake several times! You never see the variable being mutated explicitly, and indeed you're not allowed to change it, so you don't realize you're capturing the variable. I think the behavior should be changed so that the variable was bound freshly for each loop iteration.

  • Anonymous
    November 13, 2009
    These weird semantics are the first thing I noticed when I first saw code samples for the then-new closures in C# 3.0.  They go against all of the functional programming semantics that I've ever experienced (OCaml, Haskell, F#, Scheme, ...) and I find it very unnatural.  Closing on a variable reference is a state-heavy concept in the first place, it's contrary to the aspirations of functional programming, and closures are meant to fulfill these aspirations. I think it's still time to change.

  • Anonymous
    November 13, 2009
    @Pavel - I love your foreach(from x in xs) solution for LINQ syntax, but not really for regular foreach syntax.  It just seems too much of a subtle difference in syntax that does not make the difference obvious.  When there are subtle differences in meaning that has caused a lot of problems in the past, I think it is best to be as blunt and obvious as possible in syntax.  That is why I suggested the 3 foreach syntaxes: foreach (var x in xs) -- error or warning (if people insisted) on closure of x foreach (var x inside xs) -- x is inside body and is instantiated once per loop (closure allowed) foreach (var x outside xs) -- x is outside body and is instantiated once (closure allowed) However, if your foreach(from) syntax was the solution chosen, I would not complain ... it would be much better than the current state.

  • Anonymous
    November 13, 2009
    Before reading Frank Bakker's comment about valid examples I thought a warning would be nice. But closing over the loop variable inside the loop is quite common, so either make the change or don't do anything (don't add a warning).

  • Anonymous
    November 13, 2009
    The comment has been removed

  • Anonymous
    November 13, 2009
    Another potentially interesting data point is that Python also mutates range variable (and its lambdas consequently capture it in the same way C# does that):    fs = []    for x in range(0, 3):        fs.append(lambda: x)    for f in fs:        print f() Output:    2    2    2 But so far Python is the only other language (apart from C# and VB) in which I can observe this. Furthermore, in Python, it is at least consistent with its list comprehensions:    fs = [(lambda: x) for x in range(0, 3)]    for f in fs:        print f() Also outputs:    2    2    2 And Python lets you change range variable in "for" (it will simply be overwritten on the next iteration), and doesn't scope it to the loop body:    for x in range(0, 3):        pass    print x Output:    2 So at least its behavior, while still (IMO) unintuitive, is at least fully self-consistent: a range variable really is just a repeatedly assigned variable, nothing more. There's nothing magical about it - it's not "semi-readonly", it's not specially scoped, etc.

  • Anonymous
    November 13, 2009
    I vote for a compiler warning, but I'm very interested in the issues around adding compiler warnings.

  • Anonymous
    November 13, 2009
    The comment has been removed

  • Anonymous
    November 13, 2009
    That's why I like resharper. Resharper kindly warns stupid developers like me about access to modified closure and offers refactorings like introducing a variable as your example states. Daniel

  • Anonymous
    November 14, 2009
    I vote for behaviour change in C#5. I vote for a warning in C#4. I think its too late to change the behaviour in C#4, but C#4 should definitely give a warning. This warning should be on by default in C#4 and C#5.  The warning could be off by default in C#6 and later. In C#5 the behaviour should be changed to the new, expected way. For those cases, where the old behaviour is still needed, I like the syntax Joe suggested above: SomeType item; foreach (item in collection) { .. } // note: no var before item, so the variable outside the loop is used. However, I have failed so far to find a situation that needs the old behaviour ... In my humble opinion, the outside expansion // item var OUTSIDE loop try {  int m;  while (e.MoveNext ()) {    m = (int) (int) e.Current;    funcs.Add (() => m);  } } is just bad code, because it conflicts with a rule I consider good: Declare/initialize/allcote a variable as late as possible with the smallest scope possible.   Proof that this rule is good: Pulling variable m out of the loop body for no apparent reason introduces weird and unexpected behaviour ... ;-P Therefore, I think the inside expansion // item var INSIDE loop try {  while (e.MoveNext ()) {    int m = (int) (int) e.Current;    funcs.Add (() => m);  } } is better code.  I think thats the code that the majority of prgrammers would wrote by hand if they had to replace foreach. To guesstimate the pain of this breaking change, it might make sense to feed all NET source plus codeplex and then something through a modified c#4 compiler version ..

  • Anonymous
    November 14, 2009
    I say to give the warning only where it would make a difference. That is, if you close over a loop variable and that closure escapes the loop, give a warning. In theory the only thing it should break is code that gets compiled with warnings as errors, and has closures over escaping loop variables.

  • Anonymous
    November 14, 2009
    > I say to give the warning only where it would make a difference. That is, if you close over a loop variable and that closure escapes the loop, give a warning. The problem, as pointed out earlier, is that a closure may not actually escape the loop, and there's no way the compiler can find out. Consider some hypothetical L2S code:    foreach (var customer in ctx.Customers) {        var orders = ctx.Orders.Where(o => o.Customer == customer).ToList();        ...      } Now we have a lambda that captures "customer". It doesn't really escape here, because the enumerator in which it is stored is immediately discarded, and cannot be referenced outside the loop. But how is the compiler supposed to know that? So it would have to give a warning for the above, perfectly reasonable, code, and also the equivalent LINQ comprehension.

  • Anonymous
    November 14, 2009

  • for compiler warning. Anyone who stumbles across this issue once, and realizes the mistake - won't do it again. It's not worth introducing breaking changes, and making the language inconsistent. The compiler warning should suffice for all matters.
  • Anonymous
    November 14, 2009
    I would prefer that the current foreach semantics not be changed, but rather add another use for the 'new' keyword such that - foreach (new Object in Collection)...
  • implements the altered semantics of relocating the declaration.
  • Anonymous
    November 14, 2009
    +1 for a compiler warning, I got caught by this pitfall once and gave the weirdest program behaviour ever. Once I found the cause I immediately knew the mistake I'd made, but it is such a subtle semantic one it was difficult to find when reading my code. I think a compiler warning for it is a must. It would have saved me a lot of time tracking down the issue.

  • Anonymous
    November 14, 2009
    I give a vote (+1) for the warning. I have struggled with this a couple of times but not enough to want you guys to change the way the compiler behaves.

  • Anonymous
    November 15, 2009
    Make the breaking change.  Some seem to think a compiler warning is "more acceptable" than a breaking change, but in reality it just gives many of the negatives of breaking changes without any of the benefits of the actual change.  Many companies set "warnings as errors" making it the equivalent of an error, without the benefit of being able to use the corrected behavior. This change would be very beneficial in terms of making the language more closely follow the "principle of least astonishment."  Some have said that once they "discover" the behavior, surely it makes perfect sense.  However, I think something cannot possibly make "perfect sense" yet have to be discovered after getting it wrong.  I think "natural" and "intuitive" should take precedence here. Finally, as Pavel has mentioned repeatedly, making the change would make the foreach construct more consistent with Linq's "from item in items".

  • Anonymous
    November 15, 2009
    I vote for making the change. I mostly code in VB so at least I get a warning but still, it's annoying and counter-intuitive to have to create a local variable for this. If you decide to not make the change, at the very least add the warning in C#. If there wasn't a warning for this in VB I would have been bitten by this several times.

  • Anonymous
    November 15, 2009
    The comment has been removed

  • Anonymous
    November 15, 2009
    m = (int)(int)e.Current; Why the double cast?

  • Anonymous
    November 15, 2009
    The comment has been removed

  • Anonymous
    November 15, 2009
    Nice analysis, this is a classic trap to fall into.

  • Anonymous
    November 15, 2009
    The way you described how the foreach variable should be scoped inside the foreach loop, seems the right way to be. But, I'm afraid this has allot of potential to introduce multi targeting bugs, code written with the new foreach rules will not work correctly when re targeted to older compilers (C# 2.0, C# 3.0). There are allot of libraries (we have 3 internal libraries for Desktop and CF version of .Net and 1 for Desktop and Silverlight) that are multi targeted for various versions of .Net framework (Desktop, Compact Framework or Silverlight). If this change is put in the new compiler there will be a new multi targeting gotcha, instead of a lambda over foreach variable gotcha. I don't think there is a real use for the way closures work right now with foreach, and that there is someone who depends on the feature, but I don't think that's the real issue, I think the real issue is multi targeting.

  • Anonymous
    November 16, 2009
    While we're on about counter intuative things with the foreach loop how about the fact that            List<Animal> AnimalList = new List<Animal>();
               AnimalList.Add(new Dog());
               AnimalList.Add(new Cat());
               foreach (Dog d in AnimalList)
                   d.Bark(); Produces an InvalidCastException instead of barking the dog but not the cat?  After all, I said for each dog in the animal list.  If I wanted to try to bark every animal I would've written            List<Animal> AnimalList = new List<Animal>();
               AnimalList.Add(new Dog());
               AnimalList.Add(new Cat());
               foreach (Animal a in AnimalList) The foreach loop was designed for a language without generics. In the C# 1 world, most of the time you're iterating over a sequence of objects that all happen to be of the same type, but all you know about them at compile time is that they're "object". Now, suppose you are iterating over what you think is a sequence of dogs, but a cat got in there by accident. Remember, we don't know that it's a list of animals, we know that its a sequence of objects and that's all. Is the right thing to do to ignore the error and keep on trucking, or to inform the developer that their assumptions are wrong, and this in fact is not a list of dogs? C# is not a "muddle on through and hope for the best" language. Rather, a design point of C# is that being brittle is better than being lax. A lax compiler hides your bugs, and we do not want to hide your bugs, we want to bring them to your attention. If what you want to do is iterate through only the members of a particular sequence that are of a particular type, we have an extension method for that:  foreach (Dog d in AnimalList.OfType<Dog>()) -- Eric

  • Anonymous
    November 16, 2009
    Yeah, you've got me there.  The two situations aren't really analogous at all.  I bow to your superior reasoning.  Although AnimalList.OfType<Dog> still requires extra knowledge about the problem, just like the proposed foreach(new dog d in AnimalList) syntax, I'd say that gets outweighed by the risk of compiling code that could be silently broken.

  • Anonymous
    November 16, 2009
    @P.Baughman - one way to avoid your problem is to always use var for the loop variable in foreach. This has the effect of neatly disabling the casting behaviour; then as Eric says you can also use OfType to filter the list.

  • Anonymous
    November 16, 2009
    @Pavel Thank you! That clears it up :)

  • Anonymous
    November 16, 2009
    My vote: change this! I agree mostly with Stuart Ballard and Pavel Minaev.

  • Anonymous
    November 16, 2009
    The comment has been removed

  • Anonymous
    November 16, 2009
    I think is better a warning because that's the purpose of a waring, it isn't?. Also it's important for backward compatibility.

  • Anonymous
    November 16, 2009
    The comment has been removed

  • Anonymous
    November 17, 2009
    Eamon: It's  unlikely that anybody knowingly relies on the current behavior, but it's quite likely that many people unknowningly rely on the behavior. You can easily write code to rely on any obscure behavior, and never realize it until the behavior changes.

  • Anonymous
    November 17, 2009
    I vote for the change. I don't think than the behavior of the foreach should match the behavior of the for because it is a higher-level language construct. It feels more natural to me that foreach(var x in y) is implemented as: for(int i = 0; i < y.Count; i++) {  const var x = y[i];  // your foreach body goes here } (actually it is the way I implement it once as a macro in C code). And I think this even can enable more agressive optimization of simple foreach body: use IEnumerable.Current value instead of assigning it first to a variable, for example in loop like this: int sum = 0; foreach(int x in y) sum += y;

  • Anonymous
    November 18, 2009
    > And I think this even can enable more agressive optimization of simple foreach body: use IEnumerable.Current value instead of assigning it first to a variable Why do you think it's an optimization?

  • Anonymous
    November 19, 2009
    From a Functional Programmer's perspective: make the change.  Wait, that's what Sebastian Good said!  Perhaps our minds have been warped by excessive exposure to Matthias Felleisen. Final comment: grep for the word "clearly". These (in general) represent points where the author is not convinced of his own argument. Pavel +1 Sebastian +1

  • Anonymous
    November 24, 2009
    The comment has been removed

  • Anonymous
    November 29, 2009
    The comment has been removed

  • Anonymous
    December 02, 2009
    According to the article, int acts as a reference type, not a variable type. Where do I make a mistake? I do not understand the question. Ints are of value type, not of reference type. I do not know what "variable type" means. Can you rephrase the question to explain what it is that you do not understand? -- Eric

  • Anonymous
    December 08, 2009
    According to the article, int acts as a reference type, not a variable type. Where do I make a mistake? [Rephrased]: In C# all instances of all classes are of reference types. All instances of structs are simple "variable" types. So if two references are pointing to one object and you will make some changes in this object via first reference, changes will also be reflected if you examine this object via second reference. This is of course not true for simple types, like int. Sorry for not being clear at the beginning. You seem to be confusing variables with value types. There's no such thing as a "variable type". A variable HAS a type, which can be a reference type or a value type. Either way, a variable is a storage location which can store a value or a reference, depending on its type. Closures close over variables, not values. You can of course change the value stored in a variable; that's what "variable" means: that the value can vary. -- Eric

  • Anonymous
    December 22, 2009
    C# Reference clearly states that each element in the loop should be treated as a separate entity: "The foreach statement repeats a group of embedded statements for each element in an array or an object collection. The foreach statement is used to iterate through the collection to get the desired information, but should not be used to change the contents of the collection to avoid unpredictable side effects." As was repeatedly stated above: The syntax "foreach(var x in Y)" basically conveys that you pick out objects "in" the bucket Y and name each object x - not the you place those Y objects in a separate container x. Current implementation is not conforming to the above and so should be fixed.

  • Anonymous
    January 12, 2010
    Issue a warning in C#4, and change it in C#5.

  • Anonymous
    January 22, 2010
    @Konrad: "I much prefer Java’s rule here: only allow closing over variables which are declared final. But then, this only works because Java allows local variables to be declared final, which C# doesn’t." The reason Java supports final variables is exactly to fix this problem.  The designer of inner classes was a Lisp refugee who had been bitten by the DOTIMES issue (see Peter Seibel above), and didn't want to allow the problem into Java. This particular barn door is hard to close after the mutable iteration variable has galloped away.

  • Anonymous
    February 12, 2010
    I vote for making the change. I think adding a warning or an error is the worst possible thing to do - I hate it when the message I get from a system is "you did this wrong, we know what you did wrong, we would rather tell you then do the right thing for you." - that's not the kind of system I enjoy using, be it programming language or business application. This one may go even further - "we tricked you into doing the wrong thing, we know what you did was wrong, but we'd rather tell you about it than do the right thing for you." All kinds of annoyance there. The VB error message is exactly that. See Alan Cooper for how error messages should be (mainly, don't have them).

  • Anonymous
    February 19, 2010
    Python allows both ways: >>> f = [] >>> for i in range(10): f.append(lambda : i) >>> f0 9 On the other hand: >>> f = [] >>> for i in range(10): f.append(lambda v = i: v) >>> f0 0 I've called this "v" for clarity. Of course, this also works: >>> f = [] >>> for i in range(10): f.append(lambda i = i: i) >>> f0 0

  • Anonymous
    April 13, 2011
    I say change it since few people used it knowingly. Other languages should follow suit too in the same VS release. The warning is too often overlooked by many peoples.

  • Anonymous
    August 19, 2011
    Thanks for the clear explanation!

  • Anonymous
    September 08, 2011
    Something has got to change.  Otherwise, the StackOverflow developers are going to have to create a new site called closeoverloopvariable.stackexchange.com dedicated specifically to this problem.  It has to be one of the most asked questions on SO.  I bet there are literally thousands of them lurking there already.

  • Anonymous
    September 08, 2011
    Yes, please, Eric, consider changing the scope of the foreach variable so that it isn't shared between iterations. C# is generally fantastic at doing just the right thing, without introducing ambiguity and without the developers even fully understanding just what really happens behind the scenes. The foreach variable capturing is an unfortunate example of where this doesn't happen.

  • Anonymous
    March 20, 2012
    I just saw it claimed elsewhere that this change would be made in C# 5, and I came here to see if it were true.  It is!  Hooray!  I shall bid a not-so-fond farewell to the bugs this used to lead to.

  • Anonymous
    April 19, 2012
    How do you solve this? //--- var actions = new List<Action>(); for ( int counter = 0; counter < 10; ++counter ) {    actions.Add ( () => Console.WriteLine(counter) ); } foreach (var action in actions) {    action(); } //--- The range-based-for (foreach) sugar is wrong, but you need a capture-by-value lambdas. Especially in a language that tries to support value semantics.