Udostępnij za pośrednictwem


“foreach” vs “ForEach”

A number of people have asked me why there is no Microsoft-provided “ForEach” sequence operator extension method. The List<T> class has such a method already of course, but there’s no reason why such a method could not be created as an extension method for all sequences. It’s practically a one-liner:

public static void ForEach<T>(this IEnumerable<T> sequence, Action<T> action)
{
  // argument null checking omitted
  foreach(T item in sequence) action(item);
}

My usual response to “why is feature X not implemented?” is that of course all features are unimplemented until someone designs, implements, tests, documents and ships the feature, and no one has yet spent the money to do so. And yes, though I have famously pointed out that even small features can have large costs, this one really is dead easy, obviously correct, easy to test, and easy to document. Cost is always a factor of course, but the costs for this one really are quite small.

Of course, that cuts the other way too. If it is so cheap and easy, then you can do it yourself if you need it. And really what matters is not the low cost, but rather the net benefit. As we’ll see, I think the benefits are also very small, and therefore the net benefit might in fact be negative. But we can go a bit deeper here. I am philosophically opposed to providing such a method, for two reasons.

The first reason is that doing so violates the functional programming principles that all the other sequence operators are based upon. Clearly the sole purpose of a call to this method is to cause side effects. The purpose of an expression is to compute a value, not to cause a side effect. The purpose of a statement is to cause a side effect. The call site of this thing would look an awful lot like an expression (though, admittedly, since the method is void-returning, the expression could only be used in a “statement expression” context.) It does not sit well with me to make the one and only sequence operator that is only useful for its side effects.

The second reason is that doing so adds zero new representational power to the language. Doing this lets you rewrite this perfectly clear code:

foreach(Foo foo in foos){ statement involving foo; }

into this code:

foos.ForEach((Foo foo)=>{ statement involving foo; });

which uses almost exactly the same characters in slightly different order. And yet the second version is harder to understand, harder to debug, and introduces closure semantics, thereby potentially changing object lifetimes in subtle ways.

When we provide two subtly different ways to do exactly the same thing, we produce confusion in the industry, we make it harder for people to read each other’s code, and so on. Sometimes the benefit added by having two different textual representations for one operation (like query expressions versus their underlying method call form, or + versus String.Concat) is so huge that it’s worth the potential confusion. But the compelling benefit of query expressions is their readability; this new form of “foreach” is certainly no more readable than the “normal” form and is arguably worse.

If you don’t agree with these philosophical objections and find practical value in this pattern, by all means, go ahead and write this trivial one-liner yourself.

UPDATE: The method in question has been removed from the "Metro" version of the base class library.

Comments

  • Anonymous
    May 18, 2009
    The comment has been removed

  • Anonymous
    May 18, 2009
    The comment has been removed

  • Anonymous
    May 18, 2009
    On the other hand, AsParallel().ForEach is extremely useful :)

  • Anonymous
    May 18, 2009
    Are there any performance differences between foreach and .ForEach? Hussein

  • Anonymous
    May 18, 2009
    Hussein: Of course. In ForEach there is the cost of an extra delegate call for each item. The order is the same - O(n), but the net cost of the ForEach is always higher than that of foreach. That cost is very small though - on almost all cases it would be insignificant compared to the action being called. Also, and this is purely hypothetical, foreach can allow for some compiler optimizations that ForEach would not allow, so the difference can be a bit bigger. That said, micro optimizations here are useless - the idea of ForEach in my mind is simply wrong for normal scenarios. For the example Steve Cooper gave though, I find that it is usually a much needed 'Expression'.

  • Anonymous
    May 18, 2009
    List<T>.ForEach is actually not the same as foreach-statement internally (you can check that with Reflector). Instead, it's a plain for-statement over list indices. The difference? When you use List<T>.Enumerator (as foreach-statement does), it performs a check to see if collection is modified on every iteration on the loop. A plain indexing for-statement won't do that, so it is slightly faster. In practice, though, the overhead of a virtual call via a delegate is higher than that of version checking.

  • Anonymous
    May 18, 2009
    This might be the first extension method I wrote. I don't think the closure piece is important; all extension methods have that. As for side effects, that it's void should make it pretty clear what it's doing. In general, if basically everyone wants it, it should just be part of the platform, for the same reason that so many other friendly, generally useful methods are.

  • Anonymous
    May 18, 2009
    Excellent post. As my response here suggests, I agree with you completely on the extension method issue: http://stackoverflow.com/questions/317874/existing-linq-extension-method-similar-to-parallel-for/318493#318493

  • Anonymous
    May 18, 2009
    Nice, I asked this question on Charlie Calvert's blog a while back. In between times I worked out an answer for myself, which was that the ForEach implicitly 'finalises' (I don't know what the correct term is?) the IEnumerable, and so breaks the functional model. I have preferred this solution, for logging/debug purposes (the name is clearer too):        public static IEnumerable<T> SideEffects<T>(this IEnumerable<T> items, Action<T> pPerfomAction)        {            foreach (T item in items)            {                pPerfomAction(item);                yield return item;            }        }

  • Anonymous
    May 18, 2009
    > When we provide two subtly different ways to do exactly the same thing, we produce confusion in the industry, we make it harder for people to read each other’s code, and so on. Of course, the same disadvantage appears when you leave it to everyone to create their own (almost?) identical method. Having trivial extension methods included in the BCL is not about the cost of creating them, it's all about creating a common language for all .NET developers, something that you can read without looking things up or guessing. (Generally speaking - I understand your specfic reasons for not implementing ForEach. But at the same time I follow Steve Cooper here - ForEach might be a nice thing to have in some situations, from a pragmatic PoV. Whether it should be used in a specific situation is a difficult question, but providing it is only a really bad idea if you believe in the concept of protecting developers from themselves.)

  • Anonymous
    May 18, 2009
    The comment has been removed

  • Anonymous
    May 19, 2009
    Yesterday, I converted bunch of code from using foreach to ForEach. Here's what I had originally: public class MetaDataFixer {        public List<Mp3File> Mp3FileList;        private void CleanupTitle()        {            foreach(Mp3File file in Mp3FileList)                //do clean up of title        }        private void CleanupComments()        {            foreach(Mp3File file in Mp3FileList)                //do clean up of comments        }        //Other similar methods:        private void CleanupAlbum();        private void CleanupArtist();        public void Cleanup()        {            if(_options.CleanupTitle)                CleanupTitle();            if(_options.CleanupComments)                CleanupComments();        } } I converted that in to: public class MetaDataFixer {        public List<Mp3File> Mp3FileList;        private readonly Action<Mp3File> cleanupAlbum =            delegate(Mp3File file) { file.Tag.Album = StringHelper.Cleanup(file.Tag.Album); };        private readonly Action<Mp3File> cleanupTitle =            delegate(Mp3File file) { file.Tag.Title = StringHelper.Cleanup(file.Tag.Title); };        //more clean up methods.        public void Cleanup()        {            List<Action<Mp3File>> actionsToPerform = GetActionsToPerform();            Mp3FileList.ForEach(delegate(string fileName)                        {                            Mp3File file;                            file = Mp3File.Create(fileName);                            actionsToPerform.ForEach(delegate(Action<Mp3File> action) { action(file); });                            file.Save();                         }        }        private List<Action<Mp3File>> GetActionsToPerform()        {            List<Action<Mp3File>> actionsToPerform = new List<Action<Mp3File>>();            if (_options.CleanupAlbum)                actionsToPerform.Add(cleanupAlbum);            if (_options.CleanupArtist)                actionsToPerform.Add(cleanupArtist);            //more options        } } I feel that this code is cleaner than the earlier version. I understand your point about closure being a potential issue with ForEach but if my delegate code is simple, the ForEach looks cleaner. Eric, I would be interested in any comments you may have about this re factoring. RJ

  • Anonymous
    May 19, 2009
    @RJ Create actions that work on a single file list this;    public static CleanupAlbum(Mp3File file)    {      ...    } Then you can execute them like    Mp3FileList.ForEach(CleanupAlbum); As an idea, you might also consider an 'action combiner' -- something like this (not compiled or tested, just for illustration)    public Action<T> Combine<T>(params Action<T>[] actions)    {      return item =>      {        foreach(var actOn in actions)        {          actOn(item);        }      };    } Then you could write;    allCleanupActions = Combine(CleanupAlbum, CleanupArtist, CleanupComments); and    Mp3FileList.ForEach(allCleanupActions);

  • Anonymous
    May 19, 2009
    The comment has been removed

  • Anonymous
    May 19, 2009
    ForEach((x,index) => ...) is useful.

  • Anonymous
    May 19, 2009
    The comment has been removed

  • Anonymous
    May 19, 2009
    The problem with implementing myself (which I have done), is that now EVERYONE implements this. Not very DRY.

  • Anonymous
    May 19, 2009
    Cost vs coolness Aaron is correct in that Linq (i.e., lisp like) expressions are much less readable and maintainable.    Lisp-like constructs (LINQ) are extremely powerful but do not scale well for extremly complex systems.  They work in environments where one focuses on business/logic rules for the system and not on writing code such as generating logic tables and autogenerating code from the tables. Code should be written with a total cost over the expected product lifetime in mind and not just the speed of which the orginal developer progresses.    This is why simpler constructs, algorithms last longer than higher level, trendy, pattern of the week code. The next developer to take over the code should be in your concern when writing or maintaing a system.

  • Anonymous
    May 19, 2009
    The comment has been removed

  • Anonymous
    May 19, 2009
    Ike: That is not correct; extension methods have nothing to do with closures.

  • Anonymous
    May 19, 2009
    The comment has been removed

  • Anonymous
    May 19, 2009
    There is one thing you overlook in your comparison. I also overlook that the method version can take a method group as an argument. -- Eric When you compare ForEach() with foreach, your ForEach() example for some reason includes the type. Which can be implied of course. If you let the compiler infer the type you get nice intellisense advantages too. Sure you can get those with a foreach block too. But I think ".F Tab ( c => c." is a lot easier to write than "foreach (var c in sequence) c." IMHO Intellisense is super important, and shouldn't be left out of the debate. That said it is trivial to add the method. So I just add it whenever I need it. Problem is that I do this in every project because I don't have a library I share across every project... and after a while all that trivial coding starts to add up.

  • Anonymous
    May 19, 2009
    I don't like mixing in IntelliSense in the debate. Sure it's nice but the language shouldn't be designed around some external tool. The tool should be designed around the language.

  • Anonymous
    May 19, 2009
    The comment has been removed

  • Anonymous
    May 19, 2009
    Eric, seems your argument is more pedantic than semantic. Why have the foreach method at all then? For would suffice, the following code would work perfectly well: Perhaps you missed the bit where I said that sometimes the benefit of the improvement in textual representation pays for the cost of having two ways to do something. -- Eric            int[] loop = new[] { 4354, 234, 23, 34, 3, 4 };            var e = loop.GetEnumerator();            for (e.Reset(); e.MoveNext(); )            {                Console.WriteLine(e.Current);            } Or we could just use while and get rid of For as well. Of cource the reason we use foreach is that it is easier. The same goes for the Linq version of ForEach. It makes code simpler and easier to read. loop.ForEach(i => Console.WriteLine(i)); I use this all the time, as I find it very useful, it seem more natural when working on a collection, as I don't need to output a collection and then run a foreach on it. Great. Perhaps you missed the bit at the end where I said that if you find this useful and don't share my objections, you should go ahead and use it. -- Eric Your side-effect argument is also a little weak, can you explain how ToArray() is side-effect free in that case? No, because I don't understand the question. -- Eric I hae a direct link to the iterators here if any one wants a full example: http://code.google.com/p/ximura/source/browse/trunk/Ximura%20Base/Helper/Linq/Linq.cs There are ForEach, ForIndex, and ForBigIndex iterators. Paul Stancer

  • Anonymous
    May 20, 2009
    @Steve Cooper: "You get two interations (sic), but the second is on a subset. Iteration is so fast that programming differently for it would be a premature optimisation." No, there's a single iteration. The first Where looks at the first item.  If it's accepted, "yield return" creates a continuation point and immediately passes control to the second Where. The second Where then makes the same decision, and eventually returns to the continuation point in the first Where, where the second item is processed.  This means we "loop" only once, and pass the results up through the chain of operations, through the use of continuations created by "yield return".  This construct is highly efficient... at least as efficient as you'd code imperatively.

  • Anonymous
    May 20, 2009
    One way to avoid the philosophical objection is to only manipulate sequences of Actions. For example, we can already do this:    var purge = new DirectoryInfo("c:\stuff")                .GetFiles()                .Where(file => (file.Attributes == FileAttributes.Normal))                .Select(file => new Action(file.Delete)); The "purge" is now a list of actions to be executed later. And I can tack on a few actions to send notification emails after the purge:    string[] recipients = new[] {"fred@nowhere.com", "sid@somewhere.com"};    purge = purge.Concat(        recipients.Select(recipient =>            new Action(() => new SmtpClient().Send("do-not-reply", recipient, "Hey!", "Finished purge")))); The only thing we need a new extension method for is this:    Action doPurge = purge.Aggregate(); Which can be implemented thusly:    public static Action Aggregate(this IEnumerable<Action> source)    {        Action combined = null;        foreach (Action action in source)            combined += action;        return combined;    } Just turns the sequence of Actions into a single multi-cast delegate. Note that up to this point, there have been no side-effects whatsoever. We've only been manipulating lists of actions, not actually executing them. So nothing we've done can possibly invalidate that philosophical objection. :) The last thing we do is:    doPurge(); Which is where all the side-effects happen at last, and who could possibly be surprised by side-effects from a statement like that?

  • Anonymous
    May 20, 2009
    Steve, code to print all code lines neglects any error handling.  Try to effectivly determine what error happened and where it happened with your code example is difficult and greatly increases the cost to support and maintain in an ongoing basis. Your example: string[] classNames = rootDir     .GetDirectories()         // all directories, recursively     .GetFiles("*.cs")         // all c# files     .Select(GetFileContent)   // the file content as a string     .SelectMany(content => content.Split('n')) // all the lines     .Where(IsClassDefinition) // class definition lines     .Select(GetClassName)     // class names     .ToArray(); Errors/exceptions that may happen:    - permission error at any one of            - directory            - file    - Memory errors (file too big, too many lines)    - File content errors (binary file with .CS extension)    - Generic .net errors A generic exception.tostring() is not useful for supporting a production quality commercial product. This is what is lost in making calls a.b(w).c(x).d(y).e(z) since any particular call in the chain may fail or (much worse) induce side effects that affect later calls in the chain.  It was used early on in C++ systems but fell out of favor for that reason. The sequential code is much easier to test and maintain if you factor out into a new function the code to open a CS file and print its contents. For compactness, you could even do the same thing in the cmd shell without having to ship a C# application (using FOR, findstr, grep and if adventurous (SED)).

  • Anonymous
    May 20, 2009
    The comment has been removed

  • Anonymous
    May 20, 2009
    The comment has been removed

  • Anonymous
    May 20, 2009
    The comment has been removed

  • Anonymous
    May 21, 2009
    The comment has been removed

  • Anonymous
    May 21, 2009
    The comment has been removed

  • Anonymous
    May 21, 2009
    @Greg I don't think my comments are geared towards making the case for ForEach() because it is "self-documenting". I usually hear that term in the context of developers NOT wanting to write documentation for their methods:) and so it has a negative connotation for me. My argument is based purely on the expressed intent of ForEach, communicated via its naming AND its documented behavior. This behavior is exactly what one would expect if they did sequence.ForEach(..) and so there are no "unintended side effects" vis-a-vis doing foreach(...) { }.... just ovelooked ones.

  • Anonymous
    May 21, 2009
    I don't mind there's no foreach extension but in a lot of the situations where I end up using foreach is for applying a function on all the elements to create a result, (just like sum or average in effect does). An extension method for applying any function to all elements would be nice.

  • Anonymous
    May 21, 2009
    The comment has been removed

  • Anonymous
    May 24, 2009
    Многие люди спрашивают меня – почему Microsoft не предоставила для последовательностей метод расширения

  • Anonymous
    May 25, 2009
    Alg&uacute;n tiempo despues de que diera mi opini&oacute;n sobre un potencial m&eacute;todo extensor

  • Anonymous
    May 26, 2009
    If you don't read Eric Lippert's Fabulous Adventures In Coding , you're really missing out on some great

  • Anonymous
    May 26, 2009
    If you don&#39;t read Eric Lippert&#39;s Fabulous Adventures In Coding , you&#39;re really missing out

  • Anonymous
    May 06, 2010
    Well, the Reactive Extensions team at Microsoft Research chose differently. They included 2 imperative operators in their extensions: Run (same as your ForEach), and Do. Check it out here: http://community.bartdesmet.net/blogs/bart/archive/2009/12/26/more-linq-with-system-interactive-the-ultimate-imperative.aspx  Omer.

  • Anonymous
    July 06, 2010
    Great blog - I generally agree with the post, ForEach is very imperative. One of the only redeeming qualities of it though, is that it means that when I'm writing mostly functional code, the order is consistent: every time I have to use foreach it feels "out of place" compared to the order I've been writing all my other statements in like Select and Zip (as the above poster mentions, I use Rx's "Run" instead). Of course, you could argue that it should feel weird, since foreach is bad :) The other advantage is that you can use Method Groups, so for example: new[] {1,2,3,4,5}.Select(x => x*5).Run(Console.WriteLine)

  • Anonymous
    July 06, 2010
    The Reactive Extensions library provides the Run extension method, which I think is a better name.

  • Anonymous
    April 18, 2011
    This probably isnt directly related to answer but its pretty funny of what I have just found out. ForEach, delete (works) List<int> list = new List<int>(){ 1, 2, 3, 4, 5, 6}; list.ForEach(x => {    Console.WriteLine(x);    list.Remove(x); }); foreach, delete (crashes) // throws exception foreach (var x in list) {    Console.WriteLine(x);    list.Remove(x); } ForEach, insert (...) // goes in infinite loop... list.ForEach(x => {    list.Add(1); }); foreach, insert (crashes) // throws exception foreach (var x in list) {    Console.WriteLine(x);    list.Add(x); } So anyone who is talking here about mutability or different confusing layers etc, I think it is completely half implemented feature by Visual Team because enumeration will always cause problems if the collection is modified. Despite of arguments, I still see a no reason that ForEach should allow modifications, it is purely used for enumeration and it makes no difference whether its foreach(var item in x) syntax or x.ForEach(x=>{}) syntax. I disagree with Eric, I just see it simply that BCL team has implemented this feature on IEnumerable as well as list.ForEach is faulty. "Why" is completely subjective,for example, Silverlight has added complex hashing algorithms and left MD5 behind where else everywhere else we use MD5 so widely. Its more of how much anything is in demand and who is the one who chooses whether to include it or not in framework. There is no logical or philosophical reason at all for not having ForEach in IEnumerable. There are many such missing points which I think .NET will improve over time.

  • Anonymous
    June 14, 2011
    Also the List<T>.ForEach method is uselessly slower than the foreach(...) equivalent (since under the hood it calls a delegate as pure overhead). The key is the "uselessly" word. If there was any gain (like in readability, conveying semantics, whatever), worrying about it would be premature optimization. But we have code which is already harder to read, semantically misleading, causing side-effects... and it also is slower! What worries me is that people are calling this a functional approach and praising it for being easier parallelize (when it's not and should not be parallelized) and for being easier to optimize (when in reality it's just slower).