Arrays considered somewhat harmful

I got a moral question from an author of programming language textbooks the other day requesting my opinions on whether or not beginner programmers should be taught how to use arrays.

Rather than actually answer that question, I gave him a long list of my opinions about arrays, how I use arrays, how we expect arrays to be used in the future, and so on. This gets a bit long, but like Pascal, I didn't have time to make it shorter. 

Let me start by saying when you definitely should not use arrays, and then wax more philosophical about the future of modern programming and the role of the array in the coming world.

You probably should not return an array as the value of a public method or property, particularly when the information content of the array is logically immutable. Let me give you an example of where we got that horridly wrong in a very visible way in the framework.  If you take a look at the documentation for System.Type, you'll find that just looking at the method descriptions gives one a sense of existential dread. One sees a whole lot of sentences like "Returns an array of Type objects that represent the constraints on the current generic type parameter." Almost every method on System.Type returns an array it seems. 

Now think about how that must be implemented. When you call, say, GetConstructors() on typeof(string), the implementation cannot possibly do this, as sensible as it seems:

public class Type
{
    // Cached once, on first request.
    private ConstructorInfo[] ctorInfos;

    public ConstructorInfo[] GetConstructors()
    {
        if (ctorInfos == null) ctorInfos = GoGetConstructorInfosFromMetadata();
        return ctorInfos; // hands every caller the same mutable array
    }
}

Why? Because now the caller can take that array and replace the contents of it with whatever they please. Returning an array means that you have to make a fresh copy of the array every time you return it. You get called a hundred times, you’d better make a hundred array instances, no matter how large they are. It’s a performance nightmare – particularly if, like me, you are considering using reflection to build a compiler. Do you have any idea how many times a second I try to get type information out of reflection?  Not nearly as many times as I could; every time I do it’s another freakin’ array allocation!
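To make the cost concrete, here is a minimal sketch of the two options an array-returning API has. MemberCatalog and its method names are hypothetical stand-ins invented for illustration; this is not how System.Type is actually written.

```csharp
// Hypothetical stand-in for a reflection-style class; all names are illustrative.
public sealed class MemberCatalog
{
    private readonly string[] names = { "Alpha", "Beta", "Gamma" };

    // Unsafe: returns the private array itself, so any caller can overwrite its elements.
    public string[] GetNamesShared()
    {
        return names;
    }

    // Safe but costly: allocates and fills a new array on every single call.
    public string[] GetNamesCopied()
    {
        return (string[])names.Clone();
    }
}
```

A caller that writes to the result of GetNamesCopied() only damages its own copy; a caller that writes to the result of GetNamesShared() changes what every later caller sees.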

The framework designers were not foolish people; unfortunately, we did not have generic types in .NET 1.0. Clearly the sensible thing for GetConstructors() to return now is IList<ConstructorInfo>. You can build yourself a nice read-only collection object once, and then just pass out references to it as much as you want.

What is the root cause of this malaise? It is simple to state: The caller is requesting values.  The callee fulfills the request by handing back variables.

An array is a collection of variables. The caller doesn’t want variables, but it’ll take them if that’s the only way to get the values. But in this case, as in most cases, neither the callee nor the caller wants those variables to ever vary. Why on earth is the callee passing back variables then? Variables vary. Therefore, a fresh, different variable must be passed back every time, so that if it does vary, nothing bad happens to anyone else who has requested the same values.

If you are writing such an API, wrap the array in a ReadOnlyCollection<T> and return an IEnumerable<T> or an IList<T> or something, but not an array.  (And of course, do not simply cast the array to IEnumerable<T> and think you’re done!  That is still passing out variables; the caller can simply cast back to array!  Only pass out an array if it is wrapped up by a read-only object.)
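A short sketch of the difference between casting and wrapping, under the assumption of a hypothetical WrappedCatalog class (the class and its members are made up for this example; ReadOnlyCollection<T> is the real type from System.Collections.ObjectModel):

```csharp
using System.Collections.Generic;
using System.Collections.ObjectModel;

// Hypothetical class; only ReadOnlyCollection<T> comes from the framework.
public sealed class WrappedCatalog
{
    private readonly string[] names = { "Alpha", "Beta", "Gamma" };
    private readonly ReadOnlyCollection<string> readOnlyNames;

    public WrappedCatalog()
    {
        // Built once; every caller shares this single wrapper object.
        readOnlyNames = new ReadOnlyCollection<string>(names);
    }

    // Leaky: the static type says IEnumerable<string>, but the object is still the array.
    public IEnumerable<string> GetNamesLeaky()
    {
        return names;
    }

    // Wrapped: the returned object is not an array and offers no public way to modify one.
    public IList<string> GetNames()
    {
        return readOnlyNames;
    }
}
```

With this sketch, ((string[])catalog.GetNamesLeaky())[0] = "Oops" compiles, runs, and silently changes the private state, whereas catalog.GetNames()[0] = "Oops" throws NotSupportedException and no allocation happens per call.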

That’s the situation at present. What are the implications of array characteristics for the future of programming and programming languages?

Parallelism Problems

The physics aspects of Moore’s so-called “Law” are failing, as they eventually must. Clock speeds have stopped increasing, transistor density has stopped increasing. The laws of thermodynamics and the Uncertainty Principle are seeing to that. But manufacturing costs per chip are still falling, which means that our only hope of Moore’s "Law" continuing to hold over the coming decades is to cram more and more processors into each box. 

We’re going to need programming languages that allow mere mortals to write code that is parallelizable to multiple cores.

Side-effecting change is the enemy of parallelization. Parallelizing in a world with observable side effects means locks, and locks mean choosing between implementing lock ordering and dealing with random crashes or deadlocks. Lock ordering requires global knowledge of the program. Programs are becoming increasingly complex, to the point where one person cannot reasonably and confidently have global knowledge. Indeed, we prefer programming languages to have the property that programs in them can be understood by understanding one part at a time, not having to swallow the whole thing in one gulp.

Therefore we tools providers need to create ways for people to program effectively without causing observable side effects.

Of all the sort of “basic” types, arrays most strongly work against this goal. An array’s whole purpose is to be a mass of mutable state. Mutable state is hard for both humans and compilers to reason about. It will be hard for us to write compilers in the future that generate performant multi-core programs if developers use a lot of arrays.

Now, one might reasonably point out that List<T> is a mass of mutable state too. But at least one could create a threadsafe list class, or an immutable list class, or a list class that has transactional integrity, or uses some form of isolation or whatever. We have an extensibility model for lists because lists are classes. We have no ability to make an “immutable array”. Arrays are what they are and they’re never going to change.
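As an illustration of that extensibility, here is a rough sketch of an immutable list written as an ordinary class. PersistentList<T> is a hypothetical name chosen for this example, not a framework type, and a production version would need far more than this.

```csharp
using System.Collections;
using System.Collections.Generic;

// Hypothetical immutable list: "adding" returns a new list and never changes an existing one.
public sealed class PersistentList<T> : IEnumerable<T>
{
    public static readonly PersistentList<T> Empty = new PersistentList<T>();

    private readonly T head;
    private readonly PersistentList<T> tail; // null only for the Empty instance

    private PersistentList() { }

    private PersistentList(T head, PersistentList<T> tail)
    {
        this.head = head;
        this.tail = tail;
    }

    public bool IsEmpty { get { return tail == null; } }

    // Shares the existing nodes instead of copying them.
    public PersistentList<T> Prepend(T value)
    {
        return new PersistentList<T>(value, this);
    }

    public IEnumerator<T> GetEnumerator()
    {
        for (PersistentList<T> node = this; !node.IsEmpty; node = node.tail)
        {
            yield return node.head;
        }
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}
```

Because no instance ever changes after construction, any number of threads can read the same list without locks. Nothing comparable can be retrofitted onto T[].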

Conceptual Problems

We want C# to be a language in which one can draw a line between code that implements a mechanism and code that implements a policy.

The “C” programming language is all about mechanisms. It lays bare almost exactly what the processor is actually doing, providing only the thinnest abstraction over the memory model. And though we want you to be able to write programs like that in C#, most of the time people should be writing code in the “policy” realm. That is, code that emphasizes what the code is supposed to do, not how it does it.

Coding which is more declarative than imperative, coding which avoids side effects, coding which emphasizes algorithms and purposes over mechanisms, that kind of coding is the future in a world of parallelism. (And you’ll note that LINQ is designed to be declarative, strongly abstract away from mechanisms, and be free of side effects.)

Arrays work against all of these factors. Arrays demand imperative code, arrays are all about side effects, arrays make you write code which emphasizes how the code works, not what the code is doing or why it is doing it. Arrays make optimizing for things like “swapping two values” easy, but destroy the larger ability to optimize for parallelism.
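A small contrast may help here. The sketch below computes the squares of the even numbers twice: once with explicit array mechanics and once as a LINQ query. SquaresOfEvens and its method names are invented for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SquaresOfEvens
{
    // Mechanism-heavy: index arithmetic, a scratch buffer, and in-place writes.
    public static int[] Imperative(int[] input)
    {
        int[] scratch = new int[input.Length];
        int count = 0;
        for (int i = 0; i < input.Length; i++)
        {
            if (input[i] % 2 == 0)
            {
                scratch[count] = input[i] * input[i];
                count++;
            }
        }
        int[] result = new int[count];
        Array.Copy(scratch, result, count);
        return result;
    }

    // Policy-style: states what is wanted and writes to nothing the caller can observe.
    public static IEnumerable<int> Declarative(IEnumerable<int> input)
    {
        return input.Where(x => x % 2 == 0).Select(x => x * x);
    }
}
```

Both produce the same values for the same input. The second version says nothing about iteration order or storage, which is the kind of freedom a parallelizing library (PLINQ's AsParallel(), for instance) relies on.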

Practical Problems

And finally, given that arrays are mutable by design, the way an array restricts that mutability is deeply weird. All the contents of the collection are mutable, but the size is fixed.  What is up with that? Does that solve a problem anyone actually has?

For this reason alone I do almost no programming with arrays anymore. Arrays simply do not model any problem that I have at all well – I rarely need a collection which has the rather contradictory properties of being completely mutable, and at the same time, fixed in size. If I want to mutate a collection it is almost always to add something to it or remove something from it, not to change what value an index maps to.

We have a class or interface for everything I need. If I need a sequence I’ll use IEnumerable<T>, if I need a mapping from contiguous numbers to data I’ll use a List<T>, if I need a mapping across arbitrary data I’ll use a Dictionary<K,V>, if I need a set I’ll use a HashSet<T>. I simply don’t need arrays for anything, so I almost never use them. They don’t solve a problem I have better than the other tools at my disposal.
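As a rough illustration of that division of labor, the hypothetical WordStats class below uses each of the four types for the job described above. The class and its members are made up for this sketch; the set duplicates the dictionary's keys and is kept only to show the type.

```csharp
using System.Collections.Generic;

// Hypothetical example class; only the collection types come from the framework.
public sealed class WordStats
{
    private readonly List<string> wordsInOrder = new List<string>();                  // index -> word
    private readonly Dictionary<string, int> counts = new Dictionary<string, int>();  // word -> occurrences
    private readonly HashSet<string> distinct = new HashSet<string>();                // membership test

    public void Add(string word)
    {
        wordsInOrder.Add(word);
        distinct.Add(word);
        int current;
        counts.TryGetValue(word, out current); // current stays 0 for an unseen word
        counts[word] = current + 1;
    }

    // Sequence: callers can only iterate, and cannot cast back to the List<string>.
    public IEnumerable<string> Words()
    {
        foreach (string word in wordsInOrder)
        {
            yield return word;
        }
    }

    public string WordAt(int index)
    {
        return wordsInOrder[index];
    }

    public int CountOf(string word)
    {
        int count;
        counts.TryGetValue(word, out count);
        return count;
    }

    public bool Contains(string word)
    {
        return distinct.Contains(word);
    }
}
```

None of the members needs an array, and none of the internal collections escapes to the caller.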

Pedagogic Problems

It is important that beginning programmers understand arrays; it is an important and widely used concept. But it is also important to me that they understand the weaknesses and shortcomings of arrays. In almost every case, there is a better tool to use than an array.

The difficulty is, pedagogically, that it is hard to discuss the merits of those tools without already having down concepts like classes, interfaces, generics, asymptotic performance, query expressions, and so on. It’s a hard problem for the writer and for the teacher. Fortunately, for me, it's not a problem that I personally have to solve.

Comments

  • Anonymous
    September 22, 2008
    Very interesting point of view! Your arguments against arrays sound very strong and I personally agree with them. It will be interesting if there is anyone who can advocate poor arrays :)

  • Anonymous
    September 22, 2008
    Instead of GetConstructors() returning IList<ConstructorInfo>, I would prefer the framework actually define a readonly list interface such as IReadOnlyList<T> and return an instance of that.  Returning IList<T> for a readonly list is a bad idea because you're not actually returning an IList<T>.  You're returning an object that is kind of an IList<T>, since it can't fulfill several of the methods.

  • Anonymous
    September 22, 2008
    Though probably not what you want, IEnumerable<T> is effectively a read-only list.

  • Anonymous
    September 22, 2008
    How strange, I was tackling this very issue myself just this afternoon. I wanted to use the array because the code DOM serializer can persist them in nice ways, however, I was aware of the mutable nature of their contents. I managed to solve my serializer problem by having a constructor take IEnumerable<string>, provide the array to consumers as a read-only IList<string> property, and have my type converter wrap the IList in a List<string> and use ToArray with an instance descriptor to my IEnumerable<string> constructor. This way the code DOM serializer persists nicely (rather than persist an IEnumerable which it insists on handling as a resource blob), but my constructor supports more collections and my array contents are immutable for the lifetime of the constructed object. Of course, I could've worked on writing a nicer serializer, but this route was simpler. Having been caught out by arrays in many of the ways you mention, I whole-heartedly concur with you.

  • Anonymous
    September 22, 2008
    @Jonathan, It's more of a sequence than a list though.  Lists have several properties that set them apart from sequences, namely O(1) random access and O(1) size calculation.  In this situation we're converting from an array to a new data structure.  It seems more natural to go with a list since we already have all of the elements grouped together. Then again, if reflection could be done more efficiently "on demand" then a sequence would potentially be better.

  • Anonymous
    September 22, 2008
    The comment has been removed

  • Anonymous
    September 22, 2008
    arrays are good for image processing and science applications: each cell represents a physical object or an atom or a pixel. for images, the size really is fixed but the pixels change. that said, I don't know of other places where they are a good model, and your article is generally spot on.

  • Anonymous
    September 22, 2008
    The problem with returning arrays from methods is actually a subset of a more general problem: returning references to private data. The very same argument you make against returning arrays can also be made against returning another object. Also, I don't think that it is bad to return an array cast to IEnumerable<T>. You argue that the caller can simply cast it back -- true. But even if we wrap it in another object, the caller can still access it; if not through reflection, then through unsafe memory operations or whatever. If the caller insists on breaking your code, he will be able to do so regardless.

  • Anonymous
    September 22, 2008
    I like arrays for some things. For example, String.Split(). I use that frequently, and then I modify the contents of individual indexes frequently. If Split() returned ReadOnlyCollection<string> I'd end up converting that to a string array, frequently. Likewise, any time you're working with fixed-length linear data, arrays are appropriate. However, I fully agree with the occasional scenario where arrays were used only because generics weren't available, such as the scenario you demonstrated; quite honestly, System.Reflection is one of the few places I've ever been really annoyed that .NET gave me an array. That said, my reasons were quite the opposite of yours. I wanted more mutability. For example, GetProperties returns an array, and say I wanted to loop through the list and remove the items that didn't match certain criteria, such as checking for certain custom attributes. In both cases, ReadOnlyCollection<PropertyInfo> and PropertyInfo[], I have to convert to a List<PropertyInfo> to do that. Quite a pain. So in other words, no one can be happy.

  • Anonymous
    September 22, 2008
    I completely agree with Jared. Having an interface that is essentially IEnumerable<T>+this[int] {get;} (IIndexedEnumerable?) would be the best solution for this problem. Also this interface would solve the future problem of result variance.

  • Anonymous
    September 22, 2008
    Surely this issue would be best addressed by adding a read-only array at the CLR level, or even better a recursive read-only tag for arbitrary types?  Then we could have C++ style const-ness as well...

  • Anonymous
    September 22, 2008
    I second the request for an expanded interface for read only collections (but perhaps we should take this to the BCL team instead). The more I read on this blog, the more I'm looking forward to PDC. Keep them coming Eric!

  • Anonymous
    September 22, 2008
    Superb post. Thanks for that. Story of Engineering. The very foundations and assumptions can crumble at any time. It is ironic how unsuited arrays are for anything. Most of the time I use arrays in situations where I mean "Block of information for you to look at, not touch". Indeed that construct is utterly unsuited. Will it not be possible in the future to state that an array should be read-only? Or is it impossible for some known unknown? Or is it even a silly question because I still don't get it? ;)

  • Anonymous
    September 22, 2008
    To Jon Davis: If you need to apply a certain criteria on the list content, you should apply a predicate on it (i.e. myList.Where(...)). In general, if you need to change the [object] stream content, produce a new stream with correct content. This will comply with goals of this article. Basically that's what LINQ does. Kosta

  • Anonymous
    September 23, 2008
    My fingers are crossed that the CLR embraces const at some point in the future.  Const arrays that enforce immutability at compile-time would be wonderful.

  • Anonymous
    September 23, 2008
    Care to share the details of "...considering using reflection to build a compiler..."?

  • Anonymous
    September 23, 2008
    ReadOnlyCollection<T> returns references to objects.  The objects' members can still be changed.  We just can't add, remove or change a reference in the ReadOnlyCollection.  So returning this collection still does not guarantee a 100% read-only collection.

  • Anonymous
    September 23, 2008
    I mean Add, Remove or change the reference of objects already inside the collection.

  • Anonymous
    September 23, 2008
    @Thomas, As I'm sure Eric would say (because he has so many times before), you cannot stop someone accessing private data if they have full security access; in that case you have to assume that they have access to all of the memory in the process. What matters is protecting data that is being passed to code that does NOT have full security access, in particular code that cannot use reflection. In that case, accessing private data inside your class would not be possible, but casting your return value from an IEnumerable to an array would be perfectly legal, and would give that untrusted code mutable access to your private data.

  • Anonymous
    September 23, 2008
    @chris The CLR doesn't have a comprehensive 'const' semantic like C++, and in order to maintain interoperability with other programming languages, including C#, it probably never will.   That said, you can take and apply attributes to just about everything, including return values.  You can invent 'const', or use an existing attribute for that purpose.  You could then extend each programming language to require it, and add runtime checks on the IL of callers to ensure they don't perform non-const modifications to const return values, references, and arguments.  And you could extend existing languages -- writing a C# compiler that issues errors if you violate const correctness.   But that's a lot of work. @Eric I appreciate that you'd like to return read-only proxies in many cases instead of arrays, especially in 'getters'.  But Joe Duffy pointed out at (link) that proxies, such as enumerators, are often more expensive than they seem on the current JITter, more expensive than such proxies are in compiled languages (I'm thinking of vector::const_iterator, for instance).   "In almost every case, there is a better tool to use than an array" -- I'd agree.  It's unfortunate that there's a perf penalty to those better tools, but often it can be justified by the benefits (in product reliability, reduced development costs, bugs found earlier) of proxies. #aaron

  • Anonymous
    September 23, 2008
    Indeed. Attributes are the best you can do in C#. If you were writing your own language what you'd probably want to do is use an optional type modifier rather than an attribute; in C++, adding "constness" changes the signature of a method.  In the CLR, adding an attribute does not change a signature but adding an optional type modifier does. (This is how managed C++ does constness.)

  • Anonymous
    September 23, 2008
    The comment has been removed

  • Anonymous
    September 23, 2008
    The comment has been removed

  • Anonymous
    September 23, 2008
    The comment has been removed

  • Anonymous
    September 23, 2008
    @russelh: If you need to minimize cache misses, it's usually enough to back your IList with an array - this will bring in the needed memory management optimization without sacrificing the interface elegance. Kosta

  • Anonymous
    September 23, 2008
    I know this wasn't the focus of your post, but doesn't Moore's Law often taper off from time to time, only to kick back into action via some massive technological leap?  It happened with transistors coming from vacuum tubes, and it'll probably happen again with one of the newer areas of research, like quantum chips or DNA or whatever.  Maybe in the literal sense, transistor density won't continue to increase forever, but computing power in general probably will, and not just through the addition of more cores. On topic: as others have commented, it seems that 9 times out of 10, what I really need is a read-only, indexed sequence.  I always think, hey, let's make things as general as possible by using an IEnumerable<T>, and then realize later on that I need indexed access, and have to change it to an IList<T>, which never feels quite right because its "immutability" is just in the implementation.  And using an actual ReadOnlyCollection<T> instance never feels quite right either, I don't know why. 'Course I realized while writing this that it would be trivially easy to write an IReadOnlyList<T>, a wrapper class, and a couple of extension methods for conversion; don't know why this never occurred to me before.  Maybe because it's a stupid idea and I haven't yet realized that. :-)

  • Anonymous
    September 23, 2008
    Moore's Law is badly named. It really ought to be "Moore's Observation"; just because a particular economic trend has held for a few years does not make it a law of nature. It is the case that every time a particular technology has been tapped out, a new technology has come along to replace it. Similarly, every time we've gotten close to running out of oil, either new reserves have been discovered, or new technologies for extending existing reserves have been found.   It might well be the case that future technologies open up massive new vistas in cheap computational power. And it might be the case that all the oil we'll ever need is available for cheap in some place that we've not looked for it yet. But there is no law of physics that is going to ensure that either case comes to pass, and it would therefore be unwise to assume that this will happen. If new magical technologies are invented, great. I'm pro that. But I'm going to make the conservative assumption that refinements of old technology are what we've got to work with now and going forward.   Or, look at it another way. I cannot possibly design tools for an unknown future that is radically different from today. I could guess, and likely be wrong, or I could design tools for the most likely future.

  • Anonymous
    September 24, 2008
    > Will we be seeing a "threads considered somewhat harmful" post soon? No, because I already wrote it in 2004: http://blogs.msdn.com/ericlippert/archive/2004/02/15/73366.aspx

  • Anonymous
    September 24, 2008
    I was probably overreacting, or focusing on "using" an array vs. "handing out" an array.  I guess StringBuilder is a decent example of encapsulating an array in a useful way. @Aaron G: When I was in graduate school in the 1991-1992 time frame, optical computing was supposed to be the next big thing.  No one (at least not my professors) thought CMOS technology would get as far as it has.  I'm not disagreeing with Eric.  Clearly, the trend now is in the direction of more and more cores (threads) per processor with constant or decreasing throughput per thread.

  • Anonymous
    September 24, 2008
    The comment has been removed

  • Anonymous
    September 24, 2008
    The comment has been removed

  • Anonymous
    September 25, 2008
    Aren't arrays actually quite good when it comes to parallelism? They're really easy to divide into chunks processed by different threads, unlike linear data structures such as lists. As you point out, they're not mutable (only the elements are mutable), which helps there (or rather, it doesn't hurt). Your comments about metadata are also shifting the blame for CLI's lack of an equivalent to const& onto arrays, which hardly seems fair. On the other hand, I think you've ignored some of the actual pedagogical problems such as jagged vs. rectangular arrays. In the spirit of "considered harmful", I consider your considerations harmful!

  • Anonymous
    September 26, 2008
    I think you forgot to mention another, more important, point: having a readonly collection of references to objects won't save you from modifying the objects. Thus IList<T> won't solve the problem you're mentioning! You either have to copy all objects (or use value types), or wrap all objects your array or collection contains in a "readonlifier" wrapper. But the best thing is to design your objects (and it doesn't matter if you return them as items of an array or a collection) so that you have different interfaces for different uses. And C++'s const array modifier perfectly solves the problem you're mentioning. My 5c, I might be too sleepy and miss the whole point :) (I've recently replaced LINQ queries with direct array addressing to get a 5x performance boost. At the same time the interface between the algorithm and the data store has remained the same. But I am all for your arguments regarding parallelization. Arrays are just not the key reason why some algorithms are hard to parallelize.)

  • Anonymous
    September 28, 2008
    Developers in my team are encouraged (with threats of violence) to return interface types wherever possible, using the minimum interface required by the consumer, e.g. IEnumerable<T> if iteration is required, or IList<T> if indexed access is necessary. Ree: With the aforementioned move into many-core processors, the ability to parallelize is becoming more important than single-thread performance, and this is where LINQ will come into its own, especially with the PFX extensions. Roman: ConstructorInfo is an immutable class, in which instance the example is valid. Personally, I tend toward immutable classes for everything except business objects. And as for Moore's Law: what about memristors?

  • Anonymous
    September 29, 2008
    "It is important that beginning programmers understand arrays;" It seems to me that one of the main reasons for this was so that they could then quickly understand strings.  But that day is long gone.  In a world where strings are now first class citizens of their own, it should be possible to leave arrays as a concept to be introduced at the same time as other more appropriate collections.

  • Anonymous
    September 29, 2008
    It's much more important to understand the difference between O(1) and O(n), I agree :)

  • Anonymous
    September 30, 2008
    In this carnival there're a lot of software design/patterns and frameworks a bit of SOA, UML, DSL

  • Anonymous
    October 01, 2008
    @danyel:  That may be, but in that case you're still talking about something that should be wrapped in something like an Image class, and not exposed to public view. @ree:  It is generally more important to have code that is more maintainable and more amenable to reason than it is to have code that is more performant.  The fastest code serves no purpose if it's simply the fastest to produce an incorrect result, or the fastest to befuddle.  I'll take the CLR's abstractions any day.

  • Anonymous
    October 02, 2008
    @kfarmer, Part of Ree's point was that, if the CLR supported const and non-mutable arrays, you wouldn't need an abstraction to expose array-like data, and you could have the best of both worlds, maintainability and performance. Which is a perfectly valid point. Whether the gains from supporting const and immutability in the CLR would be worth the effort is another question entirely.

  • Anonymous
    October 02, 2008
    The comment has been removed

  • Anonymous
    October 02, 2008
    Most of the generic collections dealing in atomic list-type structures internally use arrays (with the exception of SortedDictionary, which uses a tree, and LinkedList, which deals with references to succeeding and preceding LinkedListNodes, and a few others). Now, unless you specify (if you are able to) an initial capacity, adding items dynamically, whether using Add() or the indexing property of the collection you are using, will trigger an allocation of a new array, and an Array.Copy(), when the item you are currently attempting to add would result in a new size that exceeds the current internal size of the collection. This can affect performance. However, what's even worse is that all of the adding/setting mechanisms in these collections trigger a check to determine whether the item is already in the internal array (usually using a binary search). Furthermore, this check forces invocations of methods in helper classes that aid in comparison and equality-calculations - this process results in pushing classes and methods onto the call stack. A lot of overhead, and ALL of these contribute to performance strain. If I have a collection and know certain things at runtime, such as how big it will be, that items I'm adding are new, etc., I choose an array to avoid all the performance stresses and overhead. I like to have a great granularity of control over how I deal in collections, so although I do agree with most of your points, I'd still be hard-pressed to pick a List over an array when I don't need to do more than basic array manipulation. On another topic about arrays, I really wish that the framework would allow us to create an array with all elements initialized to a particular value of our choosing at creation time. 
For example, if I want an array of, say, 100 bools, where all elements should be set to true, I have to instantiate the array, then initialize it with a loop; it would be really helpful to just be able to write "bool[] arr = new bool100;", or something like that.

  • Anonymous
    October 02, 2008
    The comment has been removed

  • Anonymous
    October 03, 2008
    I agree, Eric, that "completely mutable, and at the same time, fixed in size" is not often a useful combination. However, an immutable array (of, typically, immutable objects) is often precisely what I want, especially where performance is an issue. In such cases, the following works:

        struct ConstArray<T>
        {
            public ConstArray(T[] values) { this.values = values; }
            public T this[int index] { get { return values[index]; } }
            public int Length { get { return values.Length; } }
            T[] values;
        }

    (It would be even better if the jitter could optimise index access in loops as it does for system arrays.) BTW, here's yet another (forlorn, I know) plea for the C++ const modifier, which, as has been pointed out, would fix this issue.

  • Anonymous
    October 13, 2008
    Would an empty array be safe to return?

  • Anonymous
    October 16, 2008
    Very recently on a project, I was having significant issues with System.IO.Directory.GetFiles, in which

  • Anonymous
    October 16, 2008
    In general I'm beginning to come to this view. Arrays don't express programmer intent in the vast majority of cases. Is there any chance that array variance could get fixed in future versions of the framework? My understanding is that because of it, the runtime has to type check all array operations. I could see this being a huge win in the cases that I really need an array (like image manipulation), and for all of the framework classes that are using arrays under the hood (such as List<T> I presume). Perhaps the JIT is already smart enough to avoid the penalty in most cases.

  • Anonymous
    October 21, 2008
    Just pointing out that the C++ const modifier is an illusion - it does not solve any issues regarding immutability whatsoever, courtesy of const_cast<>. So if the CLR team ever decides to add a const modifier, I'm hoping that they will not make the same mistake that the C++ standards committee did...

  • Anonymous
    October 21, 2008
    I was just noticing that arrays get nice syntax help in C#:

        var a = new[] { 1, 2 };
        int[] b = { 1, 2 };

    while other collections have to work a little harder:

        var c = new Dictionary<string, int> { {"a", 1}, {"b", 2} };     // OK
        Dictionary<string, int> d = { { "a", 1 }, { "b", 2 } };         // error CS0622: Can only use array initializer expressions to assign to array types. Try using a new expression instead.

    Too bad. Hey, PowerShell can do it:

        $myHash = @{ "a" = 1; "b" = 2 }

  • Anonymous
    October 21, 2008
    @Jay F# can do it succinctly as well:

        let d = [("a",1);("b",2)] |> Map.of_list

  • Anonymous
    October 27, 2008
    Your comments do not apply to value types, so as long as it is an array of say int, string, KeyValuePair<TKey, TValue> we are fine.

  • Anonymous
    October 29, 2008
    A while back, Eric Lippert talked about arrays being somewhat harmful. The reason being, if you

  • Anonymous
    November 09, 2008
    QUOTE:You probably should not return an array as the value of a public method or property, particularly when the information content of the array is logically immutable. Let me give you an example of where we got that horridly wrong in a very visible

  • Anonymous
    December 30, 2008
    A bit of light reading while you digest your turkey sandwiches… Fabulous Adventures In Coding : Arrays

  • Anonymous
    December 31, 2008
    Interesting discussion. As someone who moved from Algol 68 to Pascal to C to C++ to VB6 to VB.NET to Perl to Ruby to C++ with CLR, I think I've seen a few programming paradigms in my time. I've also seen my share of dogma: 'thou shalt not do this' or 'do this or else'. Dogma is dangerous in real life and while programming. The important thing is to understand why certain mechanisms are 'good' or 'bad' or even 'dangerous' and then decide what's appropriate for your application. As I'm sure you all do, I try to use the right tool for the job. A quick test script that needs to run on Windows and Linux is easily written in Ruby. A telescope control program that wakes up every second to see if a new event needs to be recorded works well with VB.NET. Now I'm working on astronomical image processing software. This software needs to zip through images consisting of millions of floats or ints. I use C++ for that, with classes that hide the mapping from col/row coordinates to array indexes. Because I use inline classes I get the benefit of immutability of the array and encapsulation in general while still maintaining a semblance of performance. Another poster mentioned that dividing arrays into sections makes good sense for multi-core optimization. I agree with this, for my application anyway. Divide and conquer strategies will work well in this case. Anyway, I'm not ready yet to give up on arrays for storing large amounts of sequential data. It seems silly to store 6 million pixel values in a linked list. But it does make sense to protect that chunk of data with a class that can be unit tested.
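The idea of hiding the col/row-to-index mapping behind a class can be sketched roughly as follows. The commenter works in C++; this is a C# rendering to match the rest of the post, and every name in it (`PixelGrid`, `IndexOf`) is hypothetical. It assumes row-major storage, which the comment does not specify.

```csharp
using System;

// Hypothetical wrapper: one flat array behind a (col, row) indexer.
public sealed class PixelGrid
{
    private readonly float[] pixels; // row-major storage (an assumption)

    public int Width { get; }
    public int Height { get; }

    public PixelGrid(int width, int height)
    {
        if (width <= 0) throw new ArgumentOutOfRangeException(nameof(width));
        if (height <= 0) throw new ArgumentOutOfRangeException(nameof(height));
        Width = width;
        Height = height;
        pixels = new float[width * height];
    }

    // Callers never see the flat array, only (col, row) coordinates.
    public float this[int col, int row]
    {
        get { return pixels[IndexOf(col, row)]; }
        set { pixels[IndexOf(col, row)] = value; }
    }

    // Maps a coordinate pair to its position in the flat array.
    private int IndexOf(int col, int row)
    {
        if (col < 0 || col >= Width) throw new ArgumentOutOfRangeException(nameof(col));
        if (row < 0 || row >= Height) throw new ArgumentOutOfRangeException(nameof(row));
        return row * Width + col;
    }
}
```

Because the array never leaves the class, the mapping and the bounds checks can be unit tested in one place, and a read-only variant is a matter of dropping the setter.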

  • Anonymous
    January 13, 2009
    I find in C# I only use arrays for speed. I use them for cross-referencing data, or anywhere performance is important. Before I read this blog entry I was asking a colleague, does anybody use arrays in C#? Because you're right, there are usually better tools available for what you're trying to do.

  • Anonymous
    January 23, 2009
    The big question for some applications is actually not whether or not the mutability of the array becomes an issue due to misuse, but whether or not the mutability becomes a compiler issue during threading. In other words, can the mutability of an array influence multi-threading access speed because the compiler wrongfully deems it necessary to perform synchronization? This becomes quite important in cases where the threads are rather large but you know that they will only perform read operations, and the question is whether or not the compiler and the JIT will be able to recognize this and ditch synchronization. I have been unable to find any answers to this so far, but I plan to do some testing in the near future.

  • Anonymous
    February 11, 2009
    Mutability of arrays aside, I tend to use my own IReadonlyCollection interface by way of an adapter for the ReadonlyCollection. You can get my code from http://nicksdotnet.blogspot.com/2009/01/interfacing-readonlycollection.html
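The code at the link above is not reproduced here. As a rough illustration of the general approach using only framework types, the sketch below caches one `ReadOnlyCollection<T>` and hands the same instance to every caller, so no per-call array copy is needed. The `Catalog` class and its members are hypothetical names for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;

// Hypothetical type that exposes its data without exposing a mutable array.
public sealed class Catalog
{
    private readonly ReadOnlyCollection<string> names;

    public Catalog(string[] source)
    {
        // Copy once, so later changes to the caller's array are not visible here.
        names = Array.AsReadOnly((string[])source.Clone());
    }

    // The same wrapper instance is returned on every call.
    public ReadOnlyCollection<string> Names { get { return names; } }
}

public static class ReadOnlyDemo
{
    public static void Main()
    {
        var catalog = new Catalog(new[] { "a", "b" });

        IList<string> view = catalog.Names;
        try
        {
            view[0] = "x"; // the wrapper refuses writes
        }
        catch (NotSupportedException)
        {
            Console.WriteLine("write rejected");
        }

        Console.WriteLine(catalog.Names[0]); // a
    }
}
```

The wrapper is a view rather than a copy, which is why the constructor clones the incoming array before wrapping it.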

  • Anonymous
    February 23, 2010
    I like reviving threads over a year after they were last visited.  Anyhow, are there any active team discussions around having the convenience of an array combined with the beauty of immutability?

  • Anonymous
    January 24, 2011
    Well of course, it's just an extension of the interface segregation principle. Matter of fact, even IEnumerable<T> is often used more than it needs to be. A lot of times you have no desire to give your API's user the ability to enumerate through a collection; you merely want to allow them to call methods on every element - something that should be encapsulated in a Composite pattern.

  • Anonymous
    January 24, 2011
    I'm not saying that I agree particularly much with the BCL team's decision to have "IsReadOnly" on ICollection (and hence IList). But that decision at least makes returning a read-only IList valid, and it actually fits well with the rest of the framework. Now I would have liked both "obviously" read-only containers and separated input/output streams, but that's more taste than anything else. Great post anyhow.

  • Anonymous
    January 24, 2011
    > Arrays simply do not model any problem that I have at all well
    Then, of course, don't use arrays. However, arrays (vectors and matrices) do model problems that scientists and engineers have, and nothing else models them as well or as efficiently. So to call arrays harmful is to exclude a large and important class of applications simply because you don't happen to work on them. It would be more reasonable to note there is a need for immutable arrays not met by our current languages.

  • Anonymous
    January 24, 2011
    If you answer and then explain you're a good person. If you explain instead of answering you're a weasel, tut.