The Truth About Value Types

As you know if you've read this blog for a while, I'm disturbed by the myth that "value types go on the stack". Unfortunately, there are plenty of examples in our own documentation and in many books that reinforce this myth, either subtly or overtly. I'm opposed to it because:

  1. It is usually stated incorrectly: the statement should be "value types can be stored on the stack", instead of the more common "value types are always stored on the stack".
  2. It is almost always irrelevant. We've worked hard to make a managed environment where the distinctions between different kinds of storage are hidden from the user, unlike some languages, in which you must know whether a particular storage location is on the stack or the heap for correctness reasons.
  3. It is incomplete. What about references? References are neither value types nor instances of reference types, but they are values. They've got to be stored somewhere. Do they go on the stack or the heap? Why does no one ever talk about them? Just because they don't have a type in the C# type system is no reason to ignore them.

In the past, I've usually pushed back on this myth by saying that the real statement should be "in the Microsoft implementation of C# on the desktop CLR, value types are stored on the stack when the value is a local variable or temporary that is not a closed-over local variable of a lambda or anonymous method, and the method body is not an iterator block, and the jitter chooses to not enregister the value."

The sheer number of weasel words in there is astounding, but they're all necessary:

  • Versions of C# provided by other vendors may choose other allocation strategies for their temporary variables; there is no language requirement that a data structure called "the stack" be used to store locals of value type.
  • We have many versions of the CLI that run on embedded systems, in web browsers, and so on. Some may run on exotic hardware. I have no idea what the memory allocation strategies of those versions of the CLI are. The hardware might not even have the concept of "the stack" for all I know. Or there could be multiple stacks per thread. Or everything could go on the heap.
  • Lambdas and anonymous methods hoist local variables to become heap-allocated fields; those are not on the stack anymore.
  • Iterator blocks in today's implementation of C# on the desktop CLR also hoist locals to become heap-allocated fields. They do not have to! We could have chosen to implement iterator blocks as coroutines running on a fiber with a dedicated stack. In that case, the locals of value type could go on the stack of the fiber.
  • People always seem to forget that there is more to memory management than "the stack" and "the heap". Registers are neither on the stack nor the heap, and it is perfectly legal for a value type to go in a register if there is one of the right size. If it is important to know when something goes on the stack, then why isn't it important to know when it goes in a register? Conversely, if the register scheduling algorithm of the jit compiler is unimportant for most users to understand, then why isn't the stack allocation strategy also unimportant?
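
The hoisting described in the lambda and iterator bullets can be observed from plain C#, without knowing where anything is stored. The following is a minimal sketch; the names `HoistingDemo`, `MakeCounter` and `CountTo` are made up for illustration. It shows only the observable behaviour (the locals outlive the call that created them), not the allocation strategy any particular runtime uses.

```csharp
using System;
using System.Collections.Generic;

public static class HoistingDemo
{
    // 'count' is a local of value type, but the lambda closes over it,
    // so its storage must outlive this activation of MakeCounter.
    public static Func<int> MakeCounter()
    {
        int count = 0;
        return () => ++count;
    }

    // 'i' is a local of an iterator block; its value must survive
    // between MoveNext calls, after control has left the method body.
    public static IEnumerable<int> CountTo(int limit)
    {
        for (int i = 1; i <= limit; i++)
        {
            yield return i;
        }
    }

    public static void Main()
    {
        Func<int> next = MakeCounter();
        Console.WriteLine(next()); // 1
        Console.WriteLine(next()); // 2: 'count' survived MakeCounter returning

        foreach (int n in CountTo(3))
        {
            Console.WriteLine(n); // 1, then 2, then 3
        }
    }
}
```

Both `count` and `i` are of type int, yet neither can be treated as a short-lived local, which is why the "value types go on the stack" slogan needs the caveats above.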

Having made these points many times in the last few years, I've realized that the fundamental problem is in the mistaken belief that the type system has anything whatsoever to do with the storage allocation strategy. It is simply false that the choice of whether to use the stack or the heap has anything fundamentally to do with the type of the thing being stored. The truth is: the choice of allocation mechanism has to do only with the known required lifetime of the storage.

Once you look at it that way then everything suddenly starts making much more sense. Let's break it down into some simple declarative sentences.

  • There are three kinds of values: (1) instances of value types, (2) instances of reference types, and (3) references. (Code in C# cannot manipulate instances of reference types directly; it always does so via a reference. In unsafe code, pointer types are treated like value types for the purposes of determining the storage requirements of their values.)
  • There exist "storage locations" which can store values.
  • Every value manipulated by a program is stored in some storage location.
  • Every reference (except the null reference) refers to a storage location.
  • Every storage location has a "lifetime". That is, a period of time in which the storage location's contents are valid.
  • The time between a start of execution of a particular method and the method returning normally or throwing an exception is the "activation period" of that method execution.
  • Code in a method can require the use of a storage location. If the required lifetime of the storage location is longer than the activation period of the current method execution then the storage is said to be "long lived". Otherwise it is "short lived". (Note that when method M calls method N, the use of the storage locations for the parameters passed to N and the value returned by N is required by M.)

Now we come to implementation details. In the Microsoft implementation of C# on the CLR:

  • There are three kinds of storage locations: stack locations, heap locations, and registers.
  • Long-lived storage locations are always heap locations.
  • Short-lived storage locations are always stack locations or registers.
  • There are some situations in which it is difficult for the compiler or runtime to determine whether a particular storage location is short-lived or long-lived. In those cases, the prudent decision is to treat them as long-lived. In particular, the storage locations of instances of reference types are always treated as though they are long-lived, even if they are provably short-lived. Therefore they always go on the heap.

And now things follow very naturally:

  • We see that references and instances of value types are essentially the same thing as far as their storage is concerned; they go on the stack, in registers, or on the heap depending on whether the storage of the value needs to be short-lived or long-lived.
  • It is frequently the case that array elements, fields of reference types, locals in an iterator block and closed-over locals of a lambda or anonymous method must live longer than the activation period of the method that first required the use of their storage. And even in the rare cases where their lifetimes are shorter than that of the activation of the method, it is difficult or impossible to write a compiler that knows that. Therefore we must be conservative: all of these storage locations go on the heap.
  • It is frequently the case that local variables and temporary values can be shown via compile-time analysis to be unused after the activation period ends, and therefore can be treated as short-lived, and therefore can go onto the stack or be put into registers.
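
To make the lifetime argument concrete, here is a small sketch in which the same type, int, appears in storage with three different required lifetimes. The names `Counter`, `LifetimeDemo`, `SumTo` and `FirstSquares` are hypothetical, and the comments describe required lifetimes only; where a given runtime actually places each value remains an implementation detail.

```csharp
using System;

public sealed class Counter
{
    // A field of a reference type: it is needed for as long as the
    // Counter instance is reachable, so it is long-lived.
    public int Total;
}

public static class LifetimeDemo
{
    // 'sum' and 'i' are not used after SumTo returns: short-lived.
    public static int SumTo(int n)
    {
        int sum = 0;
        for (int i = 1; i <= n; i++)
        {
            sum += i;
        }
        return sum;
    }

    // The array elements are ints too, but the caller keeps using them
    // after this activation ends: long-lived.
    public static int[] FirstSquares(int n)
    {
        int[] squares = new int[n];
        for (int i = 0; i < n; i++)
        {
            squares[i] = (i + 1) * (i + 1);
        }
        return squares;
    }

    public static void Main()
    {
        Console.WriteLine(SumTo(4));      // 10
        int[] squares = FirstSquares(3);
        Console.WriteLine(squares[2]);    // 9
        var counter = new Counter { Total = SumTo(4) };
        Console.WriteLine(counter.Total); // 10
    }
}
```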

Once you abandon entirely the crazy idea that the type of a value has anything whatsoever to do with the storage, it becomes much easier to reason about it. Of course, my point above stands: you don't need to reason about it unless you are writing unsafe code or doing some sort of heavy interoperating with unmanaged code. Let the compiler and the runtime manage the lifetime of your storage locations; that's what its good at.

Comments

  • Anonymous
    September 29, 2010
    Very interesting read... thanks for this.

  • Anonymous
    September 29, 2010
    A really useful and solid post that I can see being the target of links from StackOverflow for years to come :) Tiny typo: "its" in the last sentence wants to be "they're", I think?

  • Anonymous
    September 29, 2010
    "There are three kinds of storage locations: stack locations, heap locations, and registers." Not trying to be a pedant, but just out of interest, do you consider the compile-time known strings (or indeed any other such 'baked in' reference types) to be "heap locations"? I assume fixed buffers within structs, despite looking superficially like an array, would be treated as not being reference types but instead simply a pointer to the interior of the struct, and thus inherit their storage rules from whatever happens to their parent. (stackalloc buffers follow from your statements on pointers without any special cases.) Would you consider thread statics to be (opaque) sugar around a stack location (even if just the threadid) and (possibly several) heap location(s)?

  • Anonymous
    September 30, 2010
    "It is frequently the case that array elements, fields of reference types, locals in an iterator block and closed-over locals of a lambda or anonymous method must live longer than the activation period.....so must go to the heap" Is the type the major driving factor in deciding what the lifetime of the value would be? How does the CLR decide what the required lifetime is? Based on the above statement, looks like this has been derived by  observing how types are used. Is there a particular logic that the CLR follows to determine the lifetime along with just looking at the type?

  • Anonymous
    September 30, 2010
    As a followup to my question, can we game the system? Meaning, can I include something in my program to make the CLR think a particular value goes to the heap instead of the stack?

  • Anonymous
    September 30, 2010
    Well, what you're saying is perfectly correct. However, from a developer's perspective, the most important difference to know about value types and reference types is that value types get copied in the stack when they're passed as arguments. Knowing that, the developer must avoid creating big value types that are copied inefficiently. So from that perspective it's practical for a developer to think that value types are stored in the stack.

  • Anonymous
    September 30, 2010
    That was really interesting and useful. Understanding storage allocation w.r.t. lifetime makes a lot more sense than mapping them to value/reference types, as the latter eventually leads to confusion. Just wanted to clarify: an instance of a struct with a reference type as a field will be stored on the stack (in a typical situation, minus the exceptions) and the reference to the reference type will be stored on the stack too. Is that accurate? (Because that is what I understood after reading the three rules in your previous blog blogs.msdn.com/.../the-stack-is-an-implementation-detail-part-two.aspx)

  • Anonymous
    September 30, 2010
    "It is frequently the case that array elements, fields of reference types ...<snip>... Therefore we must be conservative: all of these storage locations go on the heap." Does that mean if I have an array of ints in a local method, the ints in the array go on the heap?

  • Anonymous
    September 30, 2010
    @DaRage: Surely what's important is that the value is copied - not where it's copied to and from. If the value were being copied to the heap instead of to the stack, would that make it okay to have huge value types? Learning about the copying behaviour of arguments (and simple assignments) is obviously important, but I think the detail of heap/stack allocation is a distraction there.

  • Anonymous
    September 30, 2010
    Excellent post, Eric! By the way, I found your blog while searching for this subject (your older post). I think most of this "local variables are always stored on the stack" conviction comes from the unmanaged world. One time, in an interview, one guy asked me about this, knowing that my primary programming language is C#. I really never cared about this until that situation, just because I chose a managed language and I can live with the idea that the CLR is there to choose a better way to JIT my code. When I saw your post about "stack/heap storage is an implementation detail", it sounded like music to me. It's nice to know implementation details and how you can use them to get more performance, but I really don't think you need to close your mind to one idea. Recently I started working with C++ and an unmanaged environment, and I'm currently observing some different cultural aspects of the two worlds. Usually the C/C++ developers are more familiar with low-level programming: with drivers and embedded systems and applications closer to the operating system. And they are really worried about how the assembly code is generated and its performance impact. You always fall into discussions like "inline or not inline", "template or not template". No, no inheritance, because vtable function call indirections will cause performance trouble. All of them are good questions, but sometimes I really don't think the benefits pay the costs. In other words, I think these philosophical questions about performance x portability x control are what make these cultural differences and still some resistance to managed environments. The "I always need full control" x "I like building blocks" questions. Anyway, thanks for the precious information and the great content in your blog. Regards, Eric Lemes

  • Anonymous
    September 30, 2010
    I agree with DaRage - it's not just about doing pointer arithmetic, it's also about not doing anything dumb. If it really doesn't matter, then why do Value Types exist at all? Why not just make everything a Class or ensure that it's a long-lived heap variable, or better yet, let the compiler and runtime do what they're good at? An interesting read, nonetheless.

  • Anonymous
    September 30, 2010
    I appreciate all the technical details and deep insight, I have learned a lot more about their behavior and relationship with the rest of the ecosystem, however, I'm still left with the question "what IS a value type", as in a single statement that begins with "a value type is"...

  • Anonymous
    September 30, 2010
    Very good post. I'd make a small change though. In the sentence: (Note that when method M calls method N, the use of the storage locations for the parameters passed to N and the value returned by N is required by M.) I'd change "required by M" to "required by both M and N". It's just as correct, but makes it just a little bit clearer because you don't have to think "Which one was M again?"

  • Anonymous
    September 30, 2010
    DaRage, I agree with Jon. What the programmer needs to know is that value types are copied by VALUE, no matter if it is at stack

  • Anonymous
    September 30, 2010
    Where does a value type get stored if it's large enough to qualify for the Large Object Heap?  (Not that a struct that large would be a good idea, but for the sake of completeness.)

  • Anonymous
    September 30, 2010
    After having posted my previous comment, I've come up with its summary. Of course formally C# lacks the notion of stack. But it lacks this notion not in a storage-agnostic, but in a storage-indeterministic way. In this condition of indeterminacy, the developers have invented a number of abstractions which help them to write programs capable of handling predictable sizes of input data. And stack is one of these abstractions. So we can say that C# itself in its indeterminacy gave birth to the notion of stack. Here we are left with an important philosophical question. If C# itself gave birth to the notion of stack, shouldn't we consider the stack to be an integral part of C#?

  • Anonymous
    September 30, 2010
    @DaRage Value types are always copied (barring ref semantics, but then you're passing something else instead) no matter where they reside. If they are within an array and you assign the value in index 1 to the value in index 2, you create a copy of what is in 1 and place it in 2. There is no requirement that this goes via the stack; it is an excellent candidate for staying in registers or even being dealt with by some very funky logic in the CPU implementing a rapid memcpy. Understanding this is a help in understanding some of the reasons that making mutable value types is a very bad idea in most situations.

  • Anonymous
    September 30, 2010
    @DaRage: Suppose you have a class with two fields, and both of those fields are of some large value type. What do you think happens when you write field2 = field1; ? It copies the value of field1 into field2... possibly via a stack value (I don't know, and it's frankly irrelevant). Copying happens, and both values are on the heap (probably; see all Eric's caveats about implementation). In what way does your claim that "copying only happens in the stack and never on the heap" hold up? You seem to have missed the point of Eric emphasizing the "this is for the desktop CLR" bit - it's trying to make it very clear that this is implementation specific, rather than being inherently part of C#. That's an important point to understand, even if you also want to know what happens to go on in the desktop CLR.
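
The `field2 = field1` scenario in the comment above can be sketched as follows. The types `Big` and `Holder` are invented for illustration and do not come from the thread. The sketch shows only that assignment copies the whole value, even though both fields belong to an object that, on the desktop CLR, lives on the heap; it says nothing about whether the copy passes through the stack or registers.

```csharp
using System;

// A deliberately "large" value type (four 8-byte fields).
public struct Big
{
    public long A, B, C, D;
}

// A reference type whose fields are of the large value type.
public sealed class Holder
{
    public Big Field1;
    public Big Field2;
}

public static class CopyDemo
{
    public static Holder CopyThenMutate()
    {
        var h = new Holder();
        h.Field1.A = 1;
        h.Field2 = h.Field1; // copies the entire Big value
        h.Field1.A = 99;     // changes the original, not the copy
        return h;
    }

    public static void Main()
    {
        Holder h = CopyThenMutate();
        Console.WriteLine(h.Field1.A); // 99
        Console.WriteLine(h.Field2.A); // 1
    }
}
```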

  • Anonymous
    September 30, 2010
    "In this condition of indeterminacy, the developers have invented a number of abstractions which help them to write programs capable of handling predictable sizes of input data. And stack is one of these abstractions." The stack is one of these abstractions the developers of the implementation created. There are other alternatives. They may all be stupid, and the stack may be the best one, but there is nothing that necessitates the stack. C# has not given birth to the notion of the stack, it has given birth to the need for some storage mechanism which the implementation takes care of. In cases where you really need to squeeze out performance, it may be worthwhile knowing where the implementation stores the data. But these are edge cases, and the story of where it's actually stored is complicated. There is no need to pretend that the storage location is an important distinguishing factor between value and reference types. The conceptual difference of values being unchangeable values like '10' and reference types being … well anything else (at least conceptually); the fact value types store the data, while reference types store references to the data; the implications this has for allocation, and how copying a value type is likely a larger task than copying a reference; these things are enough to be teaching people for them to make a good distinction between when to use a value and a reference type, based on conceptual semantics and performance characteristics, without caring whether it's stored on the stack, the heap, or in your mother's porridge.

  • Anonymous
    September 30, 2010
    >value types are stored on the stack when the value is a local variable or temporary Does it require "or a field of stack-stored variable"?

  • Anonymous
    September 30, 2010
    @[ICR] >"The stack is one of these abstractions the developers of the implementation created." That is true. But in fact the stack+heap abstraction can be used to reason about all possible kinds of implementations. We can think of stack+heap as a 'storage model' (analogy with memory model intended). With such a model at hand (which defines a likely small stack, a likely large heap, allocation strategies and stuff) developers can reason about their programs' correctness. If in addition the language implementation in use defines how the model abstractions are mapped to hardware and software resources (e.g. address space regions, registers and stuff) then the developers can reason about their programs in concrete numbers (e.g. we need X Mb RAM to handle Y Mb input) and can reason about their programs' performance as well (knowing which model abstractions are implemented in the most efficient way). C# didn't have a storage model, so the developers invented an unofficial, folklore one. The desire to have a storage model is natural, so all these stack+heap talks come not from ignorance but rather from wisdom.

  • Anonymous
    September 30, 2010
    @petebu You are reasoning with a C++ background IMO, where you need to know more about how things really work at a low level. With C# and the CLR, as you say, a programmer has to be able to reason about the lifetime of objects. That is completely true. But that does not mean that he needs to know how the heap or stack are involved. Like Eric said, that is an implementation detail. No one is saying that you shouldn't know when or when not to use IDisposable, when to use value types or reference types, etc. What isn't so clear to me and to others is whether this knowledge necessarily needs to be tied to how the CLR manages the stack, heap, registers etc. It's basically the same paradigm as general OOP. You need to know how to use an object and what methods are better for different purposes, but you don't actually need to know how the object works internally.

  • Anonymous
    October 01, 2010
    @Eric. About CPS. Of course you are right that you can have a separate stack for return addresses and local vars. But if you CPS transform the program, all you've done is moved the return address into the same structure (the activation frame) as the local vars. There is still a call stack hidden away there. The return is still there too (a jump to the return address). I still don't see how you can implement subroutines without it. But that is beside the point. Forget how it is implemented. A total beginner looking at his first program will see that when he calls a method, all the local vars disappear and that when the subroutine returns, a) local vars reappear with the same values as before and b) the computer knew where in the caller code to return to. In other words the computer saved the local vars and return address. Now he notices that subroutine calls can be nested. So the computer must save a list of return addresses and local vars. Furthermore that list is only accessed in LIFO order. That is, it is a stack. This behaviour exists in all languages with subroutines (with the exception of those where vars have dynamic scope by default, e.g. Emacs Lisp). That's what I mean by a stack (not necessarily a concrete data structure) - all programmers must have a notion somewhat like this, so it is not just an implementation detail. Perhaps I am wrong on this so it's interesting to hear what others think. One could refrain from calling it a stack and talk about the "LIFO order of method activation". Or perhaps say that local variables have "lexical scope and dynamic extent/lifetime" but I am not sure if that helps (one then has to explain the precise meaning of "dynamic"). About registers. You said that "since the method call is never going to return then clearly I do not need to store the register values before the continuation is invoked because those register values are never going to be used again". When you invoke the continuation (that is, you jump to the return address - they are the same thing) you will have to restore the registers in use before you entered the subroutine. But the point I was trying to make is that registers are a bit of a red herring here. Your entire blog post could have been written without mentioning registers. The CLR doesn't have them and there is no need to reach below the CLR to make your argument. About short/long-lived references. My main objection is to these two terms. "Short-lived" and "long-lived" relative to what? Is it measured in seconds, CPU cycles, method activation periods? The GC might run during the current method call - then the contents of a long-lived location might be freed before the local short-lived locations. The important thing about the stack location isn't that it is relatively short-lived compared to a heap location but that the lifetime is a) deterministic and b) corresponds to method activations. For heap locations, the important idea is that (in a GCed language) the lifetime is indeterminate. If you don't like calling them stack and heap locations you need a better name that conveys the lifetime. Perhaps, "locations with dynamic lifetime" (these exist during the dynamic lifetime of a method activation) and "locations with indeterminate lifetime" (these exist until an indeterminate point in the future when the GC runs)? Btw, I agree with your original blog post that we should concentrate on what is observable, not on how it is implemented. When someone asks about the difference between value types and reference types, one of the key differences that doesn't get mentioned much is that a value type represents the set of values that you think it does, whereas a reference type represents the set of values that you think it does union the set containing null. Once we forbid nulls and make heavy use of immutable objects then the observable distinction between value types and reference types begins to fade away. This happens in F# for instance (unfortunately nulls can sneak in from C# code).

  • Anonymous
    October 01, 2010
    CunningDave: "then why do Value Types exist at all?  Why not just make everything a Class or a ensure that it's a long-lived heap variable, or better yet, let the compiler and runtime do what they're good at?" Because Value Types represent something semantically very different than Reference Types. Value Types do not have Identity. Reference Types do.
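
The identity point in the comment above can be illustrated with a short sketch. The types `PointValue` and `PointRef` are hypothetical and chosen only for this example. It relies on the default `ValueType.Equals`, which compares field values, and on `object.ReferenceEquals`, which asks whether two references refer to the same object.

```csharp
using System;

public struct PointValue
{
    public int X;
    public PointValue(int x) { X = x; }
}

public sealed class PointRef
{
    public int X;
    public PointRef(int x) { X = x; }
}

public static class IdentityDemo
{
    // Two struct values with the same contents are simply equal.
    public static bool ValuesAreEqual()
    {
        var a = new PointValue(1);
        var b = new PointValue(1);
        return a.Equals(b);
    }

    // Two objects with the same contents are still distinct objects.
    public static bool DistinctObjectsAreSame()
    {
        var a = new PointRef(1);
        var b = new PointRef(1);
        return object.ReferenceEquals(a, b);
    }

    // Two references to one object observe each other's changes.
    public static bool AliasSeesChange()
    {
        var a = new PointRef(1);
        PointRef b = a;
        b.X = 2;
        return a.X == 2 && object.ReferenceEquals(a, b);
    }

    public static void Main()
    {
        Console.WriteLine(ValuesAreEqual());         // True
        Console.WriteLine(DistinctObjectsAreSame()); // False
        Console.WriteLine(AliasSeesChange());        // True
    }
}
```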

  • Anonymous
    October 03, 2010
    Is it safe to say that reference types always go on the heap?

  • Anonymous
    October 03, 2010
    Vince: it is safe to say that, in current Microsoft .NET implementation of CLR, storage referenced by values of reference types is always allocated on the heap.

  • Anonymous
    October 03, 2010
    In case we didn't already have enough confusion between the x86 stack (PUSH and POP instructions and the ESP register), the stack of nested method activations (which as Eric points out doesn't actually have to use the x86 stack), the register file (locations in which become associated and dissociated with formal x86 registers such that even local variables which aren't assigned to registers by the JIT might in fact be held in the register file), and Stack<T> instances, let me remind you about this thing called an "MSIL operand stack" which IS part of the .NET specification. While there may be no requirement that a data structure called "the stack" be used to store local variables, in MSIL all method parameters are required to be placed onto "the operand stack", and of course that says absolutely nothing about the machine code finally generated by the JIT, which could even (during inlining) eliminate parameters which do not vary at runtime and not store the parameters anywhere.

  • Anonymous
    October 04, 2010
    I wish I'd seen this post a few days back... someone should have sent petebu off to learn about Scheme's call/cc. There's always a stack, and function lifetimes are LIFO? Pfft.

  • Anonymous
    October 04, 2010
    Jon, coming at this from the other direction -- how then would you describe the difference between value and reference types?

  • Anonymous
    October 04, 2010
    Types may not mandate specific storage, but they do impose constraints on what storage strategies make sense to implement, because a type defines a set of operations on its values and their semantics. Reference types in C# have 2 properties:

      • Mutable values with changes visible via all variables pointing to the object
      • Reference equality operation (do these 2 variables point to the same object?)

    Without these operations one could consider some copying strategies -- e.g. allocate all objects on the stack, then migrate to the heap those objects whose references were stored in longer-lived objects. (Also migrate heap objects to thread-local heaps, node-local heaps in distributed systems etc.) "Migration" could be done by simple copying. The 2 constraints I mentioned above make such strategies either not efficient (need to update all references to point to the new copy -- mini-gc) or not usable in the general case (JVMs do escape analysis and stack allocate objects which are provably not leaked outside). The semantics of value types in C# were carefully crafted to allow straightforward allocation of their values on the stack.

  • Anonymous
    October 05, 2010
    I had an interview question on this recently, some blah blah about where value types and reference types are stored. I thought that the answer "Who cares?" was probably not what they were looking for so gave the usual, and apparently wrong, spiel. I didn't get the job but I wish I had their email addresses to send them this link.

  • Anonymous
    October 05, 2010
    @Brian, imagine for a second they read this blog and therefore wanted you to say "Who cares? It's an implementation detail!"

  • Anonymous
    October 05, 2010
    Why have value types at all? What's the point?

  • Anonymous
    October 05, 2010
    JeffC: Imagine that you have a bitmap manipulation class. It keeps an array of Pixel instances, where each Pixel contains an Opacity instance and 3 Color instances: struct Opacity { byte Value; } struct Color { byte Value; } struct Pixel { Opacity Opacity; Color Red, Green, Blue; } Since those are all value types, the image from my 12 megapixel camera will take 12M * 4, or 48MB. Each pixel will be stored adjacent to its neighbor in memory and the values are easy to access. If those were reference types, creating the array of 12M of them would allocate 48MB of memory just for references to Pixels (96MB on a 64-bit machine). Then you would have to loop through all 12M of the array elements to create 12M Pixel instances, each with at least 20 bytes (4 references plus overhead, on a 32-bit machine), for an additional 240MB. Of course for each pixel, you need to create 3 Colors and an Opacity, each being at least 8 bytes, adding 384MB more. So now your pixel array takes up 670MB instead of 48MB, the pixels are nowhere near each other in RAM (causing lots of page faults and cache misses), and you have to follow two references to get each value. Now do you see why we have value types?
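
The arithmetic in the comment above can be checked with a short sketch. The struct declarations follow the comment; the `PixelMath` class and its method names are made up here, and the per-object sizes (20 bytes per Pixel object, 8 bytes per component object, 4-byte references) are the commenter's 32-bit estimates, taken as assumptions rather than measured values.

```csharp
using System;

public struct Opacity { public byte Value; }
public struct Color { public byte Value; }
public struct Pixel
{
    public Opacity Opacity;
    public Color Red, Green, Blue;
}

public static class PixelMath
{
    public const long PixelCount = 12000000; // a 12 megapixel image

    // Value types: four one-byte fields per pixel, stored inline in the array.
    public static long ValueTypeBytes()
    {
        return PixelCount * 4;
    }

    // Reference types, using the comment's 32-bit estimates.
    public static long ReferenceTypeBytes()
    {
        long references = PixelCount * 4;     // array of references to Pixel objects
        long pixels = PixelCount * 20;        // Pixel objects: 4 references plus overhead
        long components = PixelCount * 4 * 8; // 3 Color objects and 1 Opacity object per pixel
        return references + pixels + components;
    }

    public static void Main()
    {
        Console.WriteLine(ValueTypeBytes());     // 48000000
        Console.WriteLine(ReferenceTypeBytes()); // 672000000
    }
}
```

Under those assumptions the totals come to 48 MB versus 672 MB, which matches the comment's rounded figure of roughly 670 MB.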

  • Anonymous
    October 05, 2010
    very good post. n thanx for awareness about it.

  • Anonymous
    October 05, 2010
    I just wonder how important this (stack or no stack) could be for normal development?

  • Anonymous
    October 07, 2010
    but if I give the complete answer, there won't be time to ask any other questions in the interview, man.

  • Anonymous
    October 08, 2010
    The myth is proved the belief that people had till day! - nice article. Like old say "Look and make sure what you see, and dont believe on rumors"

  • Anonymous
    October 17, 2010
    Awesome read! Thanks for sharing the thoughts

  • Anonymous
    October 24, 2010
    Eric, thanks for the interesting information :) Common question on the interviews about where are value types stored now becomes very interesting. :) Could you also introduce printer-friendly version of your blog?

  • Anonymous
    October 24, 2010
    Very useful and detailed analysis - thanks the author for that! Although I wouldn't be agree with the preceding comment that stands the change to "required by both M and N" is "just as correct": the actual scope of N concern includes allocation of its own automatic variables, referential types etc. that is obviously wider than the passed parameters and return value. Thus the original statement makes more sense for me...

  • Anonymous
    October 25, 2010
    So, just to clarify, when we allocate an array of value types, it is allocated on the heap, OK? But what does this array contain? Does it contain references to boxed values? Or does it contain value type values?
    Work it out from first principles. What is an array? An array is a collection of variables of a particular type, called the element type. What do we know about variables of value type? A variable of value type contains the value. (Unlike a variable of reference type, which contains a reference to the value.) Therefore there is no boxing; why would there be? An array of ints is a collection of variables of type int, not a collection of variables of type object. You seem to still be reasoning from the fallacy that "anything on the heap is always an object". That is completely false. What is true is that variables are storage locations, and that storage locations can be on the stack or the heap, depending on their known lifetimes. - Eric
    If it contains value type values, how does the runtime know what type of values resides in the array?
    Well, how does the runtime know that a field of a class is of type int? A class (or struct) is a collection of variables (called fields); an array is a collection of variables (called elements). The runtime can somehow get type information from the object about what the type of one of its variables is. How it does so is an implementation detail. - Eric
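Eric's point that array elements are variables holding the values themselves can be observed from ordinary code. The following is only an illustrative sketch; `Point` and `ArrayOfValuesDemo` are made-up names, and the sketch shows observable behavior, not how the runtime lays out memory.

```csharp
using System;

struct Point { public int X; }

static class ArrayOfValuesDemo
{
    // Each element of Point[] is a variable of type Point, so a field
    // can be assigned in place through the indexer. No box is involved.
    public static int WriteElementInPlace()
    {
        Point[] points = new Point[3];
        points[1].X = 42;
        return points[1].X;
    }

    // With object[] each element holds a reference to a boxed copy, so a
    // later change to the original local is not visible through the array.
    public static int WriteAfterBoxing()
    {
        object[] boxes = new object[3];
        Point p = new Point();
        boxes[1] = p; // boxing: copies the current value of p
        p.X = 42;     // modifies only the local variable
        return ((Point)boxes[1]).X;
    }

    // The array object itself knows its element type.
    public static Type ElementTypeOf(Array array) => array.GetType().GetElementType();

    static void Main()
    {
        Console.WriteLine(WriteElementInPlace());     // 42
        Console.WriteLine(WriteAfterBoxing());        // 0
        Console.WriteLine(ElementTypeOf(new int[3])); // System.Int32
    }
}
```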

  • Anonymous
    October 25, 2010
    @Dmitry The reason is that any reference to an element of that array must come from one of the following:
        int[] ints = new int[1000000];
        // compile time known: the compiler knows the types anyway
        var x = ints[20];
        // esoteric: pointers in unsafe context. again the compiler knows the type
        fixed (int* p = &ints[20]) {}
        // runtime known
        Array a = ints;
        object o = a.GetValue(20); // here is the runtime checking
        int i = (int)o; // unboxing occurs, you must use int and not, say, long
    In that last example Array.GetValue requires an object return type, but since it is an int array, just 'grabbing' the value at offset IntPtr.Size from the start of the array's data section won't work. Instead it uses TypedReferences (which are basically two pointers, one to the value, one to the type it is), and the CLR supplies a function on Array to ensure that it gets the right pointer based on the type of the array, which is known because an array is an object, and just like all other objects it has a record in its object header that contains a pointer to its type (used for reflection, vtables and the like). This function is (as of 2.0):
        [MethodImpl(MethodImplOptions.InternalCall)]
        private extern unsafe void InternalGetReference(void* elemRef, int rank, int* pIndices);
    It will have backing code which does the runtime type checking. The second function involved is on TypedReference:
        [MethodImpl(MethodImplOptions.InternalCall)]
        internal static extern unsafe object InternalToObject(void* value);
    This will do the job of boxing the resulting value as the right type (in this case a boxed int) rather than just passing it along as a reference, as it would if the array were, say, string[]. Therefore you can always get the appropriate type based on the array itself.
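The "runtime known" case from the comment above can be tried with public APIs only. This sketch does not touch the internal functions the comment mentions; it only shows that `Array.GetValue` hands back a boxed int and that the box must be unboxed to exactly `int`. The class and method names are hypothetical.

```csharp
using System;

static class GetValueBoxingDemo
{
    // Reads an element through the untyped Array API; a value-type element comes back boxed.
    public static object ReadViaArray(Array array, int index) => array.GetValue(index);

    // Unboxing requires the exact value type: a boxed int cannot be unboxed as long.
    public static bool UnboxAsLongThrows(object boxed)
    {
        try
        {
            long unused = (long)boxed;
            return false;
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }

    static void Main()
    {
        Array a = new int[] { 10, 20, 30 };
        object o = ReadViaArray(a, 1);
        Console.WriteLine(o.GetType());          // System.Int32
        Console.WriteLine((int)o);               // 20
        Console.WriteLine(UnboxAsLongThrows(o)); // True
        Console.WriteLine(Convert.ToInt64(o));   // 20
    }
}
```

If a `long` is really what you need, unbox to `int` first and then convert, or go through `Convert.ToInt64` as in the last line.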

  • Anonymous
    October 26, 2010
    Thanks for an excellent article and excellent comments, including the debates. I know that some people get turned off by the criticisms in the comments, but honestly, reading a really well-thought-out debate helps me learn more about this.

  • Anonymous
    October 29, 2010
    Great perspective, and a very helpful explanation -- But I thought it would be good to share the message that I take away from reading this article: From a game programmer's point of view, focusing on creating the highest-performance code, I should not use C# to produce the most efficient code because I have no way of telling where my data is being stored and what impact it will have on garbage collection performance. Instead, I should simply stick with a non-managed language to ensure predictable performance derived from known memory management. Haha, so that statement was quite extreme: I'm mostly saying this because I would love to hear these types of perspectives, like the one shown in this article, balanced with a bit more detail on the performance impacts of taking this point of view of the language. In reality, if I were programming for the Xbox 360, I would actually study the specific implementation details in order to create code that would be "performance friendly" for that specific platform, even though this might be against the "nature" and goals of the managed C# language.

  • Anonymous
    November 09, 2010
    Excellent post, Eric. Can you please shed some light on how static variables, static methods, and static classes are stored?

  • Anonymous
    December 14, 2010
    Off topic, but I find statements like "in the Microsoft implementation of C# on the desktop CLR, value types are stored on the stack when the value is a local variable or temporary that is not a closed-over local variable of a lambda or anonymous method, and the method body is not an iterator block, and the jitter chooses to not enregister the value" much easier to understand if I imagine Anders saying them.

  • Anonymous
    January 17, 2011
    Isn't the title a bit melodramatic... Sounds like a 1950's black and white

  • Anonymous
    February 06, 2011
    Very good post. I think that we should always use "Thread Stack" instead of "stack" in sentences. It will be helpful for beginners.

  • Anonymous
    April 14, 2011
    Knowing I am working on an 8-way Xeon, this discussion seems rather silly. There are about 8000 registers, 256 memory indexers, 64 scoreboards, and 128MB of various caches. Per Intel, 98% of things loaded in registers are never used (mostly predictive branching). This means the stack and heap are in about 400 places for every clock cycle (assuming you could actually keep the processors loaded). Hiding this mess from the user is a real fine idea.

  • Anonymous
    May 19, 2011
    Of course, now you need to add async methods to your list of weasel words.

  • Anonymous
    March 29, 2012
    One of the best (and only) truths that I have ever enjoyed reading!

  • Anonymous
    November 10, 2012
    Excellent article. Thank you!