Why do ref and out parameters not allow type variation?

Here's a good question from StackOverflow:

If you have a method that takes an "X" then you have to pass an expression of type X or something convertible to X. Say, an expression of a type derived from X. But if you have a method that takes a "ref X", you have to pass a ref to a variable of type X, period. Why is that? Why not allow the type to vary, as we do with non-ref calls?

Let's suppose you have classes Animal, Mammal, Reptile, Giraffe, Turtle and Tiger, with the obvious subclassing relationships.

Now suppose you have a method void M(ref Mammal m). M can both read and write m. Can you pass a variable of type Animal to M? No. That would not be safe. That variable could contain a Turtle, but M will assume that it contains only Mammals. A Turtle is not a Mammal.

Conclusion 1: Ref parameters cannot be made "bigger". (There are more animals than mammals, so the variable is getting "bigger" because it can contain more things.)

Can you pass a variable of type Giraffe to M? No. M can write to m, and M might want to write a Tiger into m. Now you've put a Tiger into a variable which is actually of type Giraffe.

Conclusion 2: Ref parameters cannot be made "smaller".
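Conclusions 1 and 2 can be sketched in C# as follows. This is a minimal illustration, not code from the original question: the class names come from the text, but the empty class bodies and the `Zoo` container are assumptions. The two rejected calls are left commented out because the compiler refuses to compile them.

```csharp
using System;

// Stand-in hierarchy; the empty bodies are assumptions made for this sketch.
class Animal { }
class Mammal : Animal { }
class Reptile : Animal { }
class Giraffe : Mammal { }
class Tiger : Mammal { }
class Turtle : Reptile { }

static class Zoo
{
    // M may both read and write m; here it overwrites m with a Tiger.
    public static void M(ref Mammal m)
    {
        m = new Tiger();
    }

    static void Main()
    {
        Animal animal = new Turtle();
        Giraffe giraffe = new Giraffe();
        Mammal mammal = new Giraffe();

        // M(ref animal);   // rejected: M would be able to read a Turtle as a Mammal
        // M(ref giraffe);  // rejected: M would store a Tiger in a Giraffe variable

        M(ref mammal);      // allowed: the variable's type is exactly Mammal
        Console.WriteLine(mammal.GetType().Name); // Tiger
    }
}
```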

Now consider N(out Mammal n).

Can you pass a variable of type Giraffe to N? No. As with our previous example, N can write to n, and N might want to write a Tiger.

Conclusion 3: Out parameters cannot be made "smaller".

Can you pass a variable of type Animal to N?

Hmm.

Well, why not? N cannot read from n, it can only write to it, right? You write a Tiger to a variable of type Animal and you're all set, right?

Wrong. The rule is not "N can only write to n". The rules are, briefly:

1) N has to write to n before N returns normally. (If N throws, all bets are off.)
2) N has to write something to n before it reads something from n.

That permits this sequence of events:

  • Declare a field x of type Animal.
  • Pass x as an out parameter to N.
  • N writes a Tiger into n, which is an alias for x.
  • On another thread, someone writes a Turtle into x.
  • N attempts to read the contents of n, and discovers a Turtle in what it thinks is a variable of type Mammal.

That scenario -- using multithreading to write into a variable that has been aliased -- is awful and you should never do it, but it is possible.

UPDATE: Commenter Pavel Minaev correctly notes that there is no need for multithreading to cause mayhem. We could replace that fourth step with:

  • N makes a call to a method which directly or indirectly causes some code to write a Turtle into x.

Regardless of how the variable's contents might get altered, clearly we want to make the type system violation illegal.

Conclusion 4: Out parameters cannot be made "larger".
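The aliasing that makes this dangerous can be observed in perfectly legal code. The sketch below keeps the types exact (so it compiles) and only shows that an out parameter is an alias for the caller's variable: a helper called from N changes what N subsequently reads through n. The names `Aliasing`, `OverwriteX`, and the field `x` are hypothetical and chosen for this illustration.

```csharp
using System;

class Animal { }
class Mammal : Animal { }
class Giraffe : Mammal { }
class Tiger : Mammal { }

static class Aliasing
{
    public static Mammal x;

    // Stands in for "some code that directly or indirectly writes into x".
    static void OverwriteX()
    {
        x = new Giraffe();
    }

    public static string N(out Mammal n)
    {
        n = new Tiger();          // n is an alias for x, so x now holds a Tiger
        OverwriteX();             // x, and therefore n, now holds a Giraffe
        return n.GetType().Name;  // reads back through the alias
    }

    static void Main()
    {
        Console.WriteLine(N(out x)); // Giraffe
    }
}
```

If x were allowed to be declared as Animal, the helper could just as easily store a Turtle, and N would then read a Turtle through a parameter it believes is a Mammal.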

There is another argument which supports this conclusion: "out" and "ref" are actually exactly the same behind the scenes. The CLR only supports "ref"; "out" is just "ref" where the compiler enforces slightly different rules regarding when the variable in question is known to have been definitely assigned. That's why it is illegal to make method overloads that differ solely in out/ref-ness; the CLR cannot tell them apart! Therefore the rules for type safety for out have to be the same as for ref.
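One way to see this for yourself is through reflection. The following is a small sketch, with a hypothetical `Demo` class, that compares the parameter types the runtime reports for a ref parameter and an out parameter. The overload that differs only in ref/out-ness is commented out because the compiler rejects it.

```csharp
using System;
using System.Reflection;

class Mammal { }

static class Demo
{
    public static void M(ref Mammal m) { }
    public static void N(out Mammal n) { n = new Mammal(); }

    // public static void N(ref Mammal n) { }  // rejected: differs from N(out Mammal) only in ref/out

    static void Main()
    {
        ParameterInfo refParam = typeof(Demo).GetMethod("M").GetParameters()[0];
        ParameterInfo outParam = typeof(Demo).GetMethod("N").GetParameters()[0];

        // Both parameters have the same by-reference type, Mammal&.
        Console.WriteLine(refParam.ParameterType == outParam.ParameterType);          // True
        Console.WriteLine(outParam.ParameterType == typeof(Mammal).MakeByRefType());  // True

        // The difference is only a flag in the parameter's metadata.
        Console.WriteLine(refParam.IsOut); // False
        Console.WriteLine(outParam.IsOut); // True
    }
}
```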

Final conclusion: Neither ref nor out parameters may vary in type at the call site. To do otherwise is to break verifiable type safety.
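In practice, a caller who wants the result in a variable of a larger type can use a local of exactly the declared type and then convert with an ordinary assignment. This is a minimal sketch of that pattern, assuming a hypothetical N that always produces a Tiger and stand-in class bodies; it is not something the post itself prescribes.

```csharp
using System;

class Animal { }
class Mammal : Animal { }
class Tiger : Mammal { }

static class Caller
{
    public static void N(out Mammal n)
    {
        n = new Tiger();
    }

    static void Main()
    {
        Animal animal;

        // N(out animal);  // rejected: the argument must be a variable of type Mammal

        Mammal temp;
        N(out temp);       // n aliases temp, which nothing else can see or overwrite
        animal = temp;     // ordinary implicit conversion from Mammal to Animal

        Console.WriteLine(animal.GetType().Name); // Tiger
    }
}
```

Because the copy into `animal` happens only after N has returned, N never holds an alias to a variable whose type is wider than Mammal.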

Comments

  • Anonymous
    September 21, 2009
    Interesting post... it feels great to know why there are certain restrictions :) waiting for more on such obscure issues... Thanks :)

  • Anonymous
    September 21, 2009
    In languages like Ada, the out parameter is modified upon exit (in which case, the variable can never be read). There's a name for this convention, which I forgot, to distinguish it from call-by-reference.

  • Anonymous
    September 21, 2009
    > That scenario -- using multithreading to write into a variable that has been aliased -- is awful and you should never do it, but it is possible.

    It doesn't even have to involve multithreading, so far as I can tell. It can simply be that N called some other method M on the same thread, and M then modified x before returning, which is probably a much more likely occurrence.

    Reference-to-const in C++ has the same problems (and is more subtle, because it's implicit at the call site and is a heavily used parameter-passing mode, unlike C#'s relatively rare ref/out). If I recall correctly, it's part of the reason why all STL algorithms take begin/end iterators by value: if they were passed by reference, the function passed to the algorithm could, for example, change the value of the end iterator in the middle of iteration, with unpredictable results.

    On an unrelated note, there seems to be something wrong going on with captchas and registered users. If I try to post messages to any MSDN blog, including yours, while logged in, they seem to go straight into the bit bucket, with no error messages. If I sign out and post them anonymously, the same messages are posted just fine. This seems to have started when captchas got introduced; previously, it all worked just fine.

  • Anonymous
    September 21, 2009
    Didn't you already discuss this about a month or two ago? I distinctly remember reading something about this exact topic...
    Reread the first sentence. Follow the link. -- Eric
    I also distinctly remember the first time I tried to enter this comment. Nothing happened :(
    Bummer! -- Eric

  • Anonymous
    September 21, 2009
    Pavel - M can only modify x to be a Mammal or a subtype of Mammal, so how is type safety lost?

    Nice one, Eric. Thinking about it from a different angle, the rules exist to prevent ref and out from changing the (static) type of the storage location itself; i.e., N demands x to be of type Mammal whereas it's declared to be of type Animal. The rules still allow the runtime type to be different: the out parameter can obviously be set to a subtype of Mammal.

  • Anonymous
    September 22, 2009
    Senthil, I believe Pavel was talking about a scenario like the one below. It is an example of the type violation that could occur if out parameters could be made "larger".

    class MyAnimal
    {
        public MyAnimal()
        {
            // Hypothetical: an Animal field is passed where an out Mammal is expected
            SetMammalAndMilk(out animal);
        }
        Animal animal;
        void SetMammalAndMilk(out Mammal mammal)
        {
            mammal = new Tiger();
            SetAnimalToTurtle();
            // mammal is now a turtle, because it aliases the animal field
            mammal.Milk(); // You can't milk a turtle
        }
        void SetAnimalToTurtle()
        {
            animal = new Turtle();
        }
    }

  • Anonymous
    September 22, 2009
    Yes, exactly. The real problem here is aliasing semantics of out-parameters (which means they aren't really "out" in the traditional sense used by Ada or COM/Corba, and more like "ref mustinit" - but we know that already).

  • Anonymous
    September 23, 2009
    Thanks for the corrections, Eric. I had not thought about the possibility that the compiler could reject that example. I didn't try because a) I do not have a C# compiler at home and b) it is not possible to check a standard by running a compiler (although, as in this case, it can give hints about what is going on).

    BTW, for those trying to look up the hint: it took me a while to find that section 5.3.3.13. The Ecma standard I looked at (<http://www.ecma-international.org/publications/files/ECMA-ST/Ecma-334.pdf>) adds 7 to (most?) chapter numbers, compared to <http://download.microsoft.com/download/3/8/8/388e7205-bc10-4226-b2a8-75351c669b09/csharp%20language%20specification.doc>. It has "Try-catch statements" in section 12.3.3.13.

  • Anonymous
    September 24, 2009
    A good example of a language which tries "to be an OOP language" is Java - remember all the "delegates are evil, because they're not object oriented" talk in that camp back in the days of J++? - and look where they are now; still no first-class functions in the language nor any plans for them in the upcoming release, despite a heavy debate on the issue in the last 2 years or so. I'll take C#'s pragmatic hybrid approach over that every day.

  • Anonymous
    September 24, 2009
    @DrBlaise:

    public MyAnimal()
    {
        SetMammalAndMilk(out animal);
    }
    Animal animal;
    void SetMammalAndMilk(out Mammal mammal)
    {
        .........
        .........
    }

    Not every animal is a mammal. Is the above a valid derivation relationship?

  • Anonymous
    September 29, 2009
    I have to say, I'm rather surprised that 'out' was designed the way it was (as an alias). As we've established in this post and its comments, the current way has these drawbacks:

      • the aliased variable could be altered by a method call or another thread
      • you can't make use of polymorphism.

    Had it been designed the 'Ada' way, where the parameter is modified upon exit, these drawbacks do not apply. What drawbacks are there to this method? I can only think of one: the assignment upon exit slows things down. Very slightly. Are there any other drawbacks I haven't thought of? If not, then why was 'out' designed the way it was?

  • Anonymous
    January 25, 2010
    So do you all see EXTENSION METHODs as a kind of "fudge" then in case the original developer forgot to add a method? I would still like to be able to add another property to all CONTROLS like this rather than create a new UserControl using inheritance each time. I.e., see the commented-out section:

    Option Strict On
    Imports System.Runtime.CompilerServices

    Module MyExtensions
        <Extension()> _
        Public Sub Shape(ByVal Ctrl As Control, Optional ByVal NumberOfSides As Integer = 3, Optional ByVal OffsetAngleInDegrees As Double = 0)
        End Sub

        Public Enum ShapeStyle
            Oval
            Triangle
            Rectangle
            Pentagon
            Hexagon
            Septagon
            Octagon
            Nonagaon
            Decagon
            Custom
        End Enum

        Private mShape As ShapeStyle

        '<Extension()> _
        'Public Property ControlShape(ByVal ctrl As Control) As ShapeStyle
        '    Get
        '        Return mShape
        '    End Get
        '    Set(ByVal value As ShapeStyle)
        '        mShape = value
        '    End Set
        'End Property
    End Module

    Regards, John

  • Anonymous
    January 25, 2010
    It looks like the date and time is wrong on the comments too. Does the server time need setting?

  • Anonymous
    October 26, 2012
    Well, that's bad. Is there any way to make it work anyway (I don't care whether it's unsafe or not)? For example, how am I going to convert a dictionary (key, DataTable) into a TreeView if this stupid restriction doesn't even allow a certain indexed table to be accessed? Take this "lisp-like" function, for example:

    // This is the recursive function that makes calls and reads row by row and find/add new subtables to legend
    public static void GetSubTablesByKeyRecursion(ref TreeView trwTablesList, ref Dictionary<string, DataTable> dataTableLegend, int depth, DataTable currentDataTable)
    {
        // check if current node is terminal
        if ((depth == 0) && (GetSubTablesByKey(1, currentDataTable) == null))
        {
        }
        else
        {
            foreach (DataRow row in currentDataTable.Rows)
            {
                Int32 currentId = Convert.ToInt32(currentDataTable.Rows[0]["Id"]);
                // The second argument does not compile: ref requires a variable, not the result of a method call
                GetSubTablesByKeyRecursion(ref trwTablesList, ref GetSubTablesByKey(currentId, GetDTFromLegendByIndex(dataTableLegend.Keys.Count - 1, dataTableLegend)), depth + 1, GetDTFromLegendByIndex(dataTableLegend.Keys.Count - 1, dataTableLegend));
            }
        }
    }

    // Find all subtables of a table by Id
    private static Dictionary<string, DataTable> GetSubTablesByKey(object id, DataTable dtParent)
    {
        // all subtables will be placed in the dataset
        Dictionary<string, DataTable> dataTableLegend = new Dictionary<string, DataTable>();
        // find roots, place them in the "roots" key value dictionary row
        DataRow[] dtRows = dtParent.Select("ParentID is null AND isDeleted == false");
        DataTable dt = new DataTable();
        foreach (DataRow row in dtRows)
        {
            dt.Rows.Add(row);
        }
        dataTableLegend.Add(id.ToString(), dt);
        return dataTableLegend;
    }

    private static DataTable GetDTFromLegendByIndex(object index, Dictionary<string, DataTable> dtLegend)
    {
        int count = 0;
        if (dtLegend != null)
        {
            foreach (KeyValuePair<string, DataTable> dtLegendRow in dtLegend)
            {
                if (count == (Int32)index)
                {
                    return dtLegendRow.Value;
                }
            }
            return null;
        }
        return dtLegend.First().Value;
    }

    Remember, my goal is to make a TreeView data source out of a DataTable. THAT IS MY TASK. (Maybe I could convert the table to XML and do all this conversion in XML, though I doubt that is any better.) So yes, that table/list needs to be converted into a tree. And why do I even need to write a method for accessing a value or key by index in a dictionary? That restriction is "excessive" too, IMO; both are arrays/collections. Sorry for not being all that "positive", but if anyone sees this post, please reply as soon as you can.