Chaining simple assignments is not so simple
UPDATE: I interrupt this episode of FAIC with a request from my friend and colleague Lucian, from the VB team, who wonders whether it is common in C# to take advantage of the fact that assignment expressions are expressions. The most common usage of this pattern is the subject of this blog entry: the fact that "chained" assignment works at all is a consequence of the fact that assignments are expressions, not statements. There are other uses too; one could imagine something like "return this.myField = x;" as a short cut for "this.myField = x; return this.myField;" -- perhaps we are performing some computation and then recording the results for use later. Or perhaps we've got something like myNonNullableString = (myNullableString = Foo()) ?? "<null>"; -- there are any number of ways this idiom could be used.
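To make those two idioms concrete, here is a minimal sketch. The class names Cache and UpdateIdioms, the method ComputeAndRemember, and the body of Foo are hypothetical, invented purely for illustration; only the two assignment expressions come from the paragraph above.

```csharp
using System;

class Cache
{
    private int lastResult;

    // Records the computed value in a field and returns it in one expression,
    // instead of "this.lastResult = ...; return this.lastResult;".
    public int ComputeAndRemember(int x)
    {
        return this.lastResult = x * 2;
    }

    public int LastResult { get { return lastResult; } }
}

static class UpdateIdioms
{
    // Hypothetical stand-in for the Foo() in the text; it always returns null here.
    static string Foo() { return null; }

    public static string Describe()
    {
        string myNullableString;
        // The inner assignment yields the value assigned, which ?? then inspects.
        string myNonNullableString = (myNullableString = Foo()) ?? "<null>";
        return myNonNullableString;
    }

    static void Main()
    {
        Cache cache = new Cache();
        Console.WriteLine(cache.ComputeAndRemember(21)); // 42
        Console.WriteLine(cache.LastResult);             // 42
        Console.WriteLine(Describe());                   // <null>
    }
}
```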
I do not use this idiom myself; I'm of the opinion that side effects such as assignments are best represented by putting each in a statement of its own, rather than as something embedded in a larger expression. My question for you is: do you use assignments as expressions? If so, how and why? Note that I am looking for mundane, "real world" examples of this pattern, not clever ideas about how this could in theory be used. If you've got one, please leave it in the comments and I'll pass it along to Lucian. Thanks!
*********************
Today I examine another myth about C#. Consider the following code:
a = b = c;
This is legal; you can make arbitrarily long chains of simple assignments. This pattern is most often seen in something like
int i, j, k;
i = j = k = 123;
I often hear that this works “because assignment is right-associative and results in the value of the right-hand side”.
Well, that’s half true. It is right-associative; obviously this has to be equivalent to
i = (j = (k = 123));
It doesn’t make any sense to parenthesize it from the left. Now, in this particular example, the statement is true, but in general it is not. The result of the simple assignment operator is not the value of the right hand side:
const int x = 10;
short y;
object z;
z = y = x;
System.Console.WriteLine(z.GetType().ToString());
This prints “System.Int16”, not “System.Int32”. The value of the right-hand side of “y = x” is clearly an int, but we do not assign a reference to a boxed int to z, we assign a reference to a boxed short!
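One way to see why is to write out the conversions that the compiler inserts implicitly. The sketch below assumes nothing beyond the example above; the class and method names (ConversionDemo, ImplicitChainType, ExplicitChainType) are made up for illustration.

```csharp
using System;

static class ConversionDemo
{
    // The chained assignment exactly as written in the example.
    public static string ImplicitChainType()
    {
        const int x = 10;
        short y;
        object z;
        z = y = x; // the constant 10 fits in a short, so the conversion is implicit
        return z.GetType().ToString();
    }

    // The same statement with the conversions spelled out.
    public static string ExplicitChainType()
    {
        const int x = 10;
        short y;
        object z;
        z = (object)(y = (short)x); // convert, assign to y, then box the short
        return z.GetType().ToString();
    }

    static void Main()
    {
        Console.WriteLine(ImplicitChainType()); // System.Int16
        Console.WriteLine(ExplicitChainType()); // System.Int16
    }
}
```

Both forms box a short, because the conversion to the type of y happens before the value flows outward to z.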
So then is the correct statement “ … results in the value of the left-hand side”?
Nope, that’s not right either, and we can prove it.
class C
{
    private string x;
    public string X
    {
        get { return x ?? ""; }
        set { x = value; }
    }
    static void Main()
    {
        C c = new C();
        object z;
        z = c.X = null;
        System.Console.WriteLine(z == null);
        System.Console.WriteLine(c.X == null);
    }
}
This prints “True / False” – the result of the assignment operator is not the value of the left-hand-side. The value of the left hand side is the empty string but the value of the operator is null.
Heck, the left hand side need not even have a value. Write-only properties are weird and rare, but legal; if there were no getter then the left hand side c.X would not have a value!
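A small sketch of that case follows. The WriteOnly class, its Peek method, and the WriteOnlyDemo wrapper are hypothetical; Peek exists only so that the demo can observe what the setter stored.

```csharp
using System;

class WriteOnly
{
    private string stored;

    // Write-only property: there is no getter, so "w.X" has no value to read.
    public string X { set { stored = value; } }

    // Illustrative helper so the demo can observe what the setter stored.
    public string Peek() { return stored; }
}

static class WriteOnlyDemo
{
    // Chains an assignment through the write-only property.
    public static object ChainThroughWriteOnly(WriteOnly w, string value)
    {
        object z;
        z = w.X = value; // legal even though w.X cannot be read
        return z;
    }

    static void Main()
    {
        WriteOnly w = new WriteOnly();
        object z = ChainThroughWriteOnly(w, "hello");
        Console.WriteLine(z);        // hello
        Console.WriteLine(w.Peek()); // hello
    }
}
```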
The correct statement should now be pretty easy to deduce: the result of the simple assignment operator is the value that was assigned to the left-hand side.
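To check that rule against the property example, we can count how often the getter runs. This is only a sketch: Tracked, GetterCalls, and AssignedValueDemo are invented names, and the property is the same one as in class C above with a counter added.

```csharp
using System;

class Tracked
{
    private string x;

    // Counts getter invocations so we can see whether the chain reads the property.
    public int GetterCalls { get; private set; }

    public string X
    {
        get { GetterCalls++; return x ?? ""; }
        set { x = value; }
    }
}

static class AssignedValueDemo
{
    public static object Chain(Tracked c)
    {
        object z;
        z = c.X = null; // z receives the value handed to the setter
        return z;
    }

    static void Main()
    {
        Tracked c = new Tracked();
        object z = Chain(c);
        Console.WriteLine(z == null);     // True
        Console.WriteLine(c.GetterCalls); // 0
        Console.WriteLine(c.X == null);   // False
    }
}
```

The counter stays at zero until we read c.X ourselves, which is consistent with the result being the value that was assigned rather than anything read back from the left-hand side.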
Comments
Anonymous
February 11, 2010
Interesting! const int x = 10; short y; object z; z = y = x; System.Console.WriteLine(z.GetType().ToString()); Should work them
Anonymous
February 11, 2010
Maybe there's some compiler-internal aspect here that is more interesting, but I take issue with the statement that "The value of the right-hand side of “y = x” is clearly an int". In particular, I expect the value of the right-hand side of an assignment expression to be exactly the thing that is stored into the left-hand side. Thus, the right-hand side is not "clearly an int". It seems clear, given the destination of the assignment operation, that the type of the right-hand side must be a short. True, there is implicit type conversion going on in order to allow that. Likewise this example:
string str = "Hello";
object obj;
obj = str;
The right-hand side of the assignment must be System.Object, because that's the type of the destination. But of course, we can implicitly convert System.String to System.Object without extra code.
As I said, perhaps to the compiler, the right-hand side really is some other type, and there's something about the internals of the compilation process where it's important to state unequivocally that the type isn't necessarily that of the destination. But from the point of view of the programmer without intimate knowledge of the inner workings of the compiler, it's not clear at all to me that we must consider the type of the right-hand side to be other than that of the destination type.
Anonymous
February 11, 2010
Pete, I have to disagree with your post. Consider (all "T"s are types...):
T1 a = value;
T3 c = a;
This can be equivalent to
T1 a = value;
T3 F1(T1 arg) {...} // Converts T1 to T3
T3 c = F1(a);
Or even
T1 a = value;
T2 F1(T1 arg) {...} // Converts T1 to T2
T3 F2(T2 arg) {...} // Converts T2 to T3
T2 b = F1(a);
T3 c = F2(b);
In the above sample, it is clear that the contents of "a" are of type T1, and that the contents of "c" are of type T3. By logical extension, the RHS of the original statement is also of type T1 and only of type T1. Remember we are looking at the Right Side ONLY, ignoring any conversions (as are explicitly shown in the other samples) and NOT looking at the assignment statement. As a result, there is a world of difference between "the type of the RHS" and the "type that is assigned to the LHS" (which is NOT necessarily the declared type of the LHS!!!!). Hopefully this clears things up...
Anonymous
February 11, 2010
@Robert Davis I disagree. What would be completely unexpected is that baz equals 20. If I have a chained assignment the last thing I'd expect to see is that the result of said expression is a value that doesn't even show up. The unexpected behaviour there is simply caused by a wrong property setter. As a matter of fact, shouldn't the compiler flag an empty setter property at least as a warning?
Anonymous
February 11, 2010
@Grico I disagree, properties are a different animal than fields/local variables. Essentially Eric's post shows that
int baz = f.Bar = 10;
has an entirely different effect than
f.Bar = 10;
int baz = f.Bar;
which I think most would find unexpected. Also, nothing wrong with an empty setter, especially if you're implementing an interface or sub-classing.
Anonymous
February 11, 2010
I don't see how empty setters could help subclassing or implementing an interface. The way I see it, if your setter doesn't do anything then don't implement one. If you need to because you are implementing an interface but the best choice is to leave it empty, then something is wrong with your design. As a last resort I'd throw a not implemented / not supported exception. A setter that doesn't do anything is misleading and will only lead to unexpected behaviour as in your example.
And yes, of course
int baz = f.Bar = 10;
has an entirely different effect than
f.Bar = 10;
int baz = f.Bar;
One is a chained assignment, the other one isn't. The point is that intuitively when I see a chained assignment I expect everything to end up with the same value. That's the whole point of a chained assignment and C#'s implementation does a great job in trying to do exactly that. By passing on the value assigned to the left hand side, intermediate property getters are always skipped. This is great because you can never ensure that the getter will return the same value that was passed to the setter, which defeats the purpose of a chained assignment.
My gripe with empty setters doesn't really apply here at all now that I think about it, as the unexpected behaviour would apply anytime we have a property that gets a value that is different to the value that is passed to its setter.
Anonymous
February 11, 2010
Ouch. I guess I'm glad it works this way. How batty would you go trying to figure this out otherwise...
z = c.X = null;
//bunches of code
if (z == null)
{
    //why do I never get here?
}
Anonymous
February 11, 2010
I have always been so curious of this type of assignment and have sparingly used it because I didn't want to use it without understanding the assignment. Thanks for the clarification, Eric!
Anonymous
February 11, 2010
I think I understand how the "value" side of a chained assignment works. What I'm still trying to decipher is how the types are passed and whether they abide by the same rules. The example Eric uses:
const int x = 10;
short y;
object z;
z = y = x;
I don't quite see how this example plays according to the "value assigned to the left hand side" rule. y=x; x is clearly an int, therefore y=x should evaluate to what is assigned to y which is an int. Then z should be assigned an int too but it somehow ends up with a short. I'm either missing something basic or types don't follow the same rule values do. My guess is that z=y=x is under the hood converted to z=y=(short)x, so the conversion is before the assignment.
The operation of the assignment operator is:
(1) evaluate the left hand side to determine the location of the variable (or property, or whatever);
(2) evaluate the right hand side;
(3) convert the result of step 2 to the type of the left hand side via the appropriate conversion;
(4) assign the result of step 3 to the result of step 1.
The value of the right hand side is the result of step 2. The value assigned to the left hand side is the result of step 3. Those can be very different. Pete seems to believe that we should consider the value of the right hand side to be the value computed by step 3, but I disagree. -- Eric
Anonymous
February 11, 2010
z=y=x is converted to z=(object)(y=(short)x) is what I meant to say. So basically z=(object)(short)x. Is this true? Or am I misunderstanding everything?
Correct. -- Eric
Anonymous
February 11, 2010
I note that I'm not the compiler implementor, nor the language designer, and obviously not the one with expert knowledge in this field. That said, I can still answer questions asked of me. :)
Q: "If we go with your way, and say that the type of the right hand side of an assignment is always of the type of the left hand side, then how do we word the specification clearly to figure out what conversion it is?"
A: I would look to some kind of recursive definition of the conversion, such that the conversion is defined in terms of simpler conversions.
Q: "Now suppose we have both M(double x) and M(short x). Are you telling me that the type of 123 is both double and short in M(123) because it could be assigned to double or it could be assigned to short?"
A: No, I'm not saying that. But your question stems from a chosen approach to overload resolution, presupposing that choice. It seems to me that overload resolution could be defined to take into account the question of conversion, such that overload resolution includes an attempt to convert the expression to each of the available destinations for the assignment, and the "best" overload is in fact a consequence of the "best" conversion.
That said, it seems to me that your reply is really saying that this isn't just about the compiler implementation, but rather about the language specification. Inasmuch as in the specification you do have to choose a specific way to describe these conversions, overload resolutions, etc., I am perfectly satisfied with that justification for this particular way to look at the question. In other words, given a specific choice with respect to how the language in the specification is constructed, I see how my argument doesn't apply.
But that doesn't mean that it's a technically impossible argument (the specification could have been worded differently), and as pedantic as it might be to do so, I do still disagree with the characterization of the consequences of the wording of the specification being "clear", at least in absence of a specific reference to that specification (i.e. until you mention the specification, you can't say a particular interpretation of the specification is "clear" :) ). Anyway, thanks for setting me straight.
Like I said, your argument is plausible. But it turns out to be less vexing to describe the operation of the assignment as having two clearly distinct steps in the middle of its operation: computing the value of the right hand side, and computing the value that is assigned to the left-hand side. And of course, that is what really happens in the runtime; we compute the right hand side, then we run the appropriate conversion code, and then we do the assignment. -- Eric
Anonymous
February 11, 2010
Grico, You are confusing "the value of the RHS" and what happens as part of an assignment. When talking about RHS values, it refers to the value BEFORE any effects of the operation. Looking at it another (semantic) way: "x" is the RHS all by itself. At no time does "x" have any other value or type (i.e. it is NOT mutated by the operation that is being performed).
Anonymous
February 11, 2010
The number one rule of programming is write clearly. If the original code had been written clearly in the first place there would be no need for this discussion.
Anonymous
February 11, 2010
It's not really surprising, in that it is fully consistent with C, C++ and Java, while sharing the same syntax. That makes sense to me. Those three all define the result of the assignment operator as "the value of the variable after assignment", but of course they also don't have properties. So far as I can see, when properties are not involved, there's no difference between "new value of variable", and "assigned value converted to type of variable", so where they intersect with C#, semantics are consistent; and otherwise the rule is a logical extension, preserving the spirit while taking into account the existence of write-only properties.
Anonymous
February 11, 2010
Consider a slightly changed piece of code:
class C
{
    private string x;
    public string X
    {
        get { return x ?? ""; }
        set { x = value + "a"; }
    }
    static void Main()
    {
        C c = new C();
        object z;
        z = c.X = "b";
        System.Console.Write(z);
        System.Console.Write(c.X);
    }
}
It writes "b" and "ba". Shouldn't the statement rather be "the result of the simple assignment operator is the value that was >used to be< assigned to the left-hand side"? Can you think of another weird example? :)
Anonymous
February 12, 2010
Fusion, When dealing with properties, "the value that was assigned" has to be treated as "the parameter that was passed to the setter", and NOT as "the internal value of any backing field or calculation". Actions that take place within the setter (and getter) are not (and IMPO should not) be considered as they are internal implementation details.
Anonymous
February 12, 2010
Real world use of assignments as expressions / chained assignments:
protected virtual void OnGetWindowSizes(ref short minimumWidth, ref short minimumHeight, ref short maximumWidth, ref short maximumHeight, ref short preferredWidth, ref short preferredHeight)
{
    if (WidthAt96DPI != 0 && HeightAt96DPI != 0)
    {
        using (System.Drawing.Graphics g = CreateGraphics())
        {
            short scaledWidth = (short)(WidthAt96DPI * g.DpiX / 96);
            short scaledHeight = (short)(HeightAt96DPI * g.DpiY / 96);
            if (AllowUserToResizeTool)
            {
                minimumWidth = preferredWidth = scaledWidth;
                minimumHeight = preferredHeight = scaledHeight;
            }
            else
            {
                minimumWidth = maximumWidth = preferredWidth = scaledWidth;
                minimumHeight = maximumHeight = preferredHeight = scaledHeight;
            }
        }
    }
}
The refs are due to some C++ interop.
Anonymous
February 12, 2010
I don't make a habit of it in the general case, but I can think of one pattern I use that takes advantage of this feature. Consider the following pseudo-code:
START:
Get value
IF value satisfies some predicate THEN
    Use value in additional processing
    GOTO START
END
The easiest way (IMHO) to express this is with code similar to the following:
string input;
while ((input = GetInputFromUser()) != "quit")
{
    ProcessUserInput(input);
}
You could use a boolean return value and an out variable instead to avoid evaluating the assignment as an expression, but that is considerably less usable and readable in my opinion.
Also, it seems to me that the following usage of the "using" keyword would fall into this category:
SqlConnection conn;
using (conn = new SqlConnection(...))
{
    ...
}
Although I can't be sure the language spec doesn't special case this scenario, as it would have to if the declaration of "conn" were inside the using expression.
Anonymous
February 12, 2010
I generally agree with the sentiment to keep side-effects separate. But, like other responders, there are a handful of places where I commonly do in fact use the assignment expression value. Oddly enough, they are mostly similar to Pavel's example, fit the pattern David Nelson describes, and fall into the broader category of i/o operations. StreamReader.ReadLine() returns null on reaching the end of input, Stream.Read() returns 0, likewise Socket.Receive(); in a console application, I might loop until Console.ReadLine() returns "", etc. Making these checks as the condition at the top of a "while" loop results in code that is IMHO more readable than the alternatives.
Much less commonly, I do find myself doing a similar kind of thing in "if" statements. In some respects, those examples can probably be thought of as degenerate, single-iteration versions of the "while" loop scenario, though in the "if" statement examples, I would say that the i/o scenario isn't quite so highly correlated.
In short, it's not a construct I use broadly. But if C# didn't allow assignments to be treated as expressions, I would definitely miss that feature.
Anonymous
February 12, 2010
Sather has an interesting form of loops which lets you write this in a readable way, by allowing "while" (or other iterator - it's actually extensible, and "while" is not a keyword) itself to appear in the middle of the body:
loop
    s: STR := #IN.get_line
    while!(s /= void)
    ...
end
It's not really all that different from if/break at that point, but still more clear in intent, IMO.
Anonymous
February 13, 2010
In the case of IO operations and loops, I think I'd rather see processing look more like this:
while (!stream.EndOfStream)
{
    string s = stream.ReadLine();
}
Using the magic return value of null to indicate EOF is unclear because it requires consumers to have knowledge of that return value. I think this would also eliminate the problems Pavel mentions with calling Read() twice and using a while(true)/break construct. Of course, there is not currently a Stream.EndOfStream property, nor do I have any idea what would be required to make that happen, but a guy can dream, right?
Anonymous
February 13, 2010
"Using the magic return value of null to indicate EOF is unclear because it requires consumers to have knowledge of that return value."
For better or worse, we're stuck with that. Many APIs have no way to even know whether they've reached the end-of-input without a read operation. For example, network sockets. Your code may have read all available data without reaching the end-of-input and be sitting there with another blocking read. And that next blocking read could be the one where end-of-input is reported, much later (i.e. at the time all the data was consumed, it wasn't yet known that the end-of-input had been reached).
Sure, you could refactor the code so that it could (for example) handle a zero-byte input properly before going back and checking the special "end-of-stream" property. But that's a lot of overhead for little practical benefit.
I suppose if we could design a complete computing environment from the ground up, it could be designed such that input streams always have a known end-of-input that can be identified without trying to read more input. But a) it's not clear that such an environment would in fact be a practical improvement on the current situation, and b) obviously it's simply not practical to do that anyway. New computing systems have to be able to operate with existing ones.
And even if we did somehow overcome all those practical obstacles, we'd still be stuck with the fact that not ALL uses of assignments as expressions with value fall into that category. Even if you could get rid of the end-of-input-as-part-of-a-read scenario, we'd still have places where it would be useful to evaluate the outcome of an assignment operation.
Anonymous
February 13, 2010
When I'm forced to switch on the type of a value, I use the following pattern:
private static IPropertyWriter GetPropertyWriter(Object value)
{
    String str;
    IEnumerable enumerable;
    Pair pair;
    Triplet triplet;
    if ((str = value as String) != null)
    {
        return new StringWriter(str);
    }
    else if ((enumerable = value as IEnumerable) != null)
    {
        return new EnumerableWriter(enumerable);
    }
    else if ((pair = value as Pair) != null)
    {
        return new PairWriter(pair);
    }
    else if ((triplet = value as Triplet) != null)
    {
        return new TripletWriter(triplet);
    }
    else
    {
        return null;
    }
}
Anonymous
February 13, 2010
"When I'm forced to switch on the type of a value, I use the following pattern:"
Taking as granted that you might indeed be forced into that kind of logic (I would say that generally one should try to avoid having to have conditions that depend on the specific type), it seems to me that a more maintainable approach would be to set up a data-driven framework to do that kind of work. For example:
static IPropertyWriter GetPropertyWriter(Object value)
{
    return GetMappedObject(value, _rgmm);
}
static MethodMap<IPropertyWriter>[] _rgmm = new MethodMap<IPropertyWriter>[]
{
    new MethodMap<IPropertyWriter>(typeof(string), (obj) => new StringWriter((String)obj)),
    new MethodMap<IPropertyWriter>(typeof(IEnumerable), (obj) => new EnumerableWriter((IEnumerable)obj)),
    new MethodMap<IPropertyWriter>(typeof(Pair), (obj) => new PairWriter((Pair)obj)),
    new MethodMap<IPropertyWriter>(typeof(Triplet), (obj) => new TripletWriter((Triplet)obj))
};
struct MethodMap<T>
{
    public readonly Type Type;
    public readonly Func<object, T> Mapper;
    public MethodMap(Type type, Func<object, T> mapper)
    {
        Type = type;
        Mapper = mapper;
    }
}
static T GetMappedObject<T>(object value, MethodMap<T>[] rgmm)
{
    foreach (MethodMap<T> mm in rgmm)
    {
        if (mm.Type.IsInstanceOfType(value))
        {
            return mm.Mapper(value);
        }
    }
    return default(T);
}
Once you've got the basic boilerplate above in place, it's a lot simpler to add that kind of mapping than to have to keep writing a bunch of if/else chains. Just create/add a new element to an array that describes the relationship.
Anonymous
February 13, 2010
I use assignments as expressions often, for lazy loading of readonly properties:
private MyObject myObject = null;
public MyObject MyObject
{
    get { return myObject ?? (myObject = Repository.GetMyObject()); }
}
Anonymous
February 13, 2010
@pete.d: Indeed, I've used a Dictionary<Type, Func<Object, ...>> in other scenarios. But a map seems like overkill when the number of types is very small.
Anonymous
February 13, 2010
Re: map vs if
Well, you couldn't use a dictionary in your example. Your code prioritizes the types, and includes a type that can't actually be used in a search (IEnumerable). That's why my example uses an array instead (initialized in priority order). But note that in cases where a dictionary would apply, you could also write that as a switch (for types, you'd have to switch on .GetType().FullName). And if you do that, the compiler's just going to convert that to a dictionary anyway, if you have more than six choices (for C# 3.0 anyway).
That said, if you really love the if/else if pattern, it works fine and I certainly wouldn't argue that there's anything wrong with it per se. And even though these days, in those kinds of situations I do in fact write the assignment as a separate statement, I've in the past been known to include it in the if statement itself. Seems like a fine use of that construction, if that's what you prefer.
Anonymous
February 14, 2010
Re: "do you use assignments as expressions?"
As others have mentioned, I use using statements with the result of an initialisation as the parameter (which I suspect is something different to an assignment, strictly speaking).
I would use the while ((s = Read()) != null) pattern, but I don't really like the look of it with all those brackets and = operators where == operators normally go. So I use the while (true) { s = Read(); if (s == null) break; /.../ } pattern instead.
I wouldn't mind some syntactic sugar for that while loop to look nicer. Does C# 3 or 4 provide anything to help out there? (Something like an anonymous function that returns something foreach can use.)
Anonymous
February 14, 2010
As other commenters have done, I've written the following code for Streams:
static byte[] BufferedReadAll(Stream stream)
{
    byte[] buffer = new byte[16*1024];
    using (MemoryStream ms = new MemoryStream())
    {
        int bytesRead;
        while ((bytesRead = stream.Read(buffer, 0, buffer.Length)) != 0)
        {
            ms.Write(buffer, 0, bytesRead);
        }
        return ms.ToArray();
    }
}
and for ASP.NET MVC, I've seen this for alternate rows:
public static void AlternateRows<T>(this IEnumerable<T> dataSource, Action<T, bool> func)
{
    if (dataSource == null) return;
    bool rowState = true;
    foreach (var item in dataSource)
        func(item, (rowState = !rowState));
}
Anonymous
February 15, 2010
Another possible use of the assignment operator:
A a = ...;
B b;
C c;
D d;
if ((b = a.GetB()) != null && (c = b.GetC()) != null && (d = c.GetD()) != null && d.IsE)
{
    // Do something if a.GetB().GetC().GetD().IsE is true...
}
else
{
    // Do something else...
}
Anonymous
February 17, 2010
The use of such constructs is simply out of laziness...and most programmers seem to be inherently lazy. Why be clear in your code when you can be quick, especially when your boss wants you to get everything out yesterday?
Anonymous
February 17, 2010
Common use of assignment in expression (particular case is C++, but has C# analogues):
if (FAILED(hr = CoDoSomething()))
{
    LogError(hr);
    return;
}
But, is the result of the assignment expression actually a value, or actually a reference to the left-hand side variable? In C++, it is the latter (barring really unusual override of operator=), so you can, for example, initialize a pointer with its address. I don't use pointers in C# that much; maybe the syntax prevents any case where you would actually be able to discern the difference between value and reference to assigned variable.
Anonymous
February 24, 2010
How about a wording like this: "the result of the simple assignment operator is the resultant value that was assigned to the left-hand side." This would handle the case where a property getter returns a different value to what was passed to the setter.
Anonymous
February 27, 2010
>> The result of the simple assignment operator is not the value of the right hand side:
>> const int x = 10;
>> short y;
>> object z;
>> z = y = x;
>> System.Console.WriteLine(z.GetType().ToString());
You have missed one important point, if you read the language specification. The type of the expression y = x is short, not int, because here the compiler puts in an implicit cast. The effective statement is z = y = (short)x; So it is always correct that it "results in the value of the right-hand side". Never half! HTH
Anonymous
March 01, 2010
Eric, Here's one situation where I routinely use assignments as expressions (one of the only places that I feel it's less messy than the alternatives):
Dictionary<string, HashSet<Member>> _blacklistedMembersByPlace = new Dictionary<string, HashSet<Member>>();
public bool BlacklistMember(string place, Member member)
{
    HashSet<Member> members;
    if (!_blacklistedMembersByPlace.TryGetValue(place, out members))
        _blacklistedMembersByPlace.Add(place, members = new HashSet<Member>());
    return members.Add(member);
}
Anonymous
April 12, 2010
The only place I've ever used assignments as expressions in C# (besides chaining) is when matching a single input string against several possible regular expressions. My code will then generally look something like this:
Match m;
if ((m = Regex.Match(...)).Success)
{
    // Process, using m.Groups
}
else if ((m = Regex.Match(...)).Success)
{
    // Process, using m.Groups
}
The important thing to note here is that the code inside the if needs access to m.Groups, otherwise I could just use Regex.IsMatch(). Of course I realise that I could write it like this:
Match m = Regex.Match(...);
if (m.Success)
{
    // Process, using m.Groups
}
else
{
    m = Regex.Match(...);
    if (m.Success)
    {
        // Process, using m.Groups
    }
}
but once you have three or four regular expressions, the first one starts looking quite a lot cleaner.
Anonymous
December 31, 2010
You asked about real usage? If this is still meaningful, here it is. The most common usage is clearing some data in a class or form, something like this:
a) this.a = this.b = this.c = 0;
b) ed1.Text = ed2.Text = ed3.Text = string.Empty;
So, the most common usage is clearing data, when recreating the actual object is a very slow task (a complex form, for example). Happy New Year!