Misadventures in immutability

My infatuation

A few years ago, I think, I began to drink the immutability Kool-Aid heavily. I started getting oddly satisfied whenever I saw a class like this:

 public sealed class WeatherState
{
   private readonly float _temperature; // In Celsius of course
   private readonly float _windSpeed; // km/h
   ...
}

With this kind of class, if I have to write a SatisfactionMeter class that calculates anyone's happiness for this weather, I can simply write

 public sealed class SatisfactionMeter
{
   private readonly WeatherState _weather;

   public SatisfactionMeter(WeatherState weather)
   {
      _weather = weather;
   }
   ...
}

and not worry about having to clone the WeatherState input object in case someone decides the weather is hotter now in the middle of my calculation. If in this class I have this method:

   public int GetSatisfaction(Person subject)
   {
      return _allCriteria
        .Sum(criterion => criterion.EvaluateFor(subject, _weather));
   }

I can speed up my calculation on my multi-core machine by simply adding an AsParallel in there (assuming Person is nicely immutable as well):

      return _allCriteria
        .AsParallel()
        .Sum(criterion => criterion.EvaluateFor(subject, _weather));

and not worry if somewhere deep in the analysis code some clever code decides to do what-if analysis by changing the temperature up and down thus messing up other threads' calculations in weird and wonderful ways.
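
Pieced together, the snippets above might look roughly like the sketch below. Only the overall shape comes from the post: the ICriterion interface, the two criterion classes, their scoring rules, the Person field and the _criteria name are all made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed class WeatherState
{
    private readonly float _temperature; // Celsius
    private readonly float _windSpeed;   // km/h

    public WeatherState(float temperature, float windSpeed)
    {
        _temperature = temperature;
        _windSpeed = windSpeed;
    }

    public float Temperature { get { return _temperature; } }
    public float WindSpeed { get { return _windSpeed; } }
}

// Hypothetical: the post never shows what a Person looks like.
public sealed class Person
{
    private readonly float _preferredTemperature;

    public Person(float preferredTemperature)
    {
        _preferredTemperature = preferredTemperature;
    }

    public float PreferredTemperature { get { return _preferredTemperature; } }
}

// Hypothetical interface matching the EvaluateFor call used above.
public interface ICriterion
{
    int EvaluateFor(Person subject, WeatherState weather);
}

// Made-up rule: lose one point per degree away from the preferred temperature.
public sealed class TemperatureCriterion : ICriterion
{
    public int EvaluateFor(Person subject, WeatherState weather)
    {
        return 10 - (int)Math.Abs(subject.PreferredTemperature - weather.Temperature);
    }
}

// Made-up rule: a small bonus when the wind is calm.
public sealed class CalmWindCriterion : ICriterion
{
    public int EvaluateFor(Person subject, WeatherState weather)
    {
        return weather.WindSpeed < 20f ? 5 : 0;
    }
}

public sealed class SatisfactionMeter
{
    private readonly WeatherState _weather;
    private readonly IReadOnlyList<ICriterion> _criteria;

    public SatisfactionMeter(WeatherState weather, IReadOnlyList<ICriterion> criteria)
    {
        _weather = weather;
        _criteria = criteria;
    }

    public int GetSatisfaction(Person subject)
    {
        // The weather and the person are immutable, so the criteria can
        // read them from several threads at once without any locking.
        return _criteria
            .AsParallel()
            .Sum(criterion => criterion.EvaluateFor(subject, _weather));
    }
}
```

The only thing the parallel version relies on is that the criteria themselves are stateless (or otherwise thread-safe), which is an assumption of this sketch.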

Anyway, people much smarter than me have extolled the virtues of immutability for a while now, so go read their explanations if you want to learn more. This is just to say that I was smitten and wanted this goodness wherever I could get it.

Love in C# land

While C# is no Haskell, a subset of said smart people mentioned above have been steadily making it much easier to reap the benefits of this style of programming:

  1. You can't mark a class as immutable, but you can mark fields as readonly and most (all?) primitive and basic library types are immutable so in practice I didn't miss that too much.
  2. The Immutable Collections library has been heavenly, and even though it's still a beta NuGet package I've personally been using it in almost all my projects since I found it and have had no problems at all with it.
  3. Even auto-properties are going to get immutability-favoring features in C# 6, thus hopefully correcting my pet peeve with everyone using auto-properties with private setters as if that was an equivalent way of having readonly fields (it's not - methods in the class can still mutate those properties).
  4. And of course all the parallelism goodness that arrived after .NET 3.5 (PLINQ, TPL, ...) has made it much more rewarding to program in an immutable style and very easy to get the parallelism advantages out of it.
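
To make the pet peeve in point 3 concrete, here is a small sketch contrasting the three forms. The class names are invented for illustration, and the last class assumes a C# 6 or later compiler.

```csharp
public sealed class PrivateSetterWeather
{
    // Outside code cannot set this, but any method inside the class still can.
    public float Temperature { get; private set; }

    public PrivateSetterWeather(float temperature)
    {
        Temperature = temperature;
    }

    // Compiles without complaint, so the object is not really immutable.
    public void Heat()
    {
        Temperature += 1;
    }
}

public sealed class ReadonlyFieldWeather
{
    private readonly float _temperature;

    public ReadonlyFieldWeather(float temperature)
    {
        _temperature = temperature;
    }

    public float Temperature { get { return _temperature; } }

    // An equivalent Heat() method would not compile here: a readonly field
    // can only be assigned in a constructor or a field initializer.
}

public sealed class GetterOnlyWeather
{
    // Getter-only auto-property (C# 6): backed by a readonly field,
    // assignable only in a constructor or a property initializer.
    public float Temperature { get; }

    public GetterOnlyWeather(float temperature)
    {
        Temperature = temperature;
    }
}
```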

So overall my love affair with immutability has been mostly untroubled in C#, except for my ongoing fight with complex data structures...

Brewing troubles

My first seeds of doubt were sown when I wrote a relatively big library/toolset that revolved around a complex data structure (a description of a test run, including what tests to run, how to set up their environment, what parameters they take, etc.). Maybe immutability was a bit of a phase for me, and like everyone going through a phase I was annoying the people around me by taking it too far. Since I started writing the library mostly on my own, I indulged in my new-found love and made it all immutable, which proved more complex than I thought.

To illustrate the problems, let me take the weather example from above a little further. Let's say I want to write a weather predictor, and I realize that the easiest way to predict tomorrow's weather without those pesky weather models is to take today's weather and just add/subtract a few degrees. So to make the job of my predictor even easier, I decide to add a new method to the WeatherState class that just returns the same weather but with a different temperature:

   public WeatherState ChangeTemperature(float newTemperature)
   {
      // Copy every other member across unchanged; only the temperature differs.
      return new WeatherState(newTemperature, _windSpeed);
   }

And right away there are already a few problems:

  1. The method name: If you're reading this method in an article about immutability, you may not have trouble recognizing that ChangeTemperature returns a new object with a new temperature. But if you're trying out a new library and discover a method called ChangeTemperature on the object you're using, it's not much of a stretch to think that it actually changes the temperature in your object, so you use it as if it were a setter. Which is exactly what happened when people started using my library: it became a fairly common bug for people to ignore the value returned from the ChangeX() methods and just assume that the method changed the existing object. With the benefit of hindsight I don't think I would have named them ChangeX(), but to this day I don't have a great naming pattern for this - maybe WithNewX? (e.g. WithNewTemperature()). Also, in my defence, even some base library methods suffer from this confusion: I'm sure many people have fallen into the trap of calling String.Replace() on a string thinking it replaced it in-place, instead of taking the returned object as the new string.
  2. The method implementation: The implementation is a one-liner, but it's not exactly as simple as implementing a setter. You have to pass in all the members of the class to the new constructor except what you're changing, which is definitely more complex and error-prone than { _temperature = value; } (and it's the reason I have this method in the first place, instead of having client code just use the constructor directly to achieve the same effect). And it gets progressively worse as the class gets bigger.
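
The bug from the first point is easy to reproduce. The sketch below uses a WithTemperature name purely as one possible convention (it is not a settled answer), and the NamingPitfallDemo class is invented for the demonstration.

```csharp
using System;

public sealed class WeatherState
{
    private readonly float _temperature; // Celsius
    private readonly float _windSpeed;   // km/h

    public WeatherState(float temperature, float windSpeed)
    {
        _temperature = temperature;
        _windSpeed = windSpeed;
    }

    public float Temperature { get { return _temperature; } }
    public float WindSpeed { get { return _windSpeed; } }

    // Returns a new object; the receiver is left untouched.
    public WeatherState WithTemperature(float newTemperature)
    {
        return new WeatherState(newTemperature, _windSpeed);
    }
}

public static class NamingPitfallDemo
{
    public static void Run()
    {
        var today = new WeatherState(20f, 10f);

        // Bug: the result is thrown away, so nothing observable happens.
        today.WithTemperature(25f);
        Console.WriteLine(today.Temperature); // still 20

        // Correct: keep the returned object.
        var tomorrow = today.WithTemperature(25f);
        Console.WriteLine(tomorrow.Temperature); // 25

        // The same trap exists in the base library.
        string forecast = "hot";
        forecast.Replace("hot", "cold");            // result discarded
        Console.WriteLine(forecast);                // still "hot"
        forecast = forecast.Replace("hot", "cold"); // result kept
        Console.WriteLine(forecast);                // "cold"
    }
}
```

Nothing in the compiler flags the discarded result by default, which is why the name has to carry so much of the weight.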

Now, partially to address the second problem, I began to aggressively nest my classes to keep them from having too many members. So in this example let's say we wanted to add the wind direction to the weather state: instead of adding it as a new member, I would bundle it up with wind speed into a new WindVelocity class, and add that to the WeatherState class. Now the WeatherState class still only has two members (temperature and wind velocity), and everything is tidy and awesome.

Except of course that's when my real trouble began: let's say we want to predict tomorrow's weather as the same as today except with stronger wind. So I want to add a method to change wind speed. Now there are two ways I can do this:

  1. I can add a WeatherState.ChangeWindSpeed() method which creates a new velocity object with the new wind speed and returns the new weather state object, or
  2. I can add a WindVelocity.ChangeSpeed() method and a WeatherState.ChangeWindVelocity() method, and leave it to the client code to first construct the new wind velocity object then use that to create the new weather state object.

Neither option is pretty. The first one means that my highest level class will have way too many methods, and is also a weird violation of abstraction levels (why should the weather state object understand the speed concept in the velocity object? That's an internal detail of how velocity works). The second one leads to pretty verbose client code:

   var tomorrowsWeather = todaysWeather.ChangeWindVelocity(
    todaysWeather.WindVelocity.ChangeSpeed(
      todaysWeather.WindVelocity.Speed + 5));

In the end I settled on the second option, mainly because I'm a sadistic monster who doesn't care about the suffering of the users of my library. But even my cruel heart was moved a little by their plight...
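
For reference, the second option could be fleshed out roughly as follows. This is a sketch, not the actual library code: the Direction member and its representation in degrees are assumptions, while the ChangeSpeed and ChangeWindVelocity names follow the text above.

```csharp
public sealed class WindVelocity
{
    private readonly float _speed;     // km/h
    private readonly float _direction; // degrees; hypothetical representation

    public WindVelocity(float speed, float direction)
    {
        _speed = speed;
        _direction = direction;
    }

    public float Speed { get { return _speed; } }
    public float Direction { get { return _direction; } }

    // Knows only about its own members.
    public WindVelocity ChangeSpeed(float newSpeed)
    {
        return new WindVelocity(newSpeed, _direction);
    }
}

public sealed class WeatherState
{
    private readonly float _temperature; // Celsius
    private readonly WindVelocity _windVelocity;

    public WeatherState(float temperature, WindVelocity windVelocity)
    {
        _temperature = temperature;
        _windVelocity = windVelocity;
    }

    public float Temperature { get { return _temperature; } }
    public WindVelocity WindVelocity { get { return _windVelocity; } }

    // Swaps the whole velocity object and knows nothing about speed or direction.
    public WeatherState ChangeWindVelocity(WindVelocity newWindVelocity)
    {
        return new WeatherState(_temperature, newWindVelocity);
    }
}
```

Each class stays small and only touches its own members, and the price is paid entirely at the call site: the client has to walk down to the nested object, rebuild it, and then rebuild every level above it.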

Uneasy truce

Annoyingly, I don't have any real conclusions to this post. The whole experience left me with a vague feeling that surely languages can and should evolve to improve the situation here: after all this is almost purely a syntactic complexity not a conceptual one, so the language should help me out here. But I have no idea how I would do it if I were a language designer, which I guess is why I'm not a language designer.

And until such time as languages make these problems moot - will I still make my data structures immutable? Probably. I still love all the goodness that comes with immutability, and am not willing to give that up. I'll try to name my methods better to avoid confusion, but honestly I still don't have good solutions to the problem of changing a deeply nested member. So if you have any thoughts I'd love to hear them.

Comments

  • Anonymous
    May 12, 2014
    F# records (which are immutable) have syntactic sugar for this code using the with keyword. For instance:

        let tomorrowsWeather = { todaysWeather with WindVelocity = { todaysWeather.WindVelocity with Speed = todaysWeather.WindVelocity.Speed + 5 } }

    (This syntax is based upon the construction syntax: let todaysWeather = { Temperature = x; WindVelocity = { ... } })

    I point this out because A) it gives one language's answer to the vocabulary question you asked: WithSpeed(x) rather than ChangeSpeed(x) is a bit more obvious that it produces a new object, and B) even with syntactic sugar for construction and "reconstruction", F# takes your approach "2" at the syntax level, rather than supporting approach "1"... That is, this doesn't compile/work:

        let tomorrowsWeather = { todaysWeather with WindVelocity.Speed = todaysWeather.WindVelocity.Speed + 5 }

    I find it useful when questions come up to see how other languages handle those questions. F# is particularly useful to look at for things like immutable objects and how they are currently handled elsewhere in the .NET universe.

  • Anonymous
    May 12, 2014
    I'd also recommend giving F# a try. Sounds like you are sold on all the core benefits of immutability, but the syntax is holding you back. I've been on a similar journey to you over the last few years, took the plunge into F# in production projects about 6 months ago and haven't looked back! Also, a note on people expecting mutation and ignoring the result: I found that implementing these functions as extension methods made this a lot clearer.

  • Anonymous
    May 12, 2014
    What about deserializing WeatherState? I never got it right with standard XML serializer for my readonly structs.

  • Anonymous
    May 14, 2014
    I have been using ImmutableObjectGraph - it lets you define your classes in T4 template files and you get all of the nicely immutable class definitions, along with builders, etc. written for you. Made by another MS staff member who was behind the Immutable Collections library IIRC. I have made a few changes to it for my own purposes but it got me 99% of the way there. Highly recommended! github.com/.../ImmutableObjectGraph

  • Anonymous
    May 26, 2014
    I know other people have mentioned this but I'll just add my weight to this - if you want to work with immutable data structures and write more functional code on the .NET framework, F# is the best choice, by far. Way, way better than C#, which is a fine language but makes you jump through hoops for language features like you are after, whereas they're intrinsic to F#. If you spend just a couple of weeks using F# to try what you are after you will see how much more effective it is. I spend a lot of time working with Java and C# developers who say the same thing i.e. they don't have time to learn F# - but it's generally nowhere near as hard as they think, and also IMHO short-sighted.