Bad Metaphors

The standard way to teach beginner OO programmers about classes is to make a metaphor to the real world. And indeed, I do this all the time in this blog, usually to the animal kingdom. A "class" in real life codifies a commonality amongst a certain set of objects: mammals, for example, have many things in common; they have backbones, can grow hair, can make their own heat, and so on. A class in a programming language does the same thing: codifies a commonality amongst a certain set of objects via the mechanism of inheritance. Inheritance ensures commonalities because, as we've already discussed, "inheritance" by definition means "all (*) the members of the base type are also members of the derived type".

Inheritance relationships amongst classes (**) are usually designed to model "is a special kind of" relationships. A giraffe is a special kind of mammal, so the class Giraffe inherits from the class Mammal, which in turn inherits from Animal, which inherits from Object. And that's great; this clearly represents "is a special kind of" relationships. I have always, however, had a problem with the fundamental metaphor of "inheritance". Why "inheritance"? You inherit genetic information, property, and if you're a titular lord, your peerage, from your parents. And if you make a diagram of a class hierarchy, it looks a bit like a "family tree" in which the derived class is the "child" of the base "parent" class. And indeed, people often speak of the base class as the "parent" class of a "child" derived class, particularly when speaking to beginners.

But the "parent-to-child inheritance" metaphor is awful. A giraffe is not "a child of mammal"; a giraffe is a child of Mr. and Mrs. Giraffe. A "child" is not "a special kind of parent". In reality, you only inherit half your genetic makeup from each parent, and you can inherit real property from any relation, or for that matter, from any non-relation. In programming languages you only "inherit" from related types, and you inherit all their members (*). In reality, everyone has two parents (***), but some programming languages allow inheritance from arbitrarily many "parents" and some allow exactly one. In reality, a single, specific person inherits specific property from a single, specific parent, and two different children can have entirely different inheritances from their parent; in programming languages, the "inheritance" relationship does not apply to individual objects, and every child inherits exactly the same thing from the parents. And in reality you only inherit real property when the decedent is dead!
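
To make the mismatch concrete, here is a minimal sketch. The Animal, Mammal and Giraffe classes and their members (Name, Describe, CanGrowHair) are hypothetical names chosen for illustration. It shows that the relationship holds between types rather than between individual objects, that every Giraffe gets exactly the same inherited members, and that constructors are the exception noted in (*).

```csharp
using System;

class Animal
{
    public string Name { get; }
    public Animal(string name) { Name = name; }
    public string Describe() => Name + " is an animal";
}

class Mammal : Animal
{
    // Constructors are not inherited, so each derived class declares its own.
    public Mammal(string name) : base(name) { }
    public bool CanGrowHair => true;
}

class Giraffe : Mammal
{
    public Giraffe(string name) : base(name) { }
}

static class InheritanceDemo
{
    static void Main()
    {
        // Two distinct objects; both have the same inherited members,
        // because inheritance relates the types, not the objects.
        var alice = new Giraffe("Alice");
        var bob = new Giraffe("Bob");

        Console.WriteLine(alice.Describe());   // declared on Animal
        Console.WriteLine(bob.CanGrowHair);    // declared on Mammal

        // A Giraffe "is a special kind of" Animal, so this conversion is implicit.
        Animal asAnimal = alice;
        Console.WriteLine(asAnimal.Name);
    }
}
```

Nothing here resembles a bequest from one individual to another: no Animal object has to exist, let alone die, for a Giraffe to have a Describe method.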

But wait, it gets worse. The parent-child metaphor is ambiguous in any language that supports both lexical nesting and nominal subtyping of classes:

class B<T>
{
    class D<U> : B<U> { }
}

Quick, what type is the "parent" of type B<string>.D<int>? Is it B<T> or B<string> or B<int>? That type is lexically inside B<T>, logically inside of B<string>, and derived from B<int>; which of those three is its parent? If you drew a graph showing either lexical or logical containment relationships, it would form a graph that looks every bit as much like a "family tree" as the graph showing inheritance relationships. And lexical containment allows access to all the properties of the container from the contained type, even including not-inherited and normally inaccessible members like private constructors! It is not at all clear that one kind of "parentage" is actually more "parent-like" than any other.
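
The three candidate "parents" can be teased apart with reflection. This is a sketch under a few assumptions: D<U> is made public so that code outside B<T> can name it, a private constructor is added to B<T> to illustrate the access point, and ParentDemo with its helper methods is a hypothetical name introduced only for this example.

```csharp
using System;

class B<T>
{
    // Private: code outside the text of B<T> cannot call this constructor.
    private B() { }

    // D<U> is lexically nested in B<T> and derives from B<U>.
    // Because it is nested, it may call the private base constructor.
    public class D<U> : B<U>
    {
        public D() : base() { }
    }
}

static class ParentDemo
{
    // The type that t derives from.
    public static Type BaseOf(Type t) => t.BaseType;

    // The generic type arguments of t, outermost type first.
    public static Type[] ArgumentsOf(Type t) => t.GetGenericArguments();

    static void Main()
    {
        Type t = typeof(B<string>.D<int>);

        // Inheritance "parent": B<int>.
        Console.WriteLine(BaseOf(t) == typeof(B<int>));

        // Logical container: the first type argument is string, as in B<string>.
        Console.WriteLine(ArgumentsOf(t)[0] == typeof(string));

        // Reflection's own notion of the containing type.
        Console.WriteLine(t.DeclaringType);

        // An instance is a B<int>, but it is not a B<string>.
        object d = new B<string>.D<int>();
        Console.WriteLine(d is B<int>);
        Console.WriteLine(d is B<string>);
    }
}
```

Each question ("what does it derive from?", "what is it declared inside?", "what was it constructed with?") has a perfectly good answer; the trouble is only that the word "parent" does not say which question is being asked.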

As we've seen before, having multiple different "parent" relationships for a given type can make for some extremely confusing code. We have to be extraordinarily careful when writing the specification and the compiler to ensure that we unambiguously describe precisely the relationship we wish to describe. I therefore try hard to avoid "parent-child" metaphors entirely; it is much clearer when writing an example to describe the type relationship as "base type and derived type", rather than "parent type and child type".


(*) Excepting constructors and destructors.

(**) I'm going to stick to talking about class-based inheritance here; my criticisms apply equally well to interface-based inheritance but I don't want to open the can of worms that is all the subtle differences between class and interface inheritance. And I've never much liked the inheritance metaphor on interfaces anyways; a "contractual obligation" metaphor is better.

(***) Assuming that we're talking about members of a sexually reproducing species.

Comments

  • Anonymous
    February 13, 2012
    Typo? "all (*) the members of the base type are also members of the derived type" A giraffe (derived) is a mammal (base), but a mammal is not (necessarily) a giraffe.

  • Anonymous
    February 13, 2012
    Re: your (**) comment on inheritance from interfaces: this is a pet hate of mine. A class implements the interface; it does not inherit from it. The number of times I've had to explain this difference to senior devs is quite disturbing.

  • Anonymous
    February 13, 2012
    I think programming would be easier if the terms were all made-up words that still sounded like proper words. Having learned programming very early, before I knew most of the more advanced English words, to me it seems inheritance is the perfect word for base/derived class relationships, but not so fitting for real-life usage for exactly these reasons. Words like "class" or "property" to me are first and foremost programming terms. Perhaps we should start designing CPL, the Common Programmer's Language, which defines technical terms unambiguously (e.g. doesn't have the word 'dynamic'), and has a term for every single concept used by any programmer ever, so that we would be able to actually speak a language without confusing metaphors or ambiguities? Nah... Nobody would use it.

  • Anonymous
    February 13, 2012
    Actually the term 'inheritance' makes good sense, only it applies to types and not to objects! The derived type B inherits all members (*) from its base type(s) A_i just as children inherit their properties from their parents, and each inherited member is inherited from exactly one parent (except weird cases like virtual inheritance, where the member is inherited from one grandparent through one or more parents). Thus IMO the ultimate cause of problems in this particular case is confusion about the type-object distinction.

  • Anonymous
    February 13, 2012
    @Kyle "members" meaning methods, properties, etc. You are confusing them with instances of the types.

  • Anonymous
    February 13, 2012
    @Carl Sixsmith but interfaces inherit from other interfaces (that's the term used in the C# specification).

  • Anonymous
    February 13, 2012
    While this is somewhat orthogonal to your discussion, I've always loved this essay by Kragen, which shows exactly how terrible "Giraffe extends Animal" can be. lists.canonical.org/.../000937.html

  • Anonymous
    February 13, 2012
    "Quick, what type is the "parent" of type B<string>.D<int>" I don't know.  Isn't that a good reason not to use this kind of design?  Even with really good names for B and D, I think it would be difficult for me to grasp what this relationship defines.  Sort of like your "Smart Pointers Are Too Smart" example.   (blogs.msdn.com/.../53016.aspx)

  • Anonymous
    February 13, 2012
    Usage of mammals in the example may be the problem here. Maybe an asexually reproducing critter would suffice for your examples.

  • Anonymous
    February 13, 2012
    @phoog: Yes, that makes more sense then.

  • Anonymous
    February 13, 2012
    @Jasper And then, Circle extends Ellipse has a completely different problem. (Less so for immutable types, but still a problem)

  • Anonymous
    February 13, 2012
    Perhaps "parent" and "child" don't refer directly to family relationships, but are themselves computer science jargon for nodes in a tree structure. So the parent class is adjacent to the child class in the class hierarchy (tree) but it is the one closer to the root of the tree. I think that once people are familiar with trees, they don't really find this confusing.

  • Anonymous
    February 13, 2012
    While I have no problem with using less loaded/ambiguous terms than "parent-child", be wary of trying too hard to be correct when explaining a new concept: it's far easier to correct misunderstandings about specific behavior than to correct confusion/disgust over a "far too complicated" feature. When first introducing OO (really class-based, but let's not go there!), the essential detail that you need to get across, ignoring everything about members, behavior, etc., is the "is-a" relation. Using real-world examples that the student is immediately going to map to such relations, despite the fact that they don't actually match Liskov or would be data-driven in a real implementation, is far more helpful than dryly describing the properties of "is-a". When the smart student then says later "A-ha! But penguins can't fly!", the correct response is "the real world is complicated", or "So it's probably a bad idea to actually have a 'Bird' base class", not to try to re-explain OO from scratch again!

  • Anonymous
    February 13, 2012
    @Carl: Agreed. I've seen even teachers who don't understand that an "interface" is just a "contract" declaring what "services" you need to implement and nothing else. What's more, I've seen teachers who told their students to always plan things through base classes and forget the "interface" too. And people wonder why others think OOP is difficult! Of course that was 10+ years ago, and let's hope things are better now.

  • Anonymous
    February 13, 2012
    But this mammal and giraffe metaphor is worse than most for another reason: it emphasizes the "Giraffe IS a Mammal" relationship instead of the "Giraffe BEHAVES AS a Mammal" relationship. This is important when you reach the classical corner cases. While in classic taxonomy it makes sense for whales to be mammals, in an OOP world there is no advantage in having whales be mammals instead of fishes (besides, mammal is a completely meaningless category in OOP, unless you consider Sweat() a method you can call :)). Also, classifying objects by behaviour has the advantage of making structural patterns (adapter, composite and above all decorator) much easier to introduce.

  • Anonymous
    February 13, 2012
    You might enjoy Derek Rayside's paper on Aristotle and Object-Oriented Programming.

  • Anonymous
    February 13, 2012
    "... In reality, you only inherit half your genetic makeup from each parent, ..." Not if you are a bacterium. Perhaps we should downgrade the metaphor.

  • Anonymous
    February 13, 2012
    Actually I think using animals at all is a much worse sin than whether you talk about child/parent etc. Newb programmers have most likely learnt (and understood) some basics about what programs do, such as painting on screens and reading from files. Then "trainers" and OO intro tutorials start telling them about animals and shapes, and their confidence is destroyed because they haven't a clue what on earth the point of an animal class is in the context of code!! Worse still, the same trainers use large unhelpful headings such as Polymorphism and Encapsulation, and yes, I agree Eric, even the term inheritance isn't intuitive enough; at least, though, it's a term they have heard of before and it isn't that scary! Time and time again new programmers at my company who have done a computer science degree just don't really get OO because everything they were taught was so abstract. I use a file importer program to demo OO and it makes sense to them. It has a concrete real-life feel to it. It's a complex enough problem to justify OO but not so complex that it blows their mind.

  • Anonymous
    February 14, 2012
    Thought-provoking article as always, Eric. Great comments, also. I agree with Mike G. - I don't think the animal metaphor is very good for beginners: I know it confused me in the beginning because it just did not seem to relate to programming. I don't mind the parent/child terminology, because to me it reflects the tree hierarchy well and the direction of the hierarchy. For some reason, the terms "less derived" and "more derived" are just not intuitive to me and I end up translating them to "is a parent" and "is a child" in my mind. I don't agree with the rant about interfaces being implemented instead of inherited. I like the word inherited for both the base class and the interfaces. The child class is inheriting the method signatures for both base classes and interfaces and inherits the type of both the base classes and interfaces. The implementation to me is a separate concept. If the base is completely abstract and virtual then the child class must implement all the methods. If a class already has all the methods in an interface then it does not have to implement anything.

  • Anonymous
    February 14, 2012
    I don't think the problem is that you can't map mammal as the parent of giraffe. The real problem is that newbs can't map the abstract things that are present in a programming environment onto relations like those in the animal kingdom. OK, mammal is quite abstract, but we have some intuition about those "animal" things. For programming things we do not have any intuition at the beginning, because there are many more abstract things to grasp, so we should learn from the abstract concepts that are actually present in real software.

  • Anonymous
    February 14, 2012
    Eric, Your latest blog posts (and possibly earlier too, haven't read enough) could easily make a beautiful book on software. Have you considered making them a book?

  • Anonymous
    February 14, 2012
    Did you see INCEPTION before explaining inheritance.... you shouldn't. Take an orange juice and write this article again.

  • Anonymous
    February 15, 2012
    What I like about the animal metaphor is that it is a hierarchy that everybody can instantly understand. If one is trying to come up with an example to relate to a situation, it is trivial to come up with more examples (two classes derived from the same base -- giraffe and zebra are both mammals; two classes derived from the same base but with different intermediate derivations -- cat and shark are both animals, but cat is a mammal and shark is a fish). If animals aren't a good metaphor, I'd like to know what to use instead.

  • Anonymous
    February 15, 2012
    @Gabe: I think the point is not to use an abstract metaphor at all but to use actual real-world code examples instead. A few commenters gave examples: file importing and a graphical example.

  • Anonymous
    February 17, 2012
    While I am a technologist by trade, my biggest takeaway from this article is that I spent 11 minutes learning all about the Right Honorable Sir Nigel Tufnel.

  • Anonymous
    February 18, 2012
    IMHO, implementation inheritance should almost never be used by the programmer to create a complex set of related classes - and implementation/is-a inheritance is all we get in C# and most other languages, as they lack any convenient constructs for delegation, mixins, etc. Inheritance should be (and is) mainly used as a way to make a complex API more approachable by segmenting it, and enables such niceties as Intellisense - so basically it's what people use to leverage frameworks that are large and (hopefully) well-thought-out designs that the user accesses through the OO paradigm. Designing OOP (other than flat classes which organize an API or extend a framework class to gain behavior) should not be an everyday task for the programmer. As OOP is just a shim (v-tables, etc.) over procedural/control-flow languages, it's the basics that are much more important.

  • Anonymous
    February 20, 2012
    I've always felt that OO was a "solution" in search of a problem.  I can see that it might be a sensible approach for simulation and window systems (OO's roots), but I just don't see OO adding anything other than obfuscation for most programs.  Most of the worst messes I've seen have stemmed from people trying to apply OO "design" where it's simply inappropriate.  And don't get me started on patterns... Declarative languages get by just fine (let's be honest here, they get by better) with just algebraic data types and type classes (interfaces done properly).

  • Anonymous
    February 20, 2012
    "A little inaccuracy sometimes saves a ton of explanations." -- Anon.

  • Anonymous
    February 21, 2012
    Einar W. Høst 14 Feb 2012 1:29 AM "You might enjoy Derek Rayside's paper on Aristotle and Object-Oriented Programming." I found that at <a href="www.sciweavers.org/.../aristotle-and-object-oriented-programming-why-modern-students-need-traditional-logic">Aristotle and Object-Oriented Programming</a> on sciweavers.org. Interesting paper.  Thanks for pointing at it.

  • Anonymous
    February 21, 2012
    Well that didn't work as expected. Here's just the link text: www.sciweavers.org/.../aristotle-and-object-oriented-programming-why-modern-students-need-traditional-logic

  • Anonymous
    March 12, 2012
    As mentioned, understanding the "parent-child" relationship works once you understand trees, but everyone has two biological parents, not just one. The employee-manager relationship might be better, but then... all analogies are flawed. As my Dad says, "Don't tell me what something is LIKE, tell me what it IS". My saying is "Analogies are like feathers on a snake. (Useless, unhelpful...)" It's an anti-analogy analogy. I realize that analogies and metaphors are useful teaching tools, but they only go so far.

  • Anonymous
    May 20, 2012
    And even "is a special kind of" doesn't always work. I think everyone who tried to declare class Square : Rectangle { /* ... */ } sooner or later discovered rather grave problems with such an approach. In fact, I saw in one edition of Stroustrup's "The C++ Programming Language" his statement that "You should neither derive Square from Rectangle nor Rectangle from Square though the latter has some benefits". There is a conception of "Specialization by Constraint" which allows one to easily express Square as a descendant of Rectangle, and in fact can even be made compatible with Liskov's Substitution Principle, but only if you stop using pointers/references and work with value types only. Which is, of course, not very great; after all, pointers/references are one of the greatest inventions in CS.