Why Can't I Access A Protected Member From A Derived Class, Part Two: Why Can I?
This is a follow-up to my 2005 post on the same subject which I believe sets a personal record for the longest time between parts of a series. (Of course, I didn't know it was a series when I started it.) Please read the previous article in this series, as this post assumes knowledge of part one.
.......
OK, now that you've read that, it's clear why you can only access a protected member from an instance of an object known to be of a type at least as derived as the current context. You can therefore deduce the answer to the question asked to me by a (very polite) reader this morning: Why did this code compile in C# 2.0 but give an error in C# 3.0?
public abstract class Item{
private Item _parent;
public Item Parent {
get { return _parent; }
protected set { _parent = value; }
}
}
public class Bag:Item{
private List<Item> list = new List<Item>();
public void Add(Item item)
{
item.Parent = this; // Error in C# 3.0
list.Add(item);
}
}
public class Torch : Item { }
It compiled in C# 2.0 because the compiler had a bug. We forgot to enforce the semantics of "protected" access for the property setter. Though the compiler did not generate an error, it did generate code which would not pass the CLR verification check! The code would run if it happened to be fully trusted, but it was not safe to do so. We fixed the bug in C# 3.0 and took the breaking change.
This raises a more interesting point though. Now that we correctly prevent you from setting the "parent" reference in this manner, how would you implement the desired pattern? The reader wanted the following conditions to be met:
- There are two kinds of items -- "leaf" items, and "container" items. Container items may contain either kind of item, so you could have a box containing a bag, which in turn contains a torch.
- An item may be in a container, and if it is, its parent reference refers to that container.
- These classes must be extensible by arbitrary third parties.
- "Unauthorized" code must not be able to muck around with the parenting invariants.
The reader was attempting to enforce these invariants by making the parent setter protected. But even in a world where it is legal to access a protected member from an arbitrary derived class, that does not ensure that the invariants are maintained! If you allow any derived class to muck with the parenting, then you are relying upon every third-party derived class to "play nicely" and maintain your invariant. If any of them are buggy or hostile, then who knows what can happen?
When I'm faced with this kind of problem, I try to go back to first principles. Considering each design constraint leads us to an implementation decision which implements that constraint:
- There are two kinds of abstract items -- items and containers. Therefore, there should be two abstract base classes, not just one.
- Containers are items. Therefore, the container class should derive from the item class.
- The parenting invariant must be maintained across all containers. Therefore the invariant implementation must be inside the abstract container base class.
- Every possible container requires "write" access to the parent state of every possible Item. We want that access to be as restricted as possible. Ideally, we want it to be private. The only by-design way to get access to a private is to be inside the class.
And now a full solution becomes very straightforward:
using System;
using System.Collections.Generic;
public abstract class Item {
public Item Parent { get; private set; }
public abstract class Container : Item {
private HashSet<Item> items = new HashSet<Item>();
public void Add(Item item) {
if (item.Parent != null)
throw new Exception("Item has inconsistent containment.");
item.Parent = this;
items.Add(item);
}
public void Remove(Item item) {
if (!Contains(item))
throw new Exception("Container does not contain that item.");
items.Remove(item);
item.Parent = null;
}
public bool Contains(Item item) {
return items.Contains(item);
}
public IEnumerable<Item> Items {
// Do not just return items. Then the caller could cast it
// to HashSet<Item> and then make modifications to your
// internal state! Return a read-only sequence:
get {
foreach(Item item in items) yield return item;
}
}
}
}
// These can be in third-party assemblies:
public class Bag : Item.Container { }
public class Box : Item.Container { }
public class Torch : Item { }
public class TreasureMap : Item { }
public class Program {
public static void Main() {
var map = new TreasureMap();
var box = new Box();
box.Add(map);
var bag = new Bag();
bag.Add(box);
foreach(Item item in bag.Items)
Console.WriteLine(item);
}
}
Pretty slick eh?
A couple questions for you to ponder:
1) Suppose you were a hostile third party and you wanted to mess up the parenting invariant. Clearly, if you are sufficiently trusted, you can always use private reflection or unsafe code to muck around with the state directly, so that's not a very interesting attack. Any other bright ideas come to mind for ways that this code is vulnerable to tampering?
2) Suppose you wanted to make this hierarchy an immutable collection, where "Add" and "Remove" returned new collections rather than mutating the existing collection. How would you represent the parenting relationship?
Next time, another oddity involving "protected" semantics. Have a good weekend!
Comments
Anonymous
March 28, 2008
Comme l'explique Eric Lippert dans son dernier post , en C# 3.0, le code suivant ne compilera pas : publicAnonymous
March 28, 2008
one way you can mess with things is to use a badly behaved GetHashCode() and/or Equals function in a new Item implementation. This would exploit the HashSet guts of the collection you could add something to one collection, then remove it but exploit knowledge of the number of calls to GetHashCode() and/or Equals that would be used to do: firstContainer.Add(nastyFoo) nastyFoo.ActivateChangeAssumingRemoveSemantics(); firstContainer.Remove(nastyFoo); // if (!Contains(item)) => this underlying call to GetHashCode and Equals should behave normally // items.Remove(item); => this underlying call should not, change hashcode and or Equals // item.Parent = null; => this neatly breaks the Parent invariant nastyFoo.RestoreOriginalSemantics(); secondContainer.Add(nastyFoo); firstContainer.Contains(nastyFoo) && secondContainer.Contains(nastyFoo);; // true!Anonymous
March 28, 2008
I should point out that you can guard against it by forcing the GetHashCode() and Equals() implementations in the two abstract base classes respectively, object equality is probably sufficient for this example so public override sealed GetHashCode() { return base.GetHashCode(); } public override sealed Equals(object o) { return base.Equals(o); } (plus an IEquatable<Item> implementation if you wanted)Anonymous
March 28, 2008
The comment has been removedAnonymous
March 28, 2008
I'm still waiting for compiler-enforced immutable objects... ... and a declarative (e.g. Attribute-based) way to implement GetHashCode and Equals and similar methods - i.e. have non-virtual implementations of these methods and Ignore-Attributes on fields or something alike, and implemented in a performant way (e.g. auto-generated code).Anonymous
March 29, 2008
The idea of having "hostile third parties", who are needed to be fighted against, is at least strange (if speaking politely). You are abusing access modifiers yourself, when you use them as a firewall agains malicious users. The intended usage of access modifiers is documenting your commitments about future API changes. They have one advantage against plain comments - the compliance to them can be checked automatically by compilers. And like for any other documentation, the 80/20 rule applies to them. That is, if you can't document your commitment using access modifiers in your favorite programming language, just write a comment. Unfortunately, many devs think, access modifiers are a tool to protect "their" code from unintendent usage. This is very ridiculous, because of course you can never be made responsible for any problems when your code is used in an unappropriate manner. So why bother? (If you ARE being made responsible for such kind of things, change your employer).Anonymous
March 29, 2008
The comment has been removedAnonymous
March 29, 2008
Sorry, but as opposed to most things you write on your blog I don't like the solution you presented, Eric. My major concern is: What happens if your class hierarchy gets bigger - that means many things derived from Items with individual behaviour etc. - and you still have to represent this hierarchy in one class. Not really nice imo. C# is missing something like a 'friend' keyword in C++ here. The only approach that I can think of is to make the Parent internal instead of protected. public abstract class Item{ private Item _parent; public Item Parent { get { return _parent; } protected set { _parent = value; } } } But ... exactly like in your previous example replacing 'protected' with "protected internal' compiles at least with my VS 2008. This can't be the other "oddity involving "protected" semantics" you mentioned, can it?Anonymous
March 29, 2008
The comment has been removedAnonymous
March 29, 2008
The comment has been removedAnonymous
March 29, 2008
The comment has been removedAnonymous
March 30, 2008
Matt, CLR itself is not protected. Take Mono, change it to ignore protection levels, run your "protected" software and do with it anything you want. It is a silly idea trying to achieve any security with software being executed by a hacker. In the enviroment of the hacker. Even native software (written on C or C++) is being easily hacked using IDA or some virtualization tools. CLR made things simpler, not harder for hackers. This is the reality how I can see it, not some abstract specification. Security happens only on the server side today (server side in broader sence, i.e. an interner browser is a "server" from the point of view of a malicious web site).Anonymous
March 30, 2008
The comment has been removedAnonymous
March 30, 2008
The comment has been removedAnonymous
March 30, 2008
LINQ in Portuguese ( Direct ) http://www.linqpad.net Eric Lippert Why Can't I Access A Protected MemberAnonymous
March 30, 2008
The comment has been removedAnonymous
March 30, 2008
The comment has been removedAnonymous
March 30, 2008
Welcome to the forty-second issue of Community Convergence. The last few weeks have been a busy timeAnonymous
March 31, 2008
The comment has been removedAnonymous
March 31, 2008
Generating a compiler error in that case is counter-intuitive to me. If that's the way it is then that's the way it is (and I had to verify it in VS 2008 because it definitely surprised me at first); but it certainly violates my principle of least surprise, being a long time C++ programmer, and having used C# professionally for the last six years. A protected member SHOULD be available to any arbitrary derived class; if you don't want an arbitrary derived class accessing a protected member, then you shouldn't make it protected in the first place. That seems pretty straightforward to me. Not sure why this scenario needed to be special cased.Anonymous
March 31, 2008
First off, C++ has the same rule. (The statement of the rule in C++ is "A friend or a member function of a derived class can access a protected nonstatic member of one of its base clases only through a pointer to, reference to, or object of the derived class or any class derived from that class.") If you find this surprising about C#, you should find it equally surprising about C++. Second, your statement is correct -- a protected member should be available to any arbitrary derived class. It is. A protected member is available to ANY DERIVED CLASS. That is the key point: DERIVED CLASS. Is Bag a derived class of Torch? No, it is not. Therefore, Bag does not get access to Torch's protected member.Anonymous
March 31, 2008
Josh: There are a number of ways to prevent cycles. Here are just two: Method #1) Suppose you have A<--B<--C, all containers. (The arrow is "contained in".) When you try to put A into C, check to see that A != C. If A is C, throw an exception. Then look up the parent list of C and see if it already contains A. If it does, throw an exception. Method #2) Right now the code only allows inserts when the inserted item is not contained in anything. You could change this to say "an item may be inserted in another only if both items presently have the same parent and the items are different" So, for example, suppose you have C, and A<--B. You want to put C into B, but A and C have different parents, so that is not legal. You have to first put C into A, because A and C have the same (null) parent. Then you can put C into B, because C and B now have the same parent. If you have A<--B<--C, you cannot put A into A, because they are the same. You cannot put A into B or A into C, because they all have different parents. The only way to do that is to remove B from A, then put A into B, then put A into C, which has no cycle.Anonymous
April 01, 2008
The comment has been removedAnonymous
April 06, 2008
The comment has been removedAnonymous
April 11, 2008
The comment has been removedAnonymous
April 11, 2008
Ack, the code didn't work correctly. What's the right way to preserve indentation?Anonymous
April 11, 2008
I fixed the indentation. Not sure how to do it from this interface. It seems to me that if you want A to have access to internals of more than one class B, C in your assembly, that's what the aptly named "internal" access modifier was invented for.Anonymous
April 11, 2008
The comment has been removedAnonymous
April 14, 2008
The comment has been removedAnonymous
April 15, 2008
I think that what I show next could be an elegant solution to the "friend class" design problem: namespace TwoMutuallyKnownClasses { public abstract class Item //: BaseClass, BaseInterface friends Bag //where T : SomeType { private Item _parent; public Item Parent { get { return _parent; } } // This method is accessible from derived classes (protected) and from friend classes (friends) protected friends void SetParent(Item parent) { _parent = parent; } } public abstract class Bag : Item { private List<Item> _list = new List<Item>(); public void Add(Item item) { item.SetParent(this); // Not an error in C# 4.0 _list.Add(item); } } } Something like this captures the special relationship between these two classes. The trust relationship that is expressed does not pervase to the containing assembly's types, by lack of expressiveness of the language. That would be the case if the «internal» keyword had to be used, in this place, to express the same real relationship. The number of types in an assembly can easily get very large, and it may not be feasible to assume that they all share the same level of trust, as the «internal» keyword obliges. Also it is common that several different people contribute to a given assembly, possibly at different points in time, with decreasing knowledge of the original author's intentions. Personally, I think code accessibility and intent expressiveness should not (need to) depend on assembly boundaries. Just like the «internal» keyword denotes all types of an assembly, the «friends» keyword would denote the set of types considered friends by a given class. Notice that the proposed expressiveness is somehow inverse of that of the C++ language, where a class «Item» says: "Method Bag.Add has access to me" or if Item trusts all of Bags methods: "Class Bag has access to me". Instead, in a CIL language (http://en.wikipedia.org/wiki/Common_Intermediate_Language) we would say: "My method SetParent may be accessed by (any method of) any of my friend classes" and that: "Class Bag is my friend" Would this be a good approach? Would it bring any problems with it?Anonymous
April 16, 2008
I noticed no one really addressed Eric's 2nd question about how to "make this hierarchy an immutable collection" I could not really think of any good way to make the hierarchy immutable and have items know their parents without having to rebuild the entire hierarchy every time something changes. The only thing I could come up with to represent this type of immutible hierarchy was to have the container hold its child items in a binary tree structure.Anonymous
April 22, 2008
The comment has been removedAnonymous
April 24, 2008
Holy goodness, I've been busy. The MVP Summit was fabulous for us; thanks to all who attended and gaveAnonymous
May 02, 2008
In Part Two I asked a couple of follow-up questions, the first of which was: Suppose you were a hostileAnonymous
December 28, 2008
The comment has been removedAnonymous
August 07, 2009
The comment has been removedAnonymous
October 19, 2010
I see potential threat of abusing this cross sibling calls to protected methods of the base class, but there's also similar risk of malicious calls from malicious classes that derive from one of the siblings. I mention calls to methods but it also covers accessing other things like properties. Here's my example - some software development company structure. Where only people of development background are allowed to give work to developers. Others have to go to the manager to get it vetted. public class Employee {} public class Developer : Employee() { protected void virtual Develop(Requirements data) { // develop } } public class CSharpDeveloper : Developer { protected void DoSomeCSharpStuff() {} override void Develop(Requirements data) { // implement all code changes, remove from 'data' and delegate rest to other devs } } public class DbDeveloper : Developer { protected void DoSomeDbStuff() {} override void Develop(Requirements data) { // implement all db changes, remove from 'data' and delegate rest to other devs } } public class DevelopmentManager : Developer, IAmManager { DbDeveloper [] dbDevs; CSharpDeveloper[] cDevs; override void Develop(Requirements data) { // are you kidding me?! find relevant developer and delegate work } IAmManager.RequestWork(Requirements data) { // assess whether relevant to the devs and allocate or bin it } } public class Tester : Employee {} So if sibling calls were allowed then DevelopmentManager would be able to call Develop() on devs in his Team. Developers would be protected from any calls from Testers, they will have to go through proper channels :-) Now someone could derive malicious class from Developer class and send malicious calls to Develop() method, which relies on access modifier and doesn't validate input as much. But preventing cross sibling calls doesn't stop malicious class deriving from CSharpDeveloper class and also invoking calls. 'Internal protected' access modifier doesn't solve the problem because it would limit this model to 1 assembly.Anonymous
October 19, 2010
I see potential threat of abusing this cross sibling calls to protected methods of the base class, but there's also similar risk of malicious calls from malicious classes that derive from one of the siblings. I mention calls to methods but it also covers accessing other things like properties. Here's my example - some software development company structure. Where only people of development background are allowed to give work to developers. Others have to go to the manager to get it vetted. public class Employee {} public class Developer : Employee() { protected void virtual Develop(Requirements data) { // develop } } public class CSharpDeveloper : Developer { protected void DoSomeCSharpStuff() {} override void Develop(Requirements data) { // implement all code changes, remove from 'data' and delegate rest to other devs } } public class DbDeveloper : Developer { protected void DoSomeDbStuff() {} override void Develop(Requirements data) { // implement all db changes, remove from 'data' and delegate rest to other devs } } public class DevelopmentManager : Developer, IAmManager { DbDeveloper [] dbDevs; CSharpDeveloper[] cDevs; override void Develop(Requirements data) { // are you kidding me?! find relevant developer and delegate work } IAmManager.RequestWork(Requirements data) { // assess whether relevant to the devs and allocate or bin it } } public class Tester : Employee {} So if sibling calls were allowed then DevelopmentManager would be able to call Develop() on devs in his Team. Developers would be protected from any calls from Testers, they will have to go through proper channels :-) Now someone could derive malicious class from Developer class and send malicious calls to Develop() method, which relies on access modifier and doesn't validate input as much. But preventing cross sibling calls doesn't stop malicious class deriving from CSharpDeveloper class and also invoking calls. 'Internal protected' access modifier doesn't solve the problem because it would limit this model to 1 assembly.