Security and Inheritance
I received an e-mail from a customer referencing this newsgroup post and asking two questions about virtual methods and inheritance:
1. Why does it work like this?
2. What's the 'security' implication?
Funnily enough, I just read Eric's post on a very similar topic, but I'm going to talk about it too because I'll (hopefully) end up with a slightly different explanation, and maybe you'll learn something else. You should definitely read Eric's post though (either before or after mine).
So let's say you have the following class declarations (there's a plain English description below in case you're not familiar with JScript or if you just find code hard to read in a blog ;-) ):
    // Base class
    class Base
    {
        function MainMethod()
        {
            print("Inside Base.MainMethod()")
            // Call the helper function
            Helper()
        }

        // Helper method that's not visible
        // to the general public
        protected function Helper()
        {
            print("Inside Base.Helper")
        }
    }

    // Derived class
    class Derived extends Base
    {
        // Override Helper method
        protected function Helper()
        {
            print("Inside Derived.Helper")
        }
    }
Basically you have a class named Base that has two methods, MainMethod and Helper. Users are going to call MainMethod, and it will call the non-public Helper method to do some of its work. Because Helper is marked as protected, only members of Base or its derived types may call it -- the great unwashed masses cannot.
You then have a class Derived that (funnily enough) derives from Base and provides its own implementation of Helper.
Aside: One thing to note here is that in JScript .NET, all non-static methods are virtual and there's no way to declare a non-virtual method. C# and VB allow virtual methods, but they are not the default. We made them the default in JScript because the philosophy of the language was "people should be able to get stuff done," and in general if you write code like the code above the intent is to perform a method override, which requires using virtual methods. C# has a different philosophy, which is "developers should know what they're doing," and so they force developers to mark methods as virtual. I don't know what VB's philosophy on this matter is, but they also force you to explicitly declare a method as Overridable to make it virtual.
So anyway, you have these two classes, and then you write some code like this:
    // Create instance of Base
    var b : Base = new Base
    // Call MainMethod on the Base class
    b.MainMethod()
    print("---")

    // Create instance of Derived
    var d : Derived = new Derived
    // Call MainMethod on the Derived class
    d.MainMethod()
    print("---")

    // Now assign the derived instance to the
    // Base variable
    b = d
    // Call MainMethod on the Base class
    b.MainMethod()
    print("---")

    // No, really, call it on the Base class!
    Base(b).MainMethod()
And you get some output:
    Inside Base.MainMethod()
    Inside Base.Helper
So far so good; you had a Base and you got the MainMethod and the Helper.
    Inside Base.MainMethod()
    Inside Derived.Helper
Still good; you had a Derived and you got the MainMethod and the Helper as expected.
    Inside Base.MainMethod()
    Inside Derived.Helper
What's this? You had a reference typed as Base but you still got the Derived version of Helper.
    Inside Base.MainMethod()
    Inside Derived.Helper
This is just to show you that extra casting is a waste of time ;-). The compiler already knows that b is a Base, so doing an explicit cast won't get you anywhere.
What happens is that when MainMethod goes to call Helper, it doesn't look directly at the function declared in Base and start executing it. (This would be the behaviour if Helper were non-virtual, but remember all methods are virtual in JScript.) Instead, MainMethod consults a special lookup value (in effect, a slot in the method table of the object's actual run-time type) that tells it where to find the most derived implementation of the method, which in this case is Derived.Helper. There's no way to make an instance of Derived call Base's version of the Helper method from outside the class, even if you call it through a reference of type Base or try to do a cast. It's just not possible.
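If the "lookup value" idea feels a bit abstract, here is a rough, hand-rolled model of it in Java (chosen only because it is easy to run; this is not how JScript or the CLR actually implement dispatch). The MethodTable and VTableSketch names are made up for illustration, and real method tables use numeric slots rather than strings. Each type gets a table that starts as a copy of its parent's, an override overwrites the inherited slot, and every call goes through the table of the object's actual type.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical model of a per-type method table (not a real CLR structure).
final class MethodTable {
    private final Map<String, Supplier<String>> slots = new HashMap<>();

    // Start from a copy of the parent's slots so inherited methods resolve.
    MethodTable(MethodTable parent) {
        if (parent != null) {
            slots.putAll(parent.slots);
        }
    }

    // Defining a method fills a new slot or overwrites the inherited one.
    MethodTable define(String name, Supplier<String> impl) {
        slots.put(name, impl);
        return this;
    }

    // A "virtual call": look the slot up in this table and run what it holds.
    String call(String name) {
        return slots.get(name).get();
    }
}

final class VTableSketch {
    static final MethodTable BASE = new MethodTable(null)
            .define("Helper", () -> "Inside Base.Helper");

    // Derived overrides Helper, so its slot now points at different code.
    static final MethodTable DERIVED = new MethodTable(BASE)
            .define("Helper", () -> "Inside Derived.Helper");

    // MainMethod only knows the slot name; the object's own table picks the code.
    static String mainMethod(MethodTable tableOfActualType) {
        return "Inside Base.MainMethod() -> " + tableOfActualType.call("Helper");
    }
}
```

Handing the DERIVED table to mainMethod yields the Derived helper regardless of what the caller believes the type to be, which matches the third and fourth calls in the output above.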
Aside: Note that it is possible for a method to call its immediate base class's implementation via the super keyword (base in C# and MyBase in VB), but this only works from within the derived class itself, and only "up" one level -- you can't do this from outside the class. And although it will call the direct parent's method, any methods that the parent's method calls are subject to the normal rules (i.e., find the most derived implementation).
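To make that last point concrete, here is a small sketch in Java, which I'm using as a stand-in because its instance methods are also virtual by default and it has the same super keyword. The methods return strings instead of printing, purely so the flow is easy to check; that is my choice for the sketch and not part of the original example.

```java
class Base {
    String mainMethod() {
        // helper() is a virtual call, even when we got here via super
        return "Base.mainMethod -> " + helper();
    }

    protected String helper() {
        return "Base.helper";
    }
}

class Derived extends Base {
    @Override
    String mainMethod() {
        // Explicitly run the immediate parent's implementation
        return "Derived.mainMethod -> " + super.mainMethod();
    }

    @Override
    protected String helper() {
        return "Derived.helper";
    }
}
```

Calling mainMethod on a Derived instance reaches Base.mainMethod through super, but the helper call inside it still lands on Derived.helper.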
You probably knew that, so why is this a security concern? Let's assume that the Derived class performs some special access checks in the Helper method. Maybe this is where it does a username lookup, or an account balance check, or some other important action that must succeed before the rest of the MainMethod can be allowed to complete. Now if the rule above (you always call the most-derived implementation) were violated, it would be possible to break the system either accidentally or maliciously.
In the accidental case, you have an instance of Derived and you pass it to a method that only knows how to handle Base objects. Since Derived is a subclass of Base, the compiler has no problem with this and will happily pass off the object, and the method you call has no problem since it knows how to deal with instances of Base. But if the rule were violated and that method ever called into MainMethod, MainMethod would in turn call the Base implementation of Helper (bypassing the security checks) and you'd be in trouble. In the malicious case, you'd deliberately cast the reference to Base in order to circumvent the checks in Helper.
But whilst this behaviour is explicitly designed to help enforce security, it (ironically) presents a security problem of its own -- you can't really be sure what your virtual methods are getting up to. Let's say that instead of having the security checks be performed inside Derived's version of the Helper method, we have the security checks be made inside Base and have Derived's implementation simply return successfully without doing any checks at all. Now we have the same sequence of events -- you pass off an instance of Derived to a method that only knows how to deal with Base objects -- but now instead of guaranteeing that the security checks in Helper are made, it guarantees that the security checks are skipped!
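Here is a minimal sketch of that scenario, again in Java as a stand-in for JScript. Everything in it is hypothetical: the class names, the user parameter, and the "only admin is allowed" rule are invented for illustration.

```java
class CheckedBase {
    String mainMethod(String user) {
        // The security check lives in the (overridable) helper
        if (!helper(user)) {
            return "denied";
        }
        return "secret work done for " + user;
    }

    // Hypothetical check: only "admin" may proceed
    protected boolean helper(String user) {
        return "admin".equals(user);
    }
}

class SneakyDerived extends CheckedBase {
    @Override
    protected boolean helper(String user) {
        return true; // returns successfully without checking anything
    }
}

final class SkippedCheckDemo {
    // Code that only knows how to deal with CheckedBase objects
    static String handle(CheckedBase b, String user) {
        return b.mainMethod(user);
    }
}
```

With a plain CheckedBase, handle refuses "guest". Hand it a SneakyDerived and the very same call does the secret work, because virtual dispatch guarantees the derived (check-free) helper runs.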
The way to get around this little problem is to ensure that the method (or perhaps the entire class) is marked as final (sealed in C#; NotOverridable for methods or NotInheritable for classes in VB), which tells the compiler (and, more importantly, the CLR) that no class is allowed to provide a different implementation of this method. Or if there's no need for arbitrary classes to call the method at all, mark it as private or perhaps internal (Friend in VB). Don't let your security code be skipped by a pesky hacker deriving from your type and overriding your methods! (See this note for security concerns around internal virtual methods).
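As a sketch of the fix, here is one way it might look in Java, whose final modifier plays the same role for this purpose. The names are hypothetical, and the doWork extension point is my own addition to show that you can still leave room for subclasses without exposing the check.

```java
class GuardedBase {
    // final: subclasses cannot replace the entry point
    final String mainMethod(String user) {
        if (!helper(user)) {
            return "denied";
        }
        return doWork(user);
    }

    // final: the check itself cannot be overridden
    protected final boolean helper(String user) {
        return "admin".equals(user); // hypothetical rule
    }

    // Deliberate extension point; it only runs after the check has passed
    protected String doWork(String user) {
        return "base work for " + user;
    }
}

class CustomWork extends GuardedBase {
    @Override
    protected String doWork(String user) {
        return "custom work for " + user;
    }

    // Uncommenting this is a compile-time error, because helper is final:
    // @Override protected boolean helper(String user) { return true; }
}
```

A CustomWork instance can change what work gets done, but it cannot change who is allowed to ask for it.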
Aside: As a rule of thumb, if you don't explicitly intend for people to subclass your types, you should mark them as final as a matter of course. It's kind of the "least privilege" rule applied to software design -- if there's no good reason to enable a scenario, you should explicitly disable it. In fact I often wonder why the C# team didn't make final the default for classes, given their general approach to things, but perhaps that was just a bit too much for their customers to deal with. You do have to weigh the security risks of allowing people to subclass your types with the benefits you get from having a rich, easy-to-extend object model.
The two basic rules then are:
· If you have a base class and you expect derived classes to specialise a method, make sure it is virtual so the specialised method is always called (and remember that all methods are virtual in JScript)
· If you have a base class and you want to ensure that your method is never circumvented by a subclass, make sure it (or the entire class) is marked as final, or that it is not visible to the derived class, or (in languages that support it) it isn't virtual
As a final trick, let's see how you can keep the same function name but get out of the virtual-ness game. In JScript, you can use the hide modifier (new in C# or Shadows in VB) which says "even though this method has the same name as a method that I could override in the base class, I want you to give me a new lookup value so I don't clobber my base class' implementation."
Add the following declarations:
    // Another derived class
    class MoreDerived extends Derived
    {
        // We'll make a *new* Helper!
        hide function Helper()
        {
            print("Inside MoreDerived.Helper")
        }
    }

    // Most derived class
    class MostDerived extends MoreDerived
Comments
- Anonymous
January 14, 2004
Wow, I had no idea JScript was a 'full' language in .NET...
- Anonymous
January 14, 2004
All methods in JScript .NET are virtual because that's how Java works. In Java, every method is virtual by default.
=>In fact I often wonder why the C# team didn't make final the default for classes, given their general approach to things, but perhaps that was just a bit too much for their customers to deal with.
Well, after reading this statement of yours, I had this question and was about to post it! Then I decided to read Eric's post that you'd pointed to, and he answers my question.
Q: If every method were final unless explicitly stated as virtual, how would method hiding work? Overriding can be specified, but the New keyword would lose its meaning completely, wouldn't it?
Eric's answer in the post: Note that it is legal to hide a final base class method. It is only illegal to override a final base class method.
Perfect!
- Anonymous
January 15, 2004
Vijay, actually all methods are virtual in JScript because we didn't think it was worth adding the extra complexity to the language. It may also have something to do with limitations in Reflection.Emit, although I doubt it and I can't be bothered trying to find out. The behaviour of Java has little (if anything) to do with it.
Since you can mark methods as 'final' and/or subclasses can explicitly 'hide' virtual methods, there was no reason for non-virtual methods other than a negligible performance gain, which wasn't a good enough argument to counter the "keep it simple" argument (if you can class [ha ha!] JScript .NET as "simple" :-) ).
As for hiding -- the point of making an entire class 'final' is that it would then be impossible to derive from it, therefore you would not have any issues with trying to 'hide' methods (because you wouldn't have a subclass in which to hide them).
- Anonymous
January 15, 2004
P.S. I've talked here about how virtual methods work for "security" reasons, but really it's about "correctness." It's just easier to justify the behaviour if you bring security into the mix rather than just saying "your program might have bugs if we did it the other way."
Another potential blog post is that all security bugs are really just "correctness" bugs, although the definition of "correct" changes over time.
- Anonymous
January 16, 2004
Vijay - JScript != Java. Part of the name and a reasonably C-like syntax are about all they share.
- Anonymous
January 17, 2004
Tim, I never said JScript == Java! Did I?
My understanding is that JavaScript has strong syntactic and architectural similarities to how Java was designed, so it seemed reasonable to assume that, while creating a .NET language out of an existing one, you wouldn't differ a whole lot because that would confuse your existing JScript programmers... Well anyway, I'm not here for an argument, just conveying my views! Thanks for clarifying things, Peter!
- Anonymous
January 25, 2004
Eric talks more about this in his blog:
http://weblogs.asp.net/ericlippert/archive/2004/01/22/61803.aspx
- Anonymous
July 11, 2006
I think another issue with stacking up virtual methods is a huge performance hit. Not recommended, really.