Software Contracts, Part 8: Annotations outside the compiler - runtime enforced annotations.
Ok, it's taken 7 other posts, but we've finally gotten close to where I wanted to be when I started this series.
Remember my definition of an annotation: An Annotation is an addition to the source code for a program that allows an external translator to enforce the program's contract.
My first examples of annotations were annotations that are enforced by the compiler, using the language's type system. These annotations are highly useful because source code must pass through the compiler before it's executed, which lets the compiler catch a contract violation before the code ever runs.
Sometimes the annotation (and thus the contract) is enforced at runtime. My favorite example of this occurs with the MIDL compiler.
Here's a quick "what's wrong with this code":
foo.idl:
[
    uuid(<uuid>),
    version(1.1),
    pointer_default(unique),
]
interface Foo
{
    HRESULT DoSomething([in] handle_t BindingHandle, [in, string] const wchar_t *StringParam, [out] int *ReturnedValue);
}
foo.c:
:
HRESULT DoSomething(handle_t BindingHandle, const wchar_t *StringParam, int *ReturnedValue)
{
    if (StringParam == NULL || ReturnedValue == NULL)
    {
        return E_POINTER;
    }
    <DoSomething>
}
:
The error is that the code in DoSomething is checking for the StringParam and ReturnedValue being NULL. The contract for the DoSomething function as expressed in the IDL file above specifies that the StringParam and ReturnedValue are not optional[1].
For basic RPC, the RPC runtime library will enforce the function's contract as expressed in the IDL file (I'll talk about COM in a bit). That means that on the client side, if you attempt to call DoSomething with an invalid parameter, the function will fail (it will raise an RPC exception code). On the server side, the runtime library guarantees that StringParam points to a null terminated string and it will allocate the storage to hold the ReturnedValue parameter. If you use the /robust flag (and if at all possible, you should always do this), the RPC runtime library will also protect your call from clients that bypass the client side runtime library - it will filter out all callers that don't provide input that matches the signature[2].
In this example, since the input and output parameters don't have the "unique" or "ptr" attribute, by default they're "ref" pointers[3]. That means that they're always passed by reference, and reference parameters may not be null. As a result, checking for a null value is pointless, since it can never happen.
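To make the division of labor concrete, here's a rough sketch in plain C of what a server-side stub does for "ref" pointers. It is a simulation only: StubDoSomething, DoSomethingImpl and the SKETCH_* codes are hypothetical names invented for this illustration, not the code MIDL generates and not real RPC status values.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical status codes for this sketch (not real RPC or HRESULT values). */
enum { SKETCH_OK = 0, SKETCH_REJECTED = 1 };

/* The server-side implementation. Both pointers are "ref" pointers, so it
   relies on the contract instead of re-checking it at runtime. */
int DoSomethingImpl(const wchar_t *StringParam, int *ReturnedValue)
{
    /* Documents the contract; compiled out when NDEBUG is defined. */
    assert(StringParam != NULL && ReturnedValue != NULL);
    *ReturnedValue = (int)wcslen(StringParam);
    return SKETCH_OK;
}

/* Hypothetical stand-in for the stub: it validates the incoming data and
   owns the storage for the [out] parameter before calling the implementation. */
int StubDoSomething(const wchar_t *wireString, int *result)
{
    int returnedValue = 0; /* stub-provided storage for the [out] parameter */

    if (wireString == NULL)
    {
        /* A null "ref" pointer violates the contract: the call never
           reaches the implementation. */
        return SKETCH_REJECTED;
    }

    int status = DoSomethingImpl(wireString, &returnedValue);
    if (status == SKETCH_OK && result != NULL)
    {
        *result = returnedValue; /* stands in for marshaling the reply */
    }
    return status;
}
```

In this sketch the null check lives in exactly one place, the stub, which is why repeating it inside the implementation would be dead code as long as the stub is the only caller.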
For COM, it's important to realize that this only happens when the RPC runtime library operates on the parameters. The thing is, COM doesn't always get its hands on the function parameters. In general, the RPC runtime library will only see the parameters for the function when the call has to cross a boundary (either an apartment, process, GIT or IDispatch boundary). If you make a call to a method on an interface within your apartment, then for performance reasons, the RPC runtime library doesn't enter the picture.
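A small simulation of the difference, again in plain C with invented names (MethodImpl, CallThroughProxy, DemoBothPaths and the SKETCH_* codes are all hypothetical, and no real COM or RPC API is used). The assumption is simply that a cross-boundary call passes through a layer that enforces the "ref" contract, while a same-apartment call goes straight to the method:

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical status codes for this sketch. */
enum { SKETCH_OK = 0, SKETCH_E_POINTER = 1, SKETCH_REJECTED = 2 };

/* The method itself, with the defensive check left in. */
int MethodImpl(const wchar_t *s, int *out)
{
    if (s == NULL || out == NULL)
    {
        return SKETCH_E_POINTER; /* only reachable on a direct call */
    }
    *out = (int)wcslen(s);
    return SKETCH_OK;
}

/* Simulated cross-boundary call: the marshaling layer enforces the
   "ref" contract before the method ever runs. */
int CallThroughProxy(const wchar_t *s, int *out)
{
    if (s == NULL || out == NULL)
    {
        return SKETCH_REJECTED;
    }
    return MethodImpl(s, out);
}

/* Exercises both paths with the same bad argument. */
void DemoBothPaths(void)
{
    int value = 0;

    /* Across a boundary, the bad call is stopped by the runtime layer. */
    assert(CallThroughProxy(NULL, &value) == SKETCH_REJECTED);

    /* Within the apartment, nothing sits between caller and method,
       so the method's own check is the only line of defense. */
    assert(MethodImpl(NULL, &value) == SKETCH_E_POINTER);
}
```

So for a COM method that can be called directly, the null checks are not necessarily dead code; that is the same caveat as footnote 1.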
Next: Other forms of runtime enforced contracts.
[1] For this example, I'm assuming that DoSomething is only called by the RPC runtime library - if it can be called from somewhere else, the checks may be appropriate.
[2] Please note: The client can still provide a StringParam that points to a 500K long string in an attempt to overflow a local buffer - the only thing that RPC ensures is that the input matches the contract as expressed in the IDL file. There are other ways of ensuring that the string passed in is "reasonable".
[3] Before I start seeing "Stupid Larry: See there in the interface definition - it says that the pointer default is "unique" - why are you saying that the pointers are "ref"?" in the comments: It seems stupid, but that's the way it works - as MSDN documents, the pointer_default only applies to indirect pointers. Top level pointers in parameter lists always default to "ref" pointers.
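As one illustration of the point in footnote [2], here is a minimal sketch of a server-side length check done before copying the string into a fixed-size local buffer. The helper names (IsReasonableLength, CopyBounded) are hypothetical and this is only one possible approach, not something the RPC runtime provides:

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Returns 1 if s has at most maxChars characters before its terminator.
   It never reads past index maxChars, so a huge string costs only a
   bounded scan. */
int IsReasonableLength(const wchar_t *s, size_t maxChars)
{
    assert(s != NULL); /* "ref" pointer: the runtime guarantees non-NULL */
    for (size_t i = 0; i <= maxChars; ++i)
    {
        if (s[i] == L'\0')
        {
            return 1;
        }
    }
    return 0;
}

/* Copies src into dest only if it fits, terminator included.
   Returns 1 on success, 0 if the string was rejected. */
int CopyBounded(wchar_t *dest, size_t destChars, const wchar_t *src)
{
    if (destChars == 0 || !IsReasonableLength(src, destChars - 1))
    {
        return 0;
    }
    wcscpy(dest, src); /* safe here: length + 1 <= destChars was just checked */
    return 1;
}
```

The contract in the IDL says "this is a null terminated string"; a check like this one adds the "and it's no longer than I'm prepared to handle" part that the IDL in this example doesn't express.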
Comments
Anonymous
January 25, 2007
> The error is that the code in DoSomething is checking
Minor point: The effect of this error is what? The program aborts? BSODs? Checks if it's running on a computer outside of Microsoft and deletes partitions if so? I think not. I think that this code has a minor inefficiency, maybe, depending on your next answer.
Major point: How does this code know that DoSomething was called through COM? How does it know it wasn't called directly from some C++ code? How does it know it wasn't called from something that was pushed on the stack? How do you know that this wasn't just a diligent programmer minimizing an attack surface?
Anonymous
January 25, 2007
Norman, your major point is addressed in footnote 1. I'm making an assumption that it's only called via RPC. If it's not called only via RPC, then it may or may not be appropriate to check the parameter.
For your minor point, the answer is that there is some dead code in the application. Why does this matter? For end-users, maybe not much at all. Inside Microsoft? A lot. Microsoft internally is rather obsessive about running code analysis tools on our code. One of the tools we run is a code coverage analysis tool; it's used to verify the percentage of the code paths in our code that are exercised by our test cases. If there are dead code paths in the code, that reduces the percentage of code that's covered by our tests, which means that our metrics for the quality of tests are reduced (essentially the dead code makes the tests look worse than they really are). This CAN be a huge deal.
Anonymous
January 25, 2007
> Norman, your major point is addressed in footnote 1.
Sorry, I didn't make the connection between RPC and COM. I guess that would be obvious to someone who works with RPC regularly. Still, I think that malware would be able to inject a call to a function which was designed to be called only from RPC just as easily as it could inject a call to a function which was designed for other kinds of clients. The checks still look like diligent reduction of an attack surface.
Anonymous
January 25, 2007
Norman, the issue of the hostile client is why I added the comment about the /robust flag - with the /robust flag, a hostile client won't get through the RPC runtime library's checks. If you don't specify /robust, it's possible that a hostile client might, but...
Anonymous
January 25, 2007
> Norman, the issue of the hostile client is why I added the
> comment about the /robust flag
If malware pushes a function call onto the stack and executes a call to DoSomething, how would the RPC library even know about it, let alone stop it? I guess we have different points of view because when I wrote COM servers the COM was just a way to get the parameters to the underlying functions, not the only way.
Anonymous
January 26, 2007
Norman, how could it do that? The RPC library takes a string of bytes from an IPC mechanism and converts that string of bytes to a call. Along the way it validates the contents of that string of bytes (with the /robust flag) and ensures that it matches the contract expressed in the IDL. If it doesn't, it never lets the call through. One of the things validated is the function call (actually the function ID, but it doesn't matter).
Anonymous
January 26, 2007
Ok, if it's not become crystal clear that I'm writing this on-the-cuff, this post will finally put a
Anonymous
January 28, 2007
> Norman, how could it do that?
By pushing a function call onto the stack. When the CPU executes a machine instruction to do a function call, the CPU isn't going to care whether the function's designer intended for RPC (or COM) to be the only caller. When the CPU executes a machine instruction that was reached via malware, it won't be executing RPC at the time, particularly not executing RPC's parameter validation code.
I'm not sure whether to repeat this:
>> I guess we have different points of view because when I
>> wrote COM servers the COM was just a way to get the
>> parameters to the underlying functions, not the only way.
Anonymous
January 29, 2007
Norman, how does it "push a function call onto the stack"?
Basically (and this is grossly simplified), RPC receives a buffer that contains: FunctionIdToCall, 2 Parameters, Parameter 1: (Direction: In, Type: String, Reference Type: wchar_t *, Value: "12345"); Parameter 2: (Direction: Out, Type: Pointer, Reference Type: int).
It checks to make sure that "FunctionIdToCall" matches one of the functions in the list of functions to call. It then checks that the function identified by FunctionIdToCall takes two parameters, that parameter 1 is an "[in, string] wchar_t *" parameter, and that parameter 2 is an "[out] int *" parameter. If all those conditions match, it unpacks the values from the message and calls the function.
The "function call" is never pushed on the stack (unless the function prototype includes a callback function). In that case, the RPC runtime library inverts the process of making an RPC call: it marshals a pointer to a stub function that knows how to package up its parameters and send them to the server. So the server is never able to directly interact with code from the client - it always goes through the translation layer.
Anonymous
January 29, 2007
> Norman, how does it "push a function call onto the stack"?
The same ways malware has used in the past.
> Basically (and this is grossly simplified), RPC receives
Only if RPC receives it in the first place. I don't think anything will be gained by repeating one explanation a third time though ... nothing was gained from the first or second time either, so never mind.
Anonymous
January 29, 2007
The comment has been removed