Finally starting a blog

I have been putting this off for a while. Not out of concern about sharing myself in public - I've been posting on the net in various forums for about 15 years, and anyone good with a search engine can find all sorts of things I've said and done. It has really been more a matter of inertia. I have accumulated a few topics - here's the first -

What's in a crash? This came up recently when a fairly well-known group of security researchers reported what they thought was a problem. Here's the deal - there are 3 kinds of crashes:

1) Your code blew up, and you're about to get 0wn3d. Yup, it's exploitable, and the customers are not going to be happy.
2) Your code blew up, and maybe it is exploitable, maybe not.
3) Your code blew up, and you meant it to blow up, and it's clearly not exploitable.

Overall, we're pretty familiar with the first two sorts of crashes. The first is what we ship bulletins over, and the second is what we lose sleep over, trying to sort out whether someone more clever than we are could really make an exploit out of it. I have written many times and in many places, JUST FIX THE <!$@%>@^* BUGS!!! If it does NOT crash, then there's no need to lose sleep over whether it is exploitable or not. Solid code tends to be secure code. However, if the product is pretty close to shipping, we need to be able to give some reasonable judgement on whether something is exploitable, obviously erring on the side of caution. Maybe I'll elaborate on that aspect of the problem later.

The third kind of crash is what tripped up our friends who make a living finding other people's mistakes. In Office 2007, and in quite a few places in other Microsoft code, we've made use of my SafeInt class. SafeInt is designed to ensure that arithmetic is either mathematically correct, or an exception happens. You get to pick what sort of exception you like, and whether to catch it. By default, it throws C++ exceptions, but many of the users have chosen to take Win32 exceptions. This is implemented as:

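#include <windows.h>

// Raise a non-continuable Win32 exception; control never returns to the caller.
__declspec(noreturn) void SafeIntOnOverflow()
{
   RaiseException( STATUS_INTEGER_OVERFLOW, EXCEPTION_NONCONTINUABLE, 0, 0 );
}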

If you're one of those people who like to find issues in our code, and you happen to see this exception, it means that we have caught you, no security bulletin with your name in lights, do not pass go. Obviously, if you have managed to find some other problem, and have managed to first tromp on an exception record, then that was the problem, and this was just the trigger.
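
To make the "mathematically correct, or an exception" idea concrete, here is a minimal portable sketch. It is not the SafeInt source: CheckedAdd and CheckedMultiply are hypothetical names made up for illustration, and the sketch throws a standard C++ exception rather than raising a Win32 one.

```cpp
#include <cstdint>
#include <limits>
#include <stdexcept>

// Hypothetical helpers for illustration only; this is not the SafeInt API.

// Signed addition: test against the limits *before* adding, because
// signed overflow itself is undefined behavior in C++.
inline std::int32_t CheckedAdd(std::int32_t a, std::int32_t b)
{
    if ((b > 0 && a > std::numeric_limits<std::int32_t>::max() - b) ||
        (b < 0 && a < std::numeric_limits<std::int32_t>::min() - b))
    {
        throw std::overflow_error("integer overflow in CheckedAdd");
    }
    return a + b;
}

// Unsigned multiplication: do the math in 64 bits, then verify that the
// result still fits in 32 bits.
inline std::uint32_t CheckedMultiply(std::uint32_t a, std::uint32_t b)
{
    const std::uint64_t wide = static_cast<std::uint64_t>(a) * b;
    if (wide > std::numeric_limits<std::uint32_t>::max())
    {
        throw std::overflow_error("integer overflow in CheckedMultiply");
    }
    return static_cast<std::uint32_t>(wide);
}
```

The point of the pattern is that a caller never sees a silently wrapped value: either the result is the true mathematical answer, or control leaves the function through the exception.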

There's an interesting aspect of this that came to light - when you declare something __declspec(noreturn), the compiler doesn't expect you to come back from the call, and if you did come back, the stack would be corrupted. In fact, you'd find yourself jumping to the address corresponding to the first argument pushed on the stack prior to the call. If you play with it, it looks user-controlled. The problem (for the attacker) is that you can only return from such things in a debugger, which is a fairly common thing to have hooked up while fuzzing. This could lead someone to the mistaken assumption that they'd found something exploitable, when it actually wasn't.

SafeInt isn't the only tool we use that can have this effect. Some apps might just decide that things are truly awful if someone tried to allocate a negative (from the perspective of an int) amount of memory, throw an exception, and crash on purpose. We have other tools that deal with pointers and arrays that behave the same way. The theory is that it is better to crash (at least with client apps) than it is to be running the bad guy's shell code.
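
As a rough sketch of the "crash on purpose" approach to a bogus allocation size, assuming a plain malloc-based allocator: TryComputeAllocationSize and AllocateOrDie are hypothetical names invented for this example, not functions from any Microsoft code.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <limits>

// Hypothetical helper: succeeds only if count * elemSize is a sane request.
inline bool TryComputeAllocationSize(int count, std::size_t elemSize,
                                     std::size_t* bytes)
{
    if (count < 0)
        return false;  // a "negative" amount of memory is never legitimate

    const std::size_t n = static_cast<std::size_t>(count);
    if (elemSize != 0 && n > std::numeric_limits<std::size_t>::max() / elemSize)
        return false;  // the multiplication would wrap around

    *bytes = n * elemSize;
    return true;
}

// Hypothetical fail-fast allocator: a bad size or a failed allocation takes
// the process down immediately instead of continuing in an unknown state.
inline void* AllocateOrDie(int count, std::size_t elemSize)
{
    std::size_t bytes = 0;
    if (!TryComputeAllocationSize(count, elemSize, &bytes))
    {
        std::fputs("fatal: invalid allocation size\n", stderr);
        std::abort();
    }

    // malloc(0) may legally return a null pointer, so ask for at least 1 byte.
    void* p = std::malloc(bytes != 0 ? bytes : 1);
    if (p == nullptr)
    {
        std::fputs("fatal: out of memory\n", stderr);
        std::abort();
    }
    return p;
}
```

A caller would use it like `int* values = static_cast<int*>(AllocateOrDie(n, sizeof(int)));` and release the memory with `std::free`. If `n` came out of a malformed file as -1, the process stops right there rather than handing out an undersized buffer.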

You may rightfully say that crashing is always bad, and having a server-class app background, I agree. Crashing means you made a mistake, bad programmer, no biscuit. However, crashing may be the lesser of the evils in many places. In the event that our apps crash, we have recovery mechanisms, ways to report the crash so we know what function had the problem, and so on. I really take issue with those who would characterize a client-side crash as a denial of service. If you can crash my app so that I can't restart it, or have to reboot my system, well, OK - that's a DoS. If you blew up my app, and I just don't load that document again, big deal. On the server side, all crashes are bad - though it is still better to drop the service than to give the attacker a command prompt.

Additionally - and this is from a solid C++ engineering standpoint - you do not ever want to catch exceptions unless you can return to a stable state. In order to write exception-safe C++ code, you have to make sure that every allocation and every resource obtained is released in the appropriate destructor, so that unwinding the stack does the right thing. This is a bit of work for fresh code, but trying to retro-fit it into old code that never planned on exceptions is not going to yield stable results. You are then left with two good options (other than rewriting most of your code into proper C++) - either catch the exceptions within the function and deal with things locally and carefully, or call what a friend used to term the CatchFireAndBurn() function - take the app down immediately, don't execute more code, and leave a trail so you know what happened.
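
Both options can be sketched in a few lines of standard C++. Everything here is an assumption made for illustration: ScopedResource, ParseRecord and TryParseRecord are hypothetical names, and this CatchFireAndBurn is just one guess at what such a function might look like, not my friend's actual code.

```cpp
#include <cstdio>
#include <cstdlib>
#include <stdexcept>

// Hypothetical stand-in for any resource that must be released (a handle,
// a lock, an allocation). The destructor runs during stack unwinding.
class ScopedResource
{
public:
    explicit ScopedResource(int& liveCount) : liveCount_(liveCount) { ++liveCount_; }
    ~ScopedResource() { --liveCount_; }
    ScopedResource(const ScopedResource&) = delete;
    ScopedResource& operator=(const ScopedResource&) = delete;

private:
    int& liveCount_;
};

// Hypothetical worker that may throw part-way through its job.
inline void ParseRecord(int& liveCount, bool malformed)
{
    ScopedResource guard(liveCount);  // released even if we throw below
    if (malformed)
        throw std::runtime_error("malformed record");
}

// Option 1: catch locally, where we know we are back in a stable state.
inline bool TryParseRecord(int& liveCount, bool malformed)
{
    try
    {
        ParseRecord(liveCount, malformed);
        return true;
    }
    catch (const std::runtime_error&)
    {
        return false;  // nothing leaked: the guard's destructor already ran
    }
}

// Option 2: leave a trail and take the process down without unwinding.
[[noreturn]] inline void CatchFireAndBurn(const char* where)
{
    std::fprintf(stderr, "fatal error in %s\n", where);
    std::abort();
}
```

Option 1 is only safe because every resource in the function is owned by an object with a destructor. In old code that lacks that property, a call such as `CatchFireAndBurn("ParseRecord")` is the more honest choice.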

BTW, SafeInt 3.0 is coming fairly soon, once it gets out of testing. Lots of cool stuff in there, and I'll write about that later.

Comments

  • Anonymous
    April 12, 2007
    So it's not a security issue alright.... But it's still a bug. The normal behavior of Word would be to tell the user "your document is malformed and I can't open it" instead of lamely crashing.
    [dcl] Yup, bad programmer, no biscuit. If it crashes, it's a bug, bad user experience, not what I like to see shipping. But not all bugs, or all crashes, are exploits.
    I can understand that fixing that type of bug was not really a priority for a company which was some yrs late to deliver its main product: an updated version of the OS.
    [dcl] The programmers who shipped Vista work in Windows, which is a whole different division than the one that ships Word (pretty much on time, thankyouverymuch). If it had been a known bug, it would have been fixed, and now that it is a known bug, it will be fixed. Bugs don't have to be exploitable to merit getting fixed. Crashes are bad, ok..

  • Anonymous
    April 13, 2007
    I disagree. Saying you can either crash or get owned is a false dilemma. Crashing instead of getting owned does not help the customer, because he can still lose his data. He won't get a worm (unless you missed some other wormable issue), but still, that's just reducing the severity from "critical" to "moderate". It's still a bug. The customer still wants it fixed. The only one who has an actual advantage of this is you, because you only have to answer for a DoS, not a worm. [snip]
    [dcl] Customer doesn't lose any data. You double-click on the bad file, it didn't have any data you wanted. Any files you had open that you did want get caught by auto-save or doc recovery. That's what makes it a nuisance, not a vuln. Some of the rest of what you said, I'm agreeing with - crashing is bad, ok. Crashing is still better than running arbitrary code. A lot of the rest of the discussion is around the engineering aspects of understanding and recovering from the flaw, which is a longer topic than I'm going to get into today.

  • Anonymous
    April 16, 2008
    Must be synchronicity. I started out the day with a really interesting mail from Chris Wysopal talking