Exploring Buffer Overrun Vulnerability
The concepts of buffer overrun attacks are well documented, and most developers are aware of the technique and broadly how it works, but have you ever tried to write code to demonstrate a buffer overrun? I have a session to present next month on a generic ‘security for developers’ topic, and although the concepts of secure mobile development are totally consistent with desktop development, some of the attack techniques tend to be a little more sophisticated. So in the interest of completeness I felt I ought to really understand what a buffer overrun attack looked like …
The principle is pretty straightforward: unchecked input parameters cause program logic to write data outside the expected storage area, causing the code to behave differently. The best-known form of buffer overrun on Intel hardware is where code writes data beyond the end of a local variable and stamps on the function return address. OK, so how do you reproduce this?
Step one to reproducing a buffer overrun is to write an unsafe function. It’s not hard!
#include "stdafx.h"
…
BOOL APIENTRY DllMain( HANDLE hModule,
                       DWORD  ul_reason_for_call,
                       LPVOID lpReserved
                     )
{
    …
}

HACKPOINTDLL_API long fnConvertToNumber(LPTSTR InputString, long Length, long base)
{
    TCHAR LocalCopy[2048];                    // fixed-size, stack-based buffer
    long retVal;
    memcpy (LocalCopy,InputString,Length);    // deliberately unsafe: Length is never checked
    sscanf(LocalCopy,_T("%i"),&retVal);
    return retVal;
}
This is a simple DLL I created in C++ (the easiest language to show buffer overruns in, but not necessarily the only one!) with one exported function, fnConvertToNumber, that takes a string and uses sscanf to convert it to a 32-bit integer.
The function has a local (stack-based) variable for storing the string, and a local 4-byte return value variable.
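As an aside, here is a minimal, portable sketch of what the "%i" conversion does on its own. It assumes plain char strings rather than TCHAR, and ParseWithPercentI is a hypothetical helper that is not part of the DLL. It shows that "%i" works out the number base from the text itself, which is consistent with the base parameter never being used in the function above.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical helper (not from the original DLL): parse text with the same
// "%i" conversion the post uses. Returns true only if a number was read.
bool ParseWithPercentI(const char* text, int* out)
{
    assert(text != nullptr && out != nullptr);  // caller error, not bad input
    // "%i" accepts decimal, a 0x prefix for hex, and a leading 0 for octal.
    return std::sscanf(text, "%i", out) == 1;
}
```

With this helper "12345" parses as 12345, "0x1A" as 26 and "012" as 10, while text with no leading number is reported as a failure.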
What makes this function prone to attack is the lack of input checks on the Length parameter, specifically at this line:
memcpy (LocalCopy,InputString,Length);
If Length is set greater than 2048, what is going to happen? The memcpy will simply keep copying onto the stack, past the end of LocalCopy, until Length bytes have been written. To reconstruct the stack at this point you need to remember that the convention on Windows / Intel hardware is for the stack to grow down in memory, not up. Using this code as the calling function:
fnConvertToNumber (_T("12345"),6,10);
here is what the stack looks like at this point (using the equivalent of a release build; the stack location is contrived, addresses are base 16 and start at 1000):
<1000> 4 bytes: 0x0A - this is the last parameter in the call
<0FFC> 4 bytes: 0x06 - the second parameter
<0FF8> 4 bytes: <address of string> - points to the static string that was placed in the static data segment
<0FF4> 4 bytes: <return address> - the value the Instruction Pointer is set to when exiting this function (it’s the instruction after the ‘call’)
<0FF0> 4 bytes: <retVal variable> - contents of the retVal variable
<07F0> 0x800 bytes: <LocalCopy variable> - contents of the 2048 byte local string buffer
The memcpy function works *up* in memory, not down, so it starts at <07F0> and copies upwards until Length bytes are written. With this layout a Length of 0x804 runs over retVal, and a Length of 0x808 stamps on the return address as well!
Hopefully the code will crash and nothing more than a broken transaction will occur, but what happens if the InputString actually contains some code, and instead of just stamping junk on the return address, it sets the return address to point back into the buffer at <07F0>? Instead of returning to the caller, the embedded code will get executed.
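To make the overwrite order concrete without corrupting a real stack, here is a small simulation. It is only a sketch: the frame is an ordinary byte array, and it assumes the layout listed above, with the 2048 byte buffer lowest in memory, then the 4-byte retVal, then the 4-byte return address. All the names and the two marker values are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Simulated frame, lowest address first (an assumption, not the real DLL).
constexpr std::size_t kBufferSize   = 0x800;            // LocalCopy
constexpr std::size_t kRetValOffset = kBufferSize;      // 4-byte retVal
constexpr std::size_t kReturnOffset = kBufferSize + 4;  // 4-byte return address
constexpr std::size_t kFrameSize    = kBufferSize + 8;

struct FrameReport {
    bool retValOverwritten;
    bool returnAddressOverwritten;
};

// Copies `length` filler bytes into the start of the simulated frame and
// reports which of the two slots above the buffer were changed.
FrameReport SimulateCopy(std::size_t length)
{
    assert(length <= kFrameSize);  // never leave the simulated frame

    std::vector<unsigned char> frame(kFrameSize, 0);
    const std::uint32_t savedRetVal = 0x11111111u;  // made-up marker value
    const std::uint32_t savedReturn = 0x00401000u;  // made-up return address
    std::memcpy(&frame[kRetValOffset], &savedRetVal, sizeof savedRetVal);
    std::memcpy(&frame[kReturnOffset], &savedReturn, sizeof savedReturn);

    // Like the unchecked memcpy: start at the buffer and copy upwards.
    std::vector<unsigned char> input(length, 0x41);
    if (length > 0) {
        std::memcpy(frame.data(), input.data(), length);
    }

    std::uint32_t retValNow = 0;
    std::uint32_t returnNow = 0;
    std::memcpy(&retValNow, &frame[kRetValOffset], sizeof retValNow);
    std::memcpy(&returnNow, &frame[kReturnOffset], sizeof returnNow);
    return { retValNow != savedRetVal, returnNow != savedReturn };
}
```

Under these assumptions SimulateCopy(0x800) leaves both slots intact, SimulateCopy(0x804) changes only retVal, and SimulateCopy(0x808) changes the return address too. A real compiler is free to lay the frame out differently, for example by keeping retVal in a register.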
Here is my calling code, which builds such a buffer and overwrites the return address with a pointer back into it:
BYTE Buffer[StackSize+4];
strcpy((TCHAR *)Buffer, _T("12345"));
HANDLE hFile = CreateFile(argv[2],…);
DWORD BytesRead = 0;
if (0==ReadFile(hFile,&Buffer[0x10],StackSize-0x10,&BytesRead,NULL))
{
    printf(_T("Failed to read the file: %lu\n"),GetLastError());
    return 1;
}
long espVal=0;
__asm
{
    mov espVal, esp ; get the stack value at this point.
}
// Calc the start location of the code:
// ret address  = esp - 0x10
// Buffer start = esp - 0x10 - StackSize
// Code Start   = esp - 0x10 - StackSize + 0x10 (offset to start of code)
((long *)Buffer)[StackSize/4] = espVal-StackSize;
retVal = fnConvertToNumber((TCHAR *)Buffer,StackSize+4,10);
The injected code is placed in a binary file and loaded dynamically using a command line parameter. I also placed a string in the buffer first so it at least looks like a proper buffer, and I allowed up to 16 bytes for this string – hence all the 0x10’s around.
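The address arithmetic in those comments can be restated as a small helper. This is a sketch under stated assumptions, not the original code: ComputeAddresses and its struct are hypothetical, stackSize is taken to mean the distance from the start of the local buffer to the return address slot, and I am assuming a 32-bit call where the three 4-byte arguments plus the pushed return address account for 0x10 bytes below the caller's stack pointer.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper restating the arithmetic from the code comments.
struct OverrunAddresses {
    std::uint32_t returnSlot;   // where the call stores the return address
    std::uint32_t bufferStart;  // start of the local buffer in the callee
    std::uint32_t codeStart;    // first byte after the 16-byte string prefix
};

constexpr std::uint32_t kCallOverhead = 0x10;  // assumed: 3 arguments + return address
constexpr std::uint32_t kPrefixSize   = 0x10;  // space reserved for the leading string

OverrunAddresses ComputeAddresses(std::uint32_t espBeforeCall, std::uint32_t stackSize)
{
    assert(espBeforeCall >= kCallOverhead + stackSize);  // avoid wrap-around

    OverrunAddresses a;
    a.returnSlot  = espBeforeCall - kCallOverhead;
    a.bufferStart = a.returnSlot - stackSize;
    a.codeStart   = a.bufferStart + kPrefixSize;
    return a;
}
```

Because the two 0x10 offsets cancel, codeStart always comes out as espBeforeCall - stackSize, which matches the value the calling code writes into the last slot of its buffer. For a made-up stack pointer of 0x2000 and a stackSize of 0x804 the helper gives a return slot of 0x1FF0, a buffer start of 0x17EC and a code start of 0x17FC.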
In this example I dynamically calculate the new function return address, using the __asm statement to get the stack pointer just before calling the function. I needed this because my code base was changing as I wrote and modified the code. A real attacker is not afforded that freedom, but then the execution address is relatively predictable in production code environments.
The big question is: how can I stop this from being a vulnerability? Here are a few options:
- Check your input! That has to be the best option: code defensively, especially when it comes to input. If I had checked and capped the Length parameter to the buffer size then none of this would be possible.
- Visual C++ 7.1 (from Visual Studio .NET 2003) supports several options to provide stack checks: /RTCs for non-optimized builds, and /GS. With /GS the generated code places a security cookie between the local variables and the return address and checks it before returning; if the cookie has been overwritten, the process is terminated instead of returning to a corrupted address. In debug mode the run-time checks are switched on by default, so consider switching /GS on for release builds. NOTE: there *will* be a performance impact and your binaries will need to be re-tested.
- Switch to C# or VB.NET. The CLR bounds-checks array accesses like this one: an out-of-range index raises an exception at run time instead of overwriting neighbouring memory. For C#, be aware that code marked as unsafe cannot be verified and so could be vulnerable to a buffer overrun attack; that threat can be mitigated by executing with least privilege so that unverifiable code is not allowed to run.
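For the first option, here is a minimal sketch of what a checked version of the function could look like. It is not the original DLL code: ConvertToNumberChecked is a hypothetical name, it uses plain char and int so that it builds outside Visual C++, and it rejects oversized input rather than capping it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>

constexpr std::size_t kLocalCopySize = 2048;

// Hypothetical checked variant of fnConvertToNumber. Returns false, without
// copying anything, when the input would not fit in the local buffer.
bool ConvertToNumberChecked(const char* input, std::size_t length, int* result)
{
    assert(result != nullptr);  // a caller bug, not attacker-controlled input

    char localCopy[kLocalCopySize];
    if (input == nullptr) {
        return false;
    }
    // Leave room for the terminator that sscanf relies on.
    if (length >= sizeof(localCopy)) {
        return false;
    }
    std::memcpy(localCopy, input, length);
    localCopy[length] = '\0';
    return std::sscanf(localCopy, "%i", result) == 1;
}
```

Calling it with ("12345", 6, &n) succeeds as before, while any length of 2048 or more is refused before a single byte is copied. Rejecting rather than truncating is a design choice: a silently truncated number is arguably worse than a clean failure.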
So what does all this prove?
Nothing really, besides me getting a real kick just writing this code. You’ve heard it said: ‘keep your friends close but your enemies closer’. As an educational exercise it proves to me the sort of lengths a hacker will go to in order to compromise a system.
OK, so what next? This is a bit of a contrived scenario. What I want to do is put this code somewhere deep in a web-based system, maybe as an extended stored proc called from a stored procedure that is in turn called from an ASP.NET page, and see if I can still get at the vulnerability. I will post the code if I can get it going.
Does this apply to Windows Mobile? I’m not familiar enough with the ARMv4 core architecture or the instruction set to give a definitive answer, but from my limited investigations this specific vulnerability is less obvious: it appears the stack grows ‘up’ rather than ‘down’ in memory, so the return address is stored below the local variable position. Don’t use this as an excuse to be complacent: all input should still be considered evil and checked thoroughly before use.
Marcus
Comments
- Anonymous
February 24, 2005
I forgot to add this MSDN reference:
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dncode/html/secure05202002.asp
This is Michael Howard's much more complete discussion of the subject.
Marcus
- Anonymous
February 28, 2005
The CE Application Binary Interface for ARM uses the stmdb/ldmia instructions for storing any register values that must be saved. They stand for 'STore Multiple Decrement Before' and 'LoaD Multiple Increment After' respectively. There are no 'push' or 'pop' instructions in the architecture as such. A frame is created if necessary using the R11 register as a frame pointer and a SUB instruction.
Unlike x86, the return address is not stored on the stack. The bl (Branch and Link) instruction stores the return address in the R14 register, conventionally aliased as lr (Link Register). By convention R13 is the stack pointer but there's no hard-wiring.
It's pretty obvious though that this use of a Link Register can only work one level deep. If the called function itself needs to make a call, it must store the previous value of the link register - by convention, on the stack. This does mean a stack buffer overflow can cause the return address to be overwritten, but it may not be this method - it might be one a bit higher up the call stack.
Interestingly there's no return instruction either. You simply use mov pc,lr, or pop the return address from the stack with ldmia.
Most RISC architectures are much the same.
CE is not vulnerable to the exception frame pointer corruption scenario that Windows NT-family/x86 is. CE uses table-based exception handling, where the exception tables are paged in from the executable on demand. The current program counter value is looked up in the tables, if an exception occurs, to determine how to unwind the stack and which handler functions to call.