Error and Exception Revisited

Unless suffering is the direct and immediate object of life, our existence must entirely fail of its aim. It is absurd to look upon the enormous amount of pain that abounds everywhere in the world, and originates in needs and necessities inseparable from life itself, as serving no purpose at all and the result of mere chance. Each separate misfortune, as it comes, seems, no doubt, to be something exceptional; but misfortune in general is the rule.

- by Arthur Schopenhauer

In the world of programming, errors and exceptions seem to be unavoidable, especially when it comes to writing production-quality code.

Windows programming can be challenging when both error codes and exceptions are in use.

Exception

Exceptions always exist. Even if you don't care about them, they disrupt the normal flow of execution. In the Windows operating system, exceptions are implemented as SEH (Structured Exception Handling) and VEH (Vectored Exception Handling), with support from the CPU.

The key characteristics of exceptions in Windows are:

  1. They disrupt the normal flow of execution, which translates to pipeline invalidation and slowness.
  2. They can be used consistently in user mode and kernel mode.
  3. They come with a lot of useful features, such as continuing execution and first-chance versus second-chance handling.
  4. An unhandled exception goes to the operating system, which kills the application (e.g. Dr. Watson) or the whole system (e.g. BSOD).
  5. If you swallow exceptions used by the operating system or the runtime (which you shouldn't catch), the application might not function correctly (e.g. you might get an access violation when the call stack touches its PAGE_GUARD page).

The third most famous exception is Out of Memory. For device drivers and server applications you always want to handle it; for client applications it is probably less critical (e.g. if the Visual Studio IDE runs out of memory, we just let it crash).

The second most famous exception is Stack Overflow. For hosting environments and fundamental libraries like the CRT, you need to take it into consideration; in other cases it means you have a design issue, and normally you don't want to handle it.

Let's take a look at the following pseudo code:

try
{
  // do something
}
catch(StackOverflowException ex)
{
  log("oops, stack overflow {0}", ex.stack);
  throw ex;
}
finally
{
  // close file handle, etc.
}

There are several things to point out:

  1. Using "throw ex" ruins the exception information (the original stack trace is lost); use a bare "throw" instead.

  2. The "log" function doesn't have much stack space left, so it could trigger another stack overflow.

  3. By the time the exception is re-thrown, we are already miles away from the original site of the problem. The scene is not kept intact, and nobody would want to debug a dump file in that state.

  4. The process is dying. Closing file handles will not make it any better, since the operating system does that for you. More importantly, it is very likely that you are making things even worse.

  5. If you are using a recent version of the .NET Framework, you normally cannot catch StackOverflowException at all. The documentation says:

    In prior versions of the .NET Framework, your application could catch a StackOverflowException object (for example, to recover from unbounded recursion). However, that practice is currently discouraged because significant additional code is required to reliably catch a stack overflow exception and continue program execution.

    Starting with the .NET Framework version 2.0, a StackOverflowException object cannot be caught by a try-catch block and the corresponding process is terminated by default. Consequently, users are advised to write their code to detect and prevent a stack overflow. For example, if your application depends on recursion, use a counter or a state condition to terminate the recursive loop. Note that an application that hosts the common language runtime (CLR) can specify that the CLR unload the application domain where the stack overflow exception occurs and let the corresponding process continue. For more information, see ICLRPolicyManager Interface and Hosting Overview.

    Windows 95, Windows 98, Windows 98 Second Edition, Windows Millennium Edition Platform Note: A thrown StackOverflowException cannot be caught by a try-catch block. Consequently, the exception causes the process to terminate immediately.
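
The quoted advice to "use a counter or a state condition to terminate the recursive loop" can be sketched in C as follows. This is only an illustration under stated assumptions: the Node type, the count_nodes function and the MAX_DEPTH limit are hypothetical names that do not come from the original text, and the limit is arbitrary.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical limit chosen for illustration; size it to the real stack budget. */
#define MAX_DEPTH 1000

/* Hypothetical singly linked node, used only to have something to recurse over. */
typedef struct Node {
    struct Node *next;
} Node;

/* Returns the number of nodes reachable from `node`, or -1 if the recursion
   would go deeper than MAX_DEPTH. The depth counter turns a potential stack
   overflow into an ordinary error value that the caller can check. */
static int count_nodes(const Node *node, int depth)
{
    assert(depth >= 0);         /* callers start at depth 0 */

    if (node == NULL)
        return 0;
    if (depth >= MAX_DEPTH)
        return -1;              /* refuse to go any deeper */

    int rest = count_nodes(node->next, depth + 1);
    return rest < 0 ? -1 : rest + 1;
}
```

A caller starts with count_nodes(head, 0) and treats -1 as "input too deep or cyclic" instead of letting the process die from a real stack overflow.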

The most famous exception is Access Violation. In most cases it is a killer bug that stops the entire development team.

Error

An error is more like a convention. Its main characteristics are:

  1. Lightweight - it doesn't require infrastructure support from the operating system or the CPU.
  2. Less picky compared to an exception - an error won't complain if you haven't paid enough attention to it, but it might eventually kill you if not handled properly.

Windows Error Code

Many Win32 APIs return a BOOL value. If FALSE is returned, the error code is stored in the TEB (Thread Environment Block) as a DWORD, which can be retrieved by calling GetLastError.
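
The "BOOL return value plus per-thread last error" convention can be modeled in portable C. The sketch below is not the Windows implementation: model_set_last_error, model_get_last_error and model_delete_file are hypothetical stand-ins invented for illustration, and a C11 _Thread_local variable plays the role of the per-thread slot in the TEB.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef int      model_bool;   /* stands in for BOOL  */
typedef uint32_t model_dword;  /* stands in for DWORD */

/* Same numeric value as the Win32 constant ERROR_FILE_NOT_FOUND. */
#define MODEL_ERROR_FILE_NOT_FOUND 2u

/* One slot per thread, playing the role of the last-error field in the TEB. */
static _Thread_local model_dword g_last_error = 0;

static void        model_set_last_error(model_dword code) { g_last_error = code; }
static model_dword model_get_last_error(void)             { return g_last_error; }

/* Hypothetical API: pretends to delete a file and succeeds only for the
   name "exists.txt". On failure it returns 0 (FALSE) and records why. */
static model_bool model_delete_file(const char *name)
{
    assert(name != NULL);

    if (strcmp(name, "exists.txt") != 0) {
        model_set_last_error(MODEL_ERROR_FILE_NOT_FOUND);
        return 0;   /* FALSE: the details are in the last-error slot */
    }
    return 1;       /* TRUE: the last-error slot is left untouched */
}
```

The caller checks the return value first and reads the last error only after a failure, right away, before another call on the same thread overwrites the slot.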

You can use WinDBG's !ext.gle or Visual Studio's $ERR trick to retrieve the last error code from the TEB; both also work with dump files.

The good side of storing the last error in a central place (e.g. the TEB) is the ability to set a data breakpoint to see where the error is coming from. Later versions of Windows also have a feature that does similar things.

The Registry API in advapi32.dll is special: the Windows error code is returned directly as a LONG (signed long).

WinSock and NetAPI have a similar concept but use different mappings.

NTSTATUS

The lower layers of the operating system make use of NTSTATUS, which has a structure similar to that of a Win32 error. An incomplete mapping table can be found in Mapping NT Status Error Codes to Win32 Error Codes.

For LSA (Local Security Authority) specifically, the NTSTATUS return value can be converted to a Win32 error using LsaNtStatusToWinError.
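
To make the layout of an NTSTATUS value more concrete, here is a small sketch in portable C that splits one into its documented fields: severity in the top two bits, facility in bits 16 to 27, code in the low 16 bits. The type and helper names (nt_status, nt_success and so on) are local stand-ins written for this example, not the definitions from the Windows headers; only the numeric values mirror STATUS_SUCCESS and STATUS_ACCESS_VIOLATION.

```c
#include <assert.h>
#include <stdint.h>

typedef int32_t nt_status;  /* local stand-in for NTSTATUS (signed 32-bit) */

/* Same numeric values as STATUS_SUCCESS and STATUS_ACCESS_VIOLATION. */
#define MY_STATUS_SUCCESS          ((nt_status)0x00000000)
#define MY_STATUS_ACCESS_VIOLATION ((nt_status)0xC0000005)

/* Same test as the NT_SUCCESS macro: success and informational
   values are non-negative, warnings and errors are negative. */
static int nt_success(nt_status s) { return s >= 0; }

/* Severity: top two bits (0 success, 1 informational, 2 warning, 3 error). */
static unsigned nt_severity(nt_status s) { return ((uint32_t)s >> 30) & 0x3u; }

/* Facility: bits 16..27. */
static unsigned nt_facility(nt_status s) { return ((uint32_t)s >> 16) & 0xFFFu; }

/* Code: low 16 bits. */
static unsigned nt_code(nt_status s) { return (uint32_t)s & 0xFFFFu; }

/* Human-readable severity, handy when logging. */
static const char *nt_severity_name(nt_status s)
{
    static const char *const names[] = {
        "success", "informational", "warning", "error"
    };
    unsigned sev = nt_severity(s);
    assert(sev < 4);            /* always true: the field is two bits wide */
    return names[sev];
}
```

For 0xC0000005 this yields severity 3 (error), facility 0 and code 5.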

HRESULT

COM makes use of HRESULT, which was designed to be a super container for all kinds of error codes. There is a macro, HRESULT_FROM_WIN32 (in later versions this has been changed to an inline function), which converts a Windows error code to an HRESULT.

One caveat about HRESULT is that you always have to keep S_OK, S_FALSE, SUCCEEDED and FAILED in mind, and understand the differences between them.
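
The difference between S_OK, S_FALSE, SUCCEEDED and FAILED, and the effect of HRESULT_FROM_WIN32, can be shown with a few lines of portable C. The definitions below are local re-creations written for this sketch (hence the my_ and MY_ prefixes), not the ones from the Windows headers, although the numeric values and the logic are meant to mirror them.

```c
#include <assert.h>
#include <stdint.h>

typedef int32_t hresult;        /* local stand-in for HRESULT */

#define MY_S_OK           ((hresult)0)
#define MY_S_FALSE        ((hresult)1)   /* a success code, despite the name */
#define MY_FACILITY_WIN32 7u

/* SUCCEEDED / FAILED only look at the sign bit. */
static int my_succeeded(hresult hr) { return hr >= 0; }
static int my_failed(hresult hr)    { return hr < 0; }

/* Mirrors HRESULT_FROM_WIN32: values that are already zero or negative pass
   through; otherwise the low 16 bits are kept, the Win32 facility is added
   and the failure bit is set. */
static hresult my_hresult_from_win32(uint32_t err)
{
    if ((hresult)err <= 0)
        return (hresult)err;

    hresult hr = (hresult)((err & 0x0000FFFFu)
                           | (MY_FACILITY_WIN32 << 16)
                           | 0x80000000u);
    assert(hr < 0);             /* the failure bit is always set here */
    return hr;
}
```

Note that my_succeeded(MY_S_FALSE) is true, so code that needs to know whether the operation actually did something has to compare against S_OK explicitly. A Win32 error of 5 (ERROR_ACCESS_DENIED) maps to 0x80070005.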

Since there are so many different kinds of error codes, even people working at Microsoft may get confused. That's why people tend to create tools to make the situation better:

  1. WinDBG extension command !ext.error, which supports both Windows Error Code and NTSTATUS.
  2. Visual Studio debugger.
  3. Error Code Look-up tool, implemented by the Exchange team.