What's wrong with this code, part 6

Today, let’s look at a trace log writer.  It’s the kind of thing that you’d find in many applications; it simply does a printf and writes its output to a log file.  In order to have maximum flexibility, the code re-opens the file every time the application writes to the filelog.  But there’s still something wrong with this code.

This “what’s wrong with this code” is a little different.  The code in question isn’t incorrect as far as I know, but it still has a problem.  The challenge is to understand the circumstances in which it doesn’t work.

/*++
* LogMessage
* Trace output messages to log file.
*
* Inputs:
* FormatString - printf format string for logging.
*
* Returns:
* Nothing
*
*--*/
void LogMessage(LPCSTR FormatString, ...)
#define LAST_NAMED_ARGUMENT FormatString
{
CHAR outputBuffer[4096];
LPSTR outputString = outputBuffer;
size_t bytesRemaining = sizeof(outputBuffer);
ULONG bytesWritten;
bool traceLockHeld = false;
HANDLE traceLogHandle = NULL;
va_list parmPtr; // Pointer to stack parms.
EnterCriticalSection(&g_TraceLock);
traceLockHeld = TRUE;
//
// Open the trace log file.
//
traceLogHandle = CreateFile(TRACELOG_FILE_NAME, FILE_APPEND_DATA, FILE_SHARE_READ, NULL, OPEN_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);
if (traceLogHandle == INVALID_HANDLE_VALUE)
{ goto Exit;
}
//
// printf the information requested by the caller onto the buffer
//
va_start(parmPtr, FormatString);
StringCbVPrintfEx(outputString, bytesRemaining, &outputString, &bytesRemaining, 0, FormatString, parmPtr);
va_end(parmPtr); //
// Actually write the bytes.
//
DWORD lengthToWrite = static_cast<DWORD>(sizeof(outputBuffer) - bytesRemaining);
if (!WriteFile(traceLogHandle, outputBuffer, lengthToWrite, &bytesWritten, NULL))
{
goto Exit;
}
if (bytesWritten != lengthToWrite)
{
goto Exit;
}
Exit:
if (traceLogHandle)
{
CloseHandle(traceLogHandle);
}
if (traceLockHeld)
{
LeaveCriticalSection(&g_TraceLock);
traceLockHeld = FALSE;
}
}

One hint: The circumstance I’m thinking of has absolutely nothing to do with out of disk space issues. 

As always, answers and kudos tomorrow.

Comments

  • Anonymous
    September 08, 2004
    In Exit, should check traceLogHandle against INVALID_HANDLE_VALUE.

    Hard coded file name
    Mixing bool & BOOL

    Uses over 4k of stack.

    WriteFile may only write part of the buffer, need to loop until all buffer is written.
  • Anonymous
    September 08, 2004
    The comment has been removed
  • Anonymous
    September 08, 2004
    The comment has been removed
  • Anonymous
    September 08, 2004
    Good points mschaef:

    I wasn't concerned with the scalability of LogMessage - for 99% of its uses it's not going to affect things enough. This isn't to say that it's not something that can happen, but...

    Opening&Closing the log file CAN be a perf bottleneck. But I wasn't concerned with perf in this example.

    Network access - you're right, that could be an issue, but if there are two instances of the app running locally, it would have the same problem. It's a valid circumstance, realistically there should be a retry loop around the CreateFile call to handle that case.
  • Anonymous
    September 08, 2004
    "I wasn't concerned with the scalability of LogMessage - for 99% of its uses it's not going to affect things enough."

    Part of my concern was scalability, part was debug-ability. I have visions of apps using this function failing with logging turned off, and succeeding with logging turned on, thanks to the extra synchronization in LogMessage.

    I guess a scalable log facility might maintain a queue for each worker thread, and a seperate log thread that waits for incoming log messages on any of the source queues, and dumps them to disk. At least then, the worker threads would only potentially serialize with the log thread. This approach would raise some interesting issues surrounding message ordering...
  • Anonymous
    September 08, 2004
    The comment has been removed
  • Anonymous
    September 08, 2004
    Why can't you use OutputDebugString and keep the complexity of writing it to a file in another process?
  • Anonymous
    September 08, 2004
    Because OutputDebugString only outputs stuff when there's a debugger attached. And the idea behind logging tools like setupapi.log or other tools is that they passively log until there's a problem, then someone can use the debug log to diagnose the problem.

    No tools needed, just copying a text file to a network share.

    Logging to a file can be an extraordinarily powerful tool, especially when trying to debug complicated distributed failures.
  • Anonymous
    September 08, 2004
    Hmm, I don't think I have the answer. But a few things:

    1. You are not checking the return value of StringCbVPrintfEx, your buffer may be too small, thus truncating the log message.

    2. The last 2 params of StringCbVPrintfEx could be null, it is an error to pass null args without appropriate flags.

    -Sayavanam
  • Anonymous
    September 08, 2004
    Two comments:

    1. You're mixing charset "neutral" APIs like StringCbVPrintfEx() with charset "specific" types like LPCSTR. If the format string is always LPCSTR, then you should probably use StringCbVPrintfExA(). Otherwise, you should probably use LPCTSTR for the FormatString.

    2. Is it safe to pass outputString as the first and third parameters to StringCbVPrintfEx()? (The same goes for passing bytesRemaining as the second and third parameters.) It seems like one of those things that should be OK, but the documentation doesn't guarantee it.
  • Anonymous
    September 08, 2004
    Call me paranoid, but I'd feel a lot safer with code like this if there was a try/finally (or equivalent SEH).
    --Grant
  • Anonymous
    September 08, 2004
  1. Not using TCHAR, LPTSTR with generic StringCbVPrintfEx

    2) Not checking return code from StringCbVPrintfEx. If a failure occurs, can you count on bytesRemaining?

    3) Not checking FormatString for NULL

    4) Are size_t and DWORD safe to use interchangibly under 64 bit?
  • Anonymous
    September 08, 2004
    The comment has been removed
  • Anonymous
    September 08, 2004
    The comment has been removed
  • Anonymous
    September 08, 2004
    The comment has been removed
  • Anonymous
    September 08, 2004
    Well, if you compile it with UNICODE turned on you'll get a compilation error from the StringCbVPrintfEx call because you've declared everything else as byte-oriented strings. Changing it to StringCbVPrintfExA should sort that out, though.

    Could it be that you're not impersonating when the first call is made, and the file doesn't exist yet, but subsequently you're impersonating? If you're running as one of the service accounts, the NULL lpSecurityDescriptor causes the file to be created owned by that user. The subsequent calls to LogMessage, if we're impersonating at the time, fail at the CreateFile call because the user doesn't have permissions to write to the file.

    The solution is to specify an appropriate security descriptor when creating the file.
  • Anonymous
    September 08, 2004
    Good catch with the impersonation issue there. Sounds plasubile if the function is called during service startup, and then mostly while handling client requests.
  • Anonymous
    September 08, 2004
    Bingo! That's the problem I was thinking about Mike.

    But the problem doesn't only happen with service accounts. It can happen with normal users as well.

    Skywing: If I had anything to do with it, try/except should be removed from the language. You're right that EnterCriticalSection can throw, but that only happens if you didn't call InitializeCriticalSectionEx - I purposely left that call out (for size reasons), but the semaphore isn't the problem.

    StringCbVprintfEx doesn't allocate heap memory to my knowledge.
  • Anonymous
    September 08, 2004
    Remember that InitializeCriticalSectionAndSpinCount doesn't exist on NT4, though. So you can't avoid the problem there unless you use undocumented APIs.
  • Anonymous
    September 08, 2004
    The comment has been removed
  • Anonymous
    September 08, 2004
    Skywing: that's true, but I'll be honest and say that for these "what's wrong with this code" examples, they're generally about showing up a common mistake, or general principal. As a result, I don't tend to consider issues like code working on lower level revisions of the OS from the current one.
  • Anonymous
    September 08, 2004
    The comment has been removed
  • Anonymous
    September 08, 2004
    I think you misread the code there; traceLockHeld is a local and not a static or global variable.
  • Anonymous
    September 08, 2004
    Actually EnterCriticalSection can throw if you are low on memory on the machine as well. There are many reasons things can through, and catch them will always save you. But everything has its pros can cons. My basic idea is that you write all code "throw" safe, and only guard crucial common section and the outer limits of each thread, so noone thread can bring down the process.

    Nice catch on the file securities, totally forgot about that one, happened to me a few times, but so easy to forget =)
  • Anonymous
    September 08, 2004
    Niclas, as to what's wrong with try/catch, consider this:
    http://blogs.msdn.com/LarryOsterman/archive/2004/05/18/134471.aspx

    Also, try/catch catches C++ exceptions, not SEH exceptions.

    It's VERY difficult to catch SEH exceptions correctly, 90% of the people out there don't know how to do it (consider the guard page exception mentioned in the linked post).

    Simply wrapping code in a try/except is absolutely NOT the way to go - that's guaranteed to be incorrect.
  • Anonymous
    September 08, 2004
    Only users with enough privileges may access the log file, that's not necessarily a problem.

    I don't see the point of traceLockHeld. The first thing the function does is a call to EnterCriticalSection, so it should always call ExitCriticalSection before exiting.

    Anyway, I'm glad this code uses goto instead of complicated nested if's or exceptions.
  • Anonymous
    September 08, 2004
    B.Y. There's a lot of code that was excised, and in that code, the TraceLockHeld variable made sense.
  • Anonymous
    September 08, 2004
    You could also use InitializeCriticalSectionAndSpinCount instead of EnterCriticalSection in case of low memory situations.
  • Anonymous
    September 08, 2004
    I was too late to throw in my 2 cents but I had a hunch that it had something to do with the recent bombardment of SID related posts.

    It does, in a NTFS sort of way, round-about, kind of.

    Having a file created with a service account is fun to try and read from say a normal user. Also in some instances Administrators can be locked out from certain service accounts so under certain circumstances it's possible to create a file that no one has access to. You would have to code some kind of cleanup mechanism to fix it, if such a rare occurance did happen.
  • Anonymous
    September 08, 2004
    Yup, actually this one was only vaguely related - someone came to my office on Friday and mentioned they were having a problem related to this issue, that's why I decided to write it up.

  • Anonymous
    September 08, 2004
    The comment has been removed
  • Anonymous
    September 08, 2004
    The comment has been removed
  • Anonymous
    September 08, 2004
    Niclas, you bring up enough interesting points that I want to take them in turn in a separate post, so I'm going to defer responding to that part until somewhat later.
  • Anonymous
    September 08, 2004
    Although this was not an intended answer, in my experience it is one: When CreateFile fails it can return either 0 or -1. I don't recall which of these values is INVALID_HANDLE_VALUE, but it doesn't matter. A test for failure has to do <= 0 instead of == INVALID_HANDLE_VALUE.

    Another unintended answer: Any handle might happen to be the straw that breaks the camel's back, yielding a BSOD.
    http://support.microsoft.com/?scid=kb;ja;195857
    (According to the article MS only reproduced the problem when the handles are for registry keys, but the description implies that the problem can happen with any kind of handle.)
  • Anonymous
    September 08, 2004
    IIRC, that bug is only present on Win9x, and only then when opening handles to drivers.
  • Anonymous
    September 08, 2004
    I agree with you Larry, these are very interesting topics, sadly and usually it becomes a religious war, especially for parameter validation.

    I wish there was a simple rule above all, my simple rule is, never crash. But then again I wouldn't want to take that to the extreme either, sometimes you have no options but to restart the application.

    There are cons and pros with everything and I eagerly await your answer, as it is always fun to get insight into more intricate problems =), besides you have a ton of interesting points.


    I would also like to add that of course even if the application mentioned in my example would reboot all night until support personal got there (usually within the hour depending on how sleepy they are) it would redirect all calls to a backup system if it notices that it is thrashing a certain protocol. The backup system usually is a low capacity hardware based receiver.

    What we wanted to prevent was that one protocol could group punish the rest of the system =). And basically we know, even if we try to write fault free code, we know that we will make mistakes, so it is safer to also make it fault tolerant.
  • Anonymous
    September 08, 2004
    Regarding exceptions, I totally agree with you Larry, that an API like this should not swallow unexpected exceptions.

    Deciding on life-or-death issues for the whole process in the face of unexpected exceptions is the responsibility of a higher-level function.

    Not something that a low-level function like LogMessage should do.

    But, I think that to be perfectly safe you should still use __try __finally to avoid leaving open handles and held critical sections.
    Each function should guarantee that it performs its own cleanup properly even under unexpected circumstances.

    You could rewrite the function like this:

    void LogMessage(LPCSTR FormatString, ...)
    #define LAST_NAMED_ARGUMENT FormatString
    {
    CHAR outputBuffer[4096];
    LPSTR outputString = outputBuffer;
    size_t bytesRemaining = sizeof(outputBuffer);
    ULONG bytesWritten;
    bool traceLockHeld = false;
    HANDLE traceLogHandle = NULL;
    va_list parmPtr; // Pointer to stack parms.
    EnterCriticalSection(&g_TraceLock);
    __try {
    //
    // Open the trace log file.
    //
    traceLogHandle = CreateFile(TRACELOG_FILE_NAME, FILE_APPEND_DATA, FILE_SHARE_READ, NULL, OPEN_ALWAYS, FILE_ATTRIBUTE_NORMAL, NULL);
    if (traceLogHandle == INVALID_HANDLE_VALUE)
    // return inside a __try block passes control to the __finally block
    return;
    //
    // printf the information requested by the caller onto the buffer
    //
    va_start(parmPtr, FormatString);
    StringCbVPrintfEx(outputString, bytesRemaining, &outputString, &bytesRemaining, 0, FormatString, parmPtr);
    va_end(parmPtr);
    //
    // Actually write the bytes.
    //
    DWORD lengthToWrite = static_cast<DWORD>(sizeof(outputBuffer) - bytesRemaining);
    if (!WriteFile(traceLogHandle, outputBuffer, lengthToWrite, &bytesWritten, NULL))
    // return inside a __try block passes control to the __finally block
    return;
    }
    }
    __finally {
    if (traceLogHandle != INVALID_HANDLE_VALUE) {
    CloseHandle(traceLogHandle);
    }
    LeaveCriticalSection(&g_TraceLock);
    }
    }


    Imagine that the user of LogMessage tries to log a message with a badly formatted printf-string - perhaps using %s instead of %d for example.

    Your original code would have left the critical section held which would mean that no other thread in the process could log anything.

  • Anonymous
    September 08, 2004
    The comment has been removed
  • Anonymous
    September 08, 2004
    buffer overflow?!?!?!
  • Anonymous
    September 09, 2004
    "Because OutputDebugString only outputs stuff when there's a debugger attached."

    This isn't true. And since I don't remember the details of how to get at this without a debugger, I will just say go to www.sysinternals.com and look at the debugview it doesn't attach to a process, it just intercepts the messages.


    I actually wrote my own version of debugview before discovering the tool, perhaps I will dedicate some blog space to how to achieve this.


    Evan
    http://EvanFreeman.blogspot.com

  • Anonymous
    September 09, 2004
    Evan, there needs to be SOME application running to intercept the debug trace calls. They don't get queued up somewhere waiting for them to be written.

    So if the user isn't running the trace monitoring tool, they don't see the output.

    Which means that we lose an opportunity to post-mortem a failure.

  • Anonymous
    September 09, 2004
    I will just add another thought to this, since I never really pointed it out.

    I do agree that lower level functions should not do exception handling, it is the responsibility of the higher level logic.

    However in this case, the log function is indeed acquiring a lock, which it must not ever forget to release, or it will block all threads in the entire process. That is why this is an exception to the rule. And API:s that handle internal resources must also make sure to supervise and care for them.

    Which is why I proposed using SEH handling with try/finally, but it can also be achieved with C++ exception handling, which ismore portable.

    And about performance of a logger you have a few options, they all have pros and cons.

    One way is to merely queue the log call, and let another low prio thread handle the actual logging. However if you do that, you might not get the logs during a failure(if you don't care for exception handling properly).

    My own thoughts though are that you might as well do all the work right away, since logging should be an exception or failure case, which means we most likely have time to log.


    There are however instances when you do want a seperate thread doing the actual logging, for instance if you are logging from a message pump to a window that message pump is handling. If you log anything before the message pump starts you will hang (if you are using synchronous signalling).
  • Anonymous
    September 12, 2004
    9/10/2004 12:14 AM Niclas Lindgren

    > since logging should be an exception or
    > failure case,

    Not really. In the Windows world, remember that even Windows 95 would write boot logs reporting successful driver start-ups, because at the point where the log ended the user might be able to guess that the failure came after that point. In the modern Windows world, it is often prudent to log every successful login, and maybe successful access to files and registry keys etc. In the Internet world, consider that it is often prudent to log every packet that goes in or out.