Recursively deleting a directory, with long filename support

I was recently updating some test code to support long filenames (paths longer than MAX_PATH).

My initial cut at the function was something like the following (don’t worry about the VERIFY_ macros, they’re functionally equivalent to asserts):

const PCWSTR LongPathPrefix = L"\\\\?\\";

void RecursivelyDeleteDirectory(const std::wstring &strDirectory)
{
    //  Canonicalize the input path to guarantee it's a full path.
    std::wstring longDirectory(GetFullPath(strDirectory));

    //  If the path doesn't have the long path prefix, add it now before we instantiate the
    //  directory_list class.
    std::wstring strPath;
    if (longDirectory.find(LongPathPrefix) == std::wstring::npos)
    {
        strPath = LongPathPrefix;
    }
    strPath += longDirectory;
    strPath += L"\\*";

    directory_list dl(strPath);
    for (const auto &it : dl)
    {
        std::wstring str(longDirectory+L"\\"+it);

        //  It’s possible that the addition of the local filename might push the full path over MAX_PATH so ensure that the filename has the LongPathPrefix.
        if (str.find(LongPathPrefix) == std::wstring::npos)
        {
            str = LongPathPrefix+str;
        }

        DWORD dwAttributes = GetFileAttributes(str.c_str());
        VERIFY_ARE_NOT_EQUAL(dwAttributes, INVALID_FILE_ATTRIBUTES); // Check for error.
        if (dwAttributes & FILE_ATTRIBUTE_DIRECTORY)
        {
            if (it != L"." && it != L"..")
            {
                RecursivelyDeleteDirectory(str);
            }
        }
        else
        {
            VERIFY_WIN32_BOOL_SUCCEEDED(DeleteFile(str.c_str()));
        }
    }
    VERIFY_WIN32_BOOL_SUCCEEDED(RemoveDirectory(longDirectory.c_str()));
}

The weird thing was that this code worked perfectly on paths shorter than MAX_PATH. But the call to GetFileAttributes failed 100% of the time as soon as the directory name got longer than MAX_PATH. It wasn’t that the GetFileAttributes API didn’t understand long filenames – it’s documented as working correctly with long filenames.

So what was going on?

I wrote a tiny little program that just had the call to GetFileAttributes and tried it on a bunch of input filenames.

Running the little program showed me that \\?\C:\Directory\Filename worked perfectly. But \\?\C:\Directory\. (note the trailing “.”) failed every time.

It took a few minutes but I finally remembered something I learned MANY decades ago: On an NTFS filesystem, the “.” and “..” directories don’t actually exist. Instead they’re pseudo directories inserted into the results of the FindFirstFile/FindNextFile API.

Normally the fact that these pseudo directories don’t exist isn’t a problem, since the OS canonicalizes the filename and strips off the “.” and “..” components before it passes the path on to the underlying API.

But if you use the long filename prefix (\\?\), the OS assumes that all filenames are already canonical and skips that step. And on an NTFS filesystem, there is no directory named “.”, so the API call fails!
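To make the difference concrete, here is a rough, portable model of the behavior described above. It is only a sketch: CollapseDotSegments and PathSeenByFilesystem are hypothetical helpers written for illustration, they assume a drive-letter absolute path, and they are far simpler than the normalization Windows actually performs.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// The literal prefix \\?\ that tells Win32 to skip path normalization.
const std::wstring kLongPathPrefix = L"\\\\?\\";

// Hypothetical helper: collapses "." and ".." segments in a backslash-separated
// path such as C:\Dir\Sub\..\File. Simplified model, not the real OS algorithm.
std::wstring CollapseDotSegments(const std::wstring &path)
{
    std::vector<std::wstring> kept;
    std::wstring segment;
    for (std::size_t i = 0; i <= path.size(); ++i)
    {
        if (i == path.size() || path[i] == L'\\')
        {
            if (segment == L"..")
            {
                // Step up one level, but never remove the drive ("C:").
                if (kept.size() > 1) kept.pop_back();
            }
            else if (segment != L"." && !segment.empty())
            {
                kept.push_back(segment);
            }
            segment.clear();
        }
        else
        {
            segment += path[i];
        }
    }

    std::wstring result;
    for (std::size_t i = 0; i < kept.size(); ++i)
    {
        if (i != 0) result += L'\\';
        result += kept[i];
    }
    return result;
}

// Hypothetical helper: models the path the filesystem ends up being asked about.
// Prefixed paths are passed through verbatim; everything else is normalized first.
std::wstring PathSeenByFilesystem(const std::wstring &path)
{
    if (path.compare(0, kLongPathPrefix.size(), kLongPathPrefix) == 0)
    {
        return path.substr(kLongPathPrefix.size());
    }
    return CollapseDotSegments(path);
}
```

In this model, C:\Directory\. is reduced to C:\Directory before the filesystem sees it, while \\?\C:\Directory\. still ends in the “.” component, which is the name NTFS cannot find.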

What was the fix? Simply reverse the check for “.” and “..” and put it outside the call to GetFileAttributes. That way we never ask the filesystem for these invalid directory names.
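The reordering can be sketched roughly as follows. This is a portable illustration rather than the actual fixed function: IsDotOrDotDot and BuildChildPaths are hypothetical names, and the entry names are passed in as a plain vector instead of coming from the directory_list class.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: true for the "." and ".." pseudo-entries that
// FindFirstFile/FindNextFile report but that do not exist on disk.
bool IsDotOrDotDot(const std::wstring &name)
{
    return name == L"." || name == L"..";
}

// Hypothetical helper: builds the full child paths that are safe to hand to
// the filesystem. The pseudo-entries are filtered out first, so no later call
// (GetFileAttributes in the post's code) ever sees a path ending in "." or "..".
std::vector<std::wstring> BuildChildPaths(const std::wstring &prefixedDirectory,
                                          const std::vector<std::wstring> &entries)
{
    std::vector<std::wstring> result;
    for (const auto &name : entries)
    {
        if (IsDotOrDotDot(name))
        {
            continue; // Skip before any filesystem call is made.
        }
        result.push_back(prefixedDirectory + L"\\" + name);
    }
    return result;
}
```

In the function from the post, the equivalent change is to test `it` against L"." and L".." at the top of the loop body and skip those entries, so that the GetFileAttributes call and the directory/file branch only run for real entries.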

Comments

  • Anonymous
    November 16, 2015
    Whether or not FAT filesystems support filenames longer than MAX_PATH, FAT filesystems can still be accessed with \\?\ paths. But still, it seems odd to not have a unified set of filename semantics, because that means that if you simply need to support long paths, you need to know how to handle the filename semantics of every possible underlying filesystem your code might run on - including those of filesystems that haven't been written yet! Compared to that, it would seem to make much more sense to require filesystems to support a common set of semantics, and require that the NTFS driver handle "." and ".." entries whether or not they exist on disk. Also, I'd note that you had already special-cased the "." and ".." entries in your original code. Just not in a way that worked, due to the inconsistent semantics.

  • Anonymous
    November 16, 2015
    "When you use \\?\ you're essentially taking the training wheels off. Part of that means that you have to know what you're doing." Thanks for this post, Larry. The MSDN docs make it seem like "\\?\" is just a way to circumvent the MAX_PATH length limitation and don't make the broader implications clear at all.

  • Anonymous
    November 17, 2015
    Welcome back to blogging, Larry! It's been a while.

  • Anonymous
    November 17, 2015
    Did you get out of jail Larry? I haven't seen a post from you since when, 2012? :)

  • Anonymous
    November 19, 2015
    Karellen: There is no such long-filename equivalent. The problem is that there are too many apps that depend on FFF/FNF returning the "." and ".." entries for such an API to be successful. The FindFirstFileEx API supports long filenames (FindFirstFile is not documented as supporting long filenames).