Properties coding expedition #5 - Stripping characters

In Part 4, I discovered that WideCharToMultiByte converts certain invisible non-spacing Unicode characters to ?. This makes the output look really silly in a command line application. I want to keep this as a command line application, so I need to strip these characters away. A simple helper solves this rather neatly:

void _StripCharacters(__inout PWSTR pszText, __in PCWSTR pszRemove)
{
    PWSTR pszSource = pszText;  // read position
    PWSTR pszDest = pszSource;  // write position; never passes pszSource
    while (*pszSource)
    {
        // Skip copying characters found in pszRemove
        if (!StrChr(pszRemove, *pszSource))
        {
            *pszDest = *pszSource;
            pszDest++;
        }
        pszSource++;
    }
    *pszDest = L'\0';   // null-terminate the compacted string
}

This modifies the input string in place, omitting any characters found in pszRemove. Because characters are only ever removed, the result always fits in the original buffer. Nothing fancy. Now I call it when I want to send a string to the console:
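For anyone who wants to try the same two-pointer compaction outside of Windows, here is a minimal sketch. It assumes std::wcschr as a stand-in for StrChr, and the name StripCharactersPortable is hypothetical; it is not part of the tool itself.

```cpp
#include <cassert>
#include <cwchar>

// Portable sketch of the helper above. Assumption: std::wcschr stands in
// for the Windows-only StrChr. The function name is hypothetical.
void StripCharactersPortable(wchar_t* text, const wchar_t* remove)
{
    assert(text != nullptr && remove != nullptr);

    wchar_t* source = text;  // next character to examine
    wchar_t* dest = text;    // next slot for a character we keep
    while (*source != L'\0')
    {
        // Keep the character only if it is absent from the removal set.
        if (std::wcschr(remove, *source) == nullptr)
        {
            *dest++ = *source;
        }
        ++source;
    }
    *dest = L'\0';  // terminate the (possibly shorter) result
}
```

One thing to watch when writing test strings for this: a \x escape swallows every hex digit that follows it, so L"\x200e9" is a single (different) character rather than LRM followed by 9. Writing L"\u200E9" avoids that, because \u always takes exactly four digits.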

... from part 3 ...
PWSTR pszValue;
hr = ppropdesc->FormatForDisplay(propvar, PDFF_DEFAULT, &pszValue);
if (SUCCEEDED(hr))
{
    // Strip the Unicode directional formatting characters:
    // LRM RLM LRE RLE PDF LRO RLO
    _StripCharacters(pszValue, L"\x200e\x200f\x202a\x202b\x202c\x202d\x202e");
    wprintf(L"%s: %s\n", pszLabel, pszValue);
    CoTaskMemFree(pszValue);
}
...
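The removal string in that call packs seven directional formatting characters into hex escapes, which is hard to read at a glance. One way to make the set self-documenting is to name each code point. This is only a sketch: the constant and function names are hypothetical, and std::wcschr is again assumed in place of StrChr.

```cpp
#include <cassert>
#include <cwchar>

// Hypothetical named constants for the seven characters stripped above.
constexpr wchar_t kLeftToRightMark      = L'\u200E';  // LRM
constexpr wchar_t kRightToLeftMark      = L'\u200F';  // RLM
constexpr wchar_t kLeftToRightEmbedding = L'\u202A';  // LRE
constexpr wchar_t kRightToLeftEmbedding = L'\u202B';  // RLE
constexpr wchar_t kPopDirectionalFormat = L'\u202C';  // PDF
constexpr wchar_t kLeftToRightOverride  = L'\u202D';  // LRO
constexpr wchar_t kRightToLeftOverride  = L'\u202E';  // RLO

// Null-terminated so it can be searched with wcschr or passed as a removal set.
constexpr wchar_t kBidiControls[] = {
    kLeftToRightMark, kRightToLeftMark, kLeftToRightEmbedding,
    kRightToLeftEmbedding, kPopDirectionalFormat, kLeftToRightOverride,
    kRightToLeftOverride, L'\0'
};
static_assert(sizeof(kBidiControls) / sizeof(kBidiControls[0]) == 8,
              "seven characters plus the terminator");

bool IsBidiControl(wchar_t ch)
{
    // wcschr also matches the terminator, so rule that out first.
    return ch != L'\0' && std::wcschr(kBidiControls, ch) != nullptr;
}

// In-place removal of the characters above, same compaction as the helper.
void StripBidiControls(wchar_t* text)
{
    assert(text != nullptr);
    wchar_t* dest = text;
    for (const wchar_t* source = text; *source != L'\0'; ++source)
    {
        if (!IsBidiControl(*source))
        {
            *dest++ = *source;
        }
    }
    *dest = L'\0';
}
```

The trade-off is a few more lines in exchange for not needing the abbreviation comment to decode the escapes.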

Now the output is free of those annoying question marks:

Date last saved: 9/29/2006 10:12 PM
Width: 1139 pixels
Height: 769 pixels
Horizontal resolution: 200 dpi
Vertical resolution: 200 dpi
Bit depth: 24
Dimensions: 1139 x 769

Comments

  • Anonymous
    November 05, 2006
    This coding expedition has developed a tool that can dump out all the properties on a file. If you are