Issue description
I am creating an abstraction layer over portable executable (PE) resources, and am trying to identify invalid inputs that should be rejected. Through testing I got the impression that certain non-ASCII lower case letters in resource names may not be correctly handled by resource-related Win32 APIs. For example, using LoadIconW with the icon resource name MyIcon or MyIcÖn (upper case o umlaut) works fine, but MyIcön (lower case o umlaut) fails to load.
Basic reproduction steps
1. Create the following files:
   - app.ico (any PE-compatible icon file)
   - app.cpp
       #include <iostream>
       #include <Windows.h>
       #pragma comment(lib, "user32")
       int main() {
           const auto exe = GetModuleHandle(nullptr);
           for (const auto& name : {
               L"myicon", L"MyIcon", L"mYiCoN", L"MYICON", // variants of o
               L"myicÖn", L"MyIcÖn", L"mYiCÖN", L"MYICÖN", // variants of Ö
               L"myicön", L"MyIcön", L"mYiCöN", L"MYICöN", // variants of ö
           }) {
               const auto icon = LoadIconW(exe, name);
               std::wcout << L"Could load \"" << name << "\": " << (icon != nullptr ? "yes" : "no") << std::endl;
           }
           return 0;
       }
   - app.rc
       MyIcon ICON app.ico
   - run.bat
       rc.exe app.rc
       cl.exe /EHsc app.cpp app.res
       app.exe
2. Execute run.bat from a Visual Studio Developer command prompt
3. Change the resource name in app.rc to MyIcÖn, and execute run.bat again
4. Change the resource name in app.rc to MyIcön, and execute run.bat again
Actual behavior
Steps 2 - 4 result in:
| Arg | Can load MyIcon | Can load MyIcÖn | Can load MyIcön |
| --- | --- | --- | --- |
| myicon | Yes | No | No |
| MyIcon | Yes | No | No |
| mYiCoN | Yes | No | No |
| MYICON | Yes | No | No |
| myicÖn | No | Yes | No (!) |
| MyIcÖn | No | Yes | No (!) |
| mYiCÖN | No | Yes | No (!) |
| MYICÖN | No | Yes | No (!) |
| myicön | No | Yes (?) | No (!) |
| MyIcön | No | Yes (?) | No (!) |
| mYiCöN | No | Yes (?) | No (!) |
| MYICöN | No | Yes (?) | No (!) |
Observations:
- The resource of step 4 (MyIcön) cannot be loaded at all.
- The resource of step 3 (MyIcÖn) can be loaded with both Ö and ö.
- Resource names in the PE file are stored in UTF-16 encoding, allowing for general support for non-ASCII letters. For example, 𝄞 is stored as the surrogate pair 0xd834, 0xdd1e.
- rc.exe converts ASCII letters in resource names to upper case, so MyIcon becomes MYICON in the output files.
- Non-ASCII lower case letters are not treated the same way, with MyIcön becoming MYICöN instead of MYICÖN.
- The argument passed to LoadIconW appears to undergo upper case conversion (ASCII and non-ASCII❗, potentially based on %WinDir%\System32\l_intl.nls) before being compared with the raw, length-prefixed resource name buffer.
Expected behavior
Consistent handling of all (lower case) letters. A resource name like MyIcön should either be converted to the expected upper case form by rc.exe (with the exact letter mapping publicly documented, so that cross-compilers that do not run on Windows can implement it), or be loadable the way it is (e.g. Ö==Ö, Ö!=ö, ö==ö, changing the Yes (?) entries in the table above to No). Being able to load such a resource would be preferable, as it would simplify generating and parsing/validating PE files.
Tested versions
- C++ Compiler: 19.42.34436
- Resource Compiler: 10.0.10011.16384
- Windows: Windows 11 23H2 (22631.4751)
Slightly more involved reproduction steps
Resources can be enumerated using EnumResourceTypesExW, EnumResourceNamesExW, and EnumResourceLanguagesExW. Replace the app.cpp content with the following code and repeat steps 2 - 4 of the basic reproduction steps:
#include <iostream>
#include <Windows.h>
#pragma comment(lib, "user32")

// Prints a message followed by the last Win32 error in decimal and hex.
void printLastError(_In_ LPCSTR lpMessage) {
    const auto dwLastError = GetLastError();
    std::cout
        << lpMessage
        << ": "
        << dwLastError
        << std::hex
        << " (0x"
        << dwLastError
        << ")"
        << std::dec
        << std::endl
    ;
}

// Prints an integer resource ID, or a string name followed by its UTF-16
// code units in hex.
void printResourceString(_In_ LPCWSTR lpString) {
    if (IS_INTRESOURCE(lpString)) {
        // Integer IDs are carried in the low word of the pointer value.
        std::cout << static_cast<WORD>(reinterpret_cast<ULONG_PTR>(lpString));
    } else {
        std::wcout << lpString;
        // Note: std::hex stays in effect after this function returns, so
        // numbers printed later (e.g. the language ID) appear in hex.
        std::cout << " (" << std::hex;
        bool printWhitespace = false;
        while (*lpString != 0) {
            if (printWhitespace) std::cout << " ";
            // Cast so the code unit is printed as a number, not a character.
            std::cout << static_cast<unsigned>(*lpString++);
            printWhitespace = true;
        }
        std::cout << ")";
    }
}

BOOL CALLBACK enumRsrcLang(
    _In_opt_ HMODULE hModule,
    _In_ LPCWSTR lpType,
    _In_ LPCWSTR lpName,
    _In_ WORD wLanguage,
    _In_ LONG_PTR lParam
) {
    std::cout
        << " Lang: "
        << wLanguage
        << std::hex
        << " (0x" << wLanguage << ")"
        << std::dec
        << std::endl
    ;
    return TRUE;
}

BOOL CALLBACK enumRsrcName(
    _In_opt_ HMODULE hModule,
    _In_ LPCWSTR lpType,
    _In_ LPWSTR lpName,
    _In_ LONG_PTR lParam
) {
    std::cout << " Name: ";
    printResourceString(lpName);
    std::cout << " {" << std::endl;
    // lpName is passed on exactly as received from the name enumeration.
    if (EnumResourceLanguagesExW(hModule, lpType, lpName, enumRsrcLang, lParam, 0, 0) == FALSE) {
        printLastError(" Could not enumerate resource languages");
    }
    std::cout << " }" << std::endl;
    return TRUE;
}

BOOL CALLBACK enumRsrcType(
    _In_opt_ HMODULE hModule,
    _In_ LPWSTR lpType,
    _In_ LONG_PTR lParam
) {
    std::cout << "Type: ";
    printResourceString(lpType);
    std::cout << " {" << std::endl;
    if (EnumResourceNamesExW(hModule, lpType, enumRsrcName, lParam, 0, 0) == FALSE) {
        printLastError(" Could not enumerate resource names");
    }
    std::cout << "}" << std::endl;
    return TRUE;
}

int main() {
    if (EnumResourceTypesExW(nullptr, enumRsrcType, 0, 0, 0) == FALSE) {
        printLastError("Could not enumerate resource types");
    }
    return 0;
}
This results in the following output:
// Output for step 2: MyIcon
Type: 3 {
Name: 1 {
Lang: 1033 (0x409)
}
}
Type: 14 {
Name: MYICON (4d 59 49 43 4f 4e) {
Lang: 409 (0x409)
}
}
// Output for step 3: MyIcÖn
Type: 3 {
Name: 1 {
Lang: 1033 (0x409)
}
}
Type: 14 {
Name: MYICÖN (4d 59 49 43 d6 4e) {
Lang: 409 (0x409)
}
}
// Output for step 4: MyIcön
Type: 3 {
Name: 1 {
Lang: 1033 (0x409)
}
}
Type: 14 {
Name: MYICöN (4d 59 49 43 f6 4e) {
Could not enumerate resource languages: ERROR_RESOURCE_NAME_NOT_FOUND
}
}
The enumeration confirms the observations above: rc.exe stores MYICON and MYICÖN (code unit 0xd6) in upper case, but leaves ö unchanged in MYICöN (code unit 0xf6). For that name, EnumResourceLanguagesExW fails with ERROR_RESOURCE_NAME_NOT_FOUND (1814), even though the name passed to it is exactly the one EnumResourceNamesExW just returned.