When I read a UTF-8 text file in a Unicode C++ build, the CStdioFile::ReadString method fails to read certain non-ASCII characters correctly. For example, “John W. Gates” Day, is read as âJohn W. Gatesâ Day.
In memory I see this:
0x000001B291ADC258 e2 00 80 00 9c 00 4a 00 6f 00 68 00 6e 00 20 00 â.€.œ.J.o.h.n. .
0x000001B291ADC268 57 00 2e 00 20 00 47 00 61 00 74 00 65 00 73 00 W... .G.a.t.e.s.
0x000001B291ADC278 e2 00 80 00 9d 00 20 00 44 00 61 00 79 00 2c 00 â.€... .D.a.y
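To illustrate what the dump appears to show, here is a minimal, portable sketch. It is not MFC code, and the names widenBytes, decodeUtf8Bmp and checkDumpExample are made up for this example. It contrasts copying each UTF-8 byte into its own 16-bit unit (which matches the e2 00 80 00 9c 00 pattern above) with actually decoding the three-byte sequence. The decoder is deliberately simplified: it assumes 1- to 3-byte sequences only and does not reject overlong encodings.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Copies every byte into its own 16-bit unit without decoding it.
// This reproduces the pattern seen in the memory dump.
std::u16string widenBytes(const std::string& bytes) {
    std::u16string out;
    for (unsigned char b : bytes) {
        out.push_back(static_cast<char16_t>(b));
    }
    return out;
}

// Simplified UTF-8 decoder (assumption: 1- to 3-byte sequences only,
// which covers the curly quotes). Anything else becomes U+FFFD.
std::u16string decodeUtf8Bmp(const std::string& bytes) {
    std::u16string out;
    const std::size_t n = bytes.size();
    std::size_t i = 0;
    while (i < n) {
        const unsigned char b0 = static_cast<unsigned char>(bytes[i]);
        // Returns the low 6 bits of the continuation byte at i + k,
        // or -1 if it is missing or not of the form 10xxxxxx.
        auto cont = [&](std::size_t k) -> int {
            if (i + k >= n) return -1;
            const unsigned char c = static_cast<unsigned char>(bytes[i + k]);
            return (c & 0xC0) == 0x80 ? (c & 0x3F) : -1;
        };
        if (b0 < 0x80) {                                   // 0xxxxxxx
            out.push_back(static_cast<char16_t>(b0));
            i += 1;
        } else if ((b0 & 0xE0) == 0xC0 && cont(1) >= 0) {  // 110xxxxx
            out.push_back(static_cast<char16_t>(((b0 & 0x1F) << 6) | cont(1)));
            i += 2;
        } else if ((b0 & 0xF0) == 0xE0 && cont(1) >= 0 && cont(2) >= 0) {  // 1110xxxx
            out.push_back(static_cast<char16_t>(
                ((b0 & 0x0F) << 12) | (cont(1) << 6) | cont(2)));
            i += 3;
        } else {                                           // unsupported or malformed
            out.push_back(static_cast<char16_t>(0xFFFD));
            i += 1;
        }
    }
    return out;
}

// Checks both behaviours on the first bytes of the dump: e2 80 9c 4a ("J").
void checkDumpExample() {
    const std::string bytes = "\xE2\x80\x9C" "J";
    const std::u16string widened = widenBytes(bytes);
    assert(widened.size() == 4 && widened[0] == 0x00E2 && widened[2] == 0x009C);
    const std::u16string decoded = decodeUtf8Bmp(bytes);
    assert(decoded.size() == 2 && decoded[0] == 0x201C && decoded[1] == u'J');
}
```

For the bytes e2 80 9c, widenBytes gives the three units 00E2 0080 009C, as in the dump, while the decoder gives the single unit 201C, the left curly quote. The closing sequence e2 80 9d decodes the same way to 201D.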
I tried opening the file with the CFile::typeUnicode flag instead of CFile::typeText, but that makes things worse: the text is converted to 8-bit ASCII, which is completely unreadable in a Unicode environment.
After the text has been read into an edit box, I can paste the correct text into that same box and it displays correctly, so the problem is strictly in reading the text file, not in displaying it.
Am I doing something wrong, or does this call just not support UTF-8?