

Output Text in Unicode

According to the C standard, the stream output functions only write text as MBCS (multibyte character strings):

n1124.pdf, 7.19.3/12

The wide character output functions convert wide characters to multibyte characters and write them to the stream as if they were written by successive calls to the fputwc function. Each conversion occurs as if by a call to the wcrtomb function, with the conversion state described by the stream’s own mbstate_t object. The byte output functions write characters to the stream as if by successive calls to the fputc function.
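
The conversion described in this paragraph can be sketched as follows. This is only an illustration, not how any particular C runtime implements it: `WideToMultibyte` is a hypothetical helper name, and it collects the bytes in a `std::string` instead of writing them to a stream. It calls `wcrtomb` once per wide character, so the result depends on the current `LC_CTYPE` locale.

```cpp
#include <cassert>
#include <climits>  // MB_LEN_MAX
#include <cwchar>   // std::wcrtomb, std::mbstate_t
#include <string>

// Hypothetical helper: converts a wide string to multibyte characters in the
// current LC_CTYPE locale, one wcrtomb call per wide character.
// Returns false if a character cannot be represented in that locale.
bool WideToMultibyte(const wchar_t *text, std::string &out)
{
    assert(text != nullptr);
    std::mbstate_t state = std::mbstate_t();  // initial conversion state
    char buf[MB_LEN_MAX];                     // large enough for one character
    out.clear();
    for (; *text != L'\0'; ++text)
    {
        const std::size_t n = std::wcrtomb(buf, *text, &state);
        if (n == static_cast<std::size_t>(-1))
            return false;                     // no mapping in this locale
        out.append(buf, n);
    }
    return true;
}
```

In the default "C" locale, plain ASCII text converts byte for byte, while characters outside the locale's character set make the conversion fail. This is why the locale (and therefore the code page) decides which characters can be written.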

However, an MBCS can't mix characters from different code pages. For example, you can't use both Chinese and Japanese text in the same output.

VC provides an extension that lets you output text in Unicode:

#include <cstdio>
#include <wchar.h>   // fwprintf
#include <locale.h>  // setlocale

#include <io.h>
#include <fcntl.h>

void OutputMBCS(FILE *f)
{
    // ".936" selects the code page for Simplified Chinese (GBK).
    // Note that you can't pass ".1200" (the code page for Unicode) to
    // setlocale to get Unicode output.
    setlocale(LC_CTYPE, ".936");
    // My name in Chinese: "范翔"
    // %ls is the standard conversion specifier for a wide string
    fwprintf(f, L"%ls", L"\x8303\x7FD4");

    // The text in the file is encoded in GBK
}

void OutputUnicode(FILE *f)
{
    // _O_U16TEXT (a Microsoft CRT extension) switches the stream to UTF-16 mode
    _setmode(_fileno(f), _O_U16TEXT);
    fwprintf(f, L"%ls", L"\x8303\x7FD4");

    // The text in the file is encoded in Unicode (UTF-16)
}
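
To check the result of OutputUnicode, it can help to know which bytes to expect for the text itself. The sketch below is a portable illustration and does not use the VC extension. `ToUtf16LeBytes` is a hypothetical helper, and it assumes the stream is written as little-endian UTF-16, which is the byte order used on Windows.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: serialises UTF-16 code units as little-endian bytes,
// the layout one would expect for this text in a UTF-16LE file.
std::string ToUtf16LeBytes(const char16_t *text)
{
    assert(text != nullptr);
    std::string bytes;
    for (; *text != u'\0'; ++text)
    {
        const unsigned unit = *text;
        bytes.push_back(static_cast<char>(unit & 0xFF));  // low byte first
        bytes.push_back(static_cast<char>(unit >> 8));    // then high byte
    }
    return bytes;
}
```

For the name used above, `ToUtf16LeBytes(u"\u8303\u7FD4")` yields the four bytes 03 83 D4 7F. A hex dump of the output file can be compared against this sequence.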

For more information, see the following post: https://blogs.msdn.com/michkap/archive/2008/03/18/8306597.aspx