3.1.5.1.1.2 Pseudocode for Mapping a UTF-16 String to a Codepage String

 COMMENT  This algorithm maps a Unicode string encoded in UTF-16 to a 
 string in the specified ANSI codepage. The supported ANSI codepages 
 are limited to those that can be set as the system codepage.  
  
 It requires the following externally specified values:
  
 1) CodePage: An integer value to represent an ANSI codepage value.
  
    If the CodePage value is CP_ACP (0), use the system default ANSI codepage
    from the OS.
    If the CodePage value is CP_OEMCP (1), use the system default OEM codepage
    from the OS.
  
 2) UnicodeString: A string encoded in UTF-16. Every Unicode code point 
    is an unsigned 16-bit ("WORD") value. A surrogate pair is not 
    supported in this algorithm.
  
 3) UnicodeStringLength: The string length in 16-bit ("WORD") units for 
    UnicodeString. When UnicodeStringLength is 0, the length is 
    determined by counting from the beginning of the string to a NULL 
    character (Unicode value U+0000), including the NULL character.
  
 4) MultiByteString: A string encoded in the ANSI codepage. Every 
    character is either one 8-bit (byte) unsigned value or two 8-bit
    unsigned values.
  
 5) MultiByteStringLength: The length in bytes, including
    the byte for the NULL terminator. When MultiByteStringLength is 0, 
    the MultiByteString value is not used in this algorithm.
    Instead, the length of the result string in the ANSI codepage is
    returned.
  
 6) lpDefaultChar
    Optional. Points to the byte to use if a character cannot be represented in  
    the specified codepage. The application sets this parameter to NULL if 
    the function is to use a system default value. The common default value is
    0x3f, which is the ASCII value for the question mark.
  
  
 PROCEDURE WideCharToMultiByteFromCodepageDataFile
  
 IF CodePage is CP_ACP THEN
     COMMENT The Windows operating system keeps a systemwide value for the
     COMMENT default ANSI system codepage. It provides a default system
     COMMENT codepage for legacy ANSI applications.
             
     SET CodePage to the default ANSI system codepage from the Windows 
             operating system.
 ELSE IF CodePage is CP_OEMCP THEN
     COMMENT Windows keeps a systemwide value for the default OEM system
     COMMENT codepage. It provides a default system codepage for legacy
     COMMENT console applications.
             
     SET CodePage to the default OEM system codepage from Windows. 
  
 ENDIF
 
 IF CodePage is CP_UTF8 THEN
     COMMENT For UTF-8 use the algorithm in 3.1.5.1.6
     CALL Utf8ConversionAlgorithm
     RETURN
 ENDIF
 
 
  
 IF UnicodeStringLength is 0 THEN
     COMPUTE UnicodeStringLength as the string length in 16-bit units 
             of UnicodeString as a NULL-terminated string, including
             the NULL terminator.
 ENDIF
  
 IF MultiByteStringLength is 0 THEN
     SET IsCountingOnly to True
 ELSE
     SET IsCountingOnly to False
 ENDIF
  
  
 SET ResultMultiByteLength to 0
 SET CodePageFileName to the concatenation of strings "Bestfit", 
     CodePage as a string, and ".txt"
  
 IF lpDefaultChar is null THEN
     COMMENT No default char is specified by the caller. Read the default
     COMMENT char from CPINFO in the data file
  
     OPEN SECTION CharacterInfo where section name is CPINFO 
     from file with the name of CodePageFileName
     SET lpDefaultChar to CharacterInfo.Field3
 ENDIF
  
  
 OPEN SECTION WideCharMapping where section name is WCTABLE from file 
     with the name of CodePageFileName
  
 FOR each Unicode codepoint UnicodeChar in UnicodeString
      SELECT MappingData from WideCharMapping
             where field 1 matches UnicodeChar
      IF MappingData is null THEN
          COMMENT There is no mapping for this Unicode character, use
          COMMENT the default character
          IF IsCountingOnly is False THEN
              SET MultiByteString[ResultMultiByteLength]
                  to lpDefaultChar
          ENDIF
          INCREMENT ResultMultiByteLength
          CONTINUE FOR loop
      ENDIF
  
      SET MultiByteResult to MappingData.Field2
  
      IF MultiByteResult is less than 256 THEN
           COMMENT This is a single byte result
           IF IsCountingOnly is True THEN
                INCREMENT ResultMultiByteLength
           ELSE
                SET MultiByteString[ResultMultiByteLength]
                    to MultiByteResult
                INCREMENT ResultMultiByteLength
           ENDIF
      ELSE   
           COMMENT This is a double byte result
           IF IsCountingOnly is True THEN
                COMPUTE ResultMultiByteLength as 
                        ResultMultiByteLength plus 2
           ELSE
                SET MultiByteString[ResultMultiByteLength] to
                    MultiByteResult divided by 256
                INCREMENT ResultMultiByteLength
                SET MultiByteString[ResultMultiByteLength] to
                    the remainder of MultiByteResult divided by 256
                INCREMENT ResultMultiByteLength
           ENDIF
      ENDIF
 END FOR
  
 RETURN ResultMultiByteLength as a 32-bit unsigned integer
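
The main loop of the procedure above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the reference implementation: the function name `wide_char_to_multi_byte` and its parameters are hypothetical, the WCTABLE section is assumed to be already loaded into a dictionary that maps a 16-bit value to its codepage value, and the CP_ACP, CP_OEMCP, and CP_UTF8 handling as well as the reading of the data file are left out.

```python
from typing import Dict, List, Tuple


def wide_char_to_multi_byte(
    utf16_units: List[int],
    wc_table: Dict[int, int],
    default_char: int = 0x3F,
    counting_only: bool = False,
) -> Tuple[int, bytes]:
    """Map 16-bit values to codepage bytes, mirroring the FOR loop above.

    wc_table is a hypothetical in-memory form of the WCTABLE section:
    key = 16-bit Unicode value (field 1), value = codepage value (field 2).
    Returns (ResultMultiByteLength, bytes written); the bytes are empty
    when counting_only is True.
    """
    out = bytearray()
    result_length = 0
    for unit in utf16_units:
        mapped = wc_table.get(unit)
        if mapped is None:
            # No mapping for this character: use the default character.
            encoded = bytes([default_char])
        elif mapped < 256:
            # Single-byte result.
            encoded = bytes([mapped])
        else:
            # Double-byte result: lead byte is the quotient, trail byte
            # is the remainder of the division by 256.
            encoded = bytes([mapped // 256, mapped % 256])
        result_length += len(encoded)
        if not counting_only:
            out += encoded
    return result_length, bytes(out)


# Illustrative table with three entries; 0x82A0 is the Shift-JIS
# (codepage 932) value for U+3042.
sample_table = {0x0000: 0x00, 0x0041: 0x41, 0x3042: 0x82A0}
sample_units = [0x0041, 0x3042, 0x20AC, 0x0000]  # U+20AC has no entry here

length, data = wide_char_to_multi_byte(sample_units, sample_table)
print(length)  # 5
print(data)    # b'A\x82\xa0?\x00'
```

Calling the same function with `counting_only=True` returns the same length with an empty byte string, which corresponds to the case where MultiByteStringLength is 0.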