3.1.5.4.6 IDNA2008+UTS46 NormalizeForIdna

NormalizeForIdna prepares the input string for encoding, using the mapping/normalization rules provided by IDNA2008+UTS46 (IDNA2008 with [TR46] applied).<16>

 COMMENT NormalizeForIdna2008
 COMMENT  On Entry:  SourceString – Unicode String to prepare for IDNA
 COMMENT             Flags        - Bit flags to control behavior
 COMMENT                            of IDN validation
 COMMENT
 COMMENT  IDN_ALLOW_UNASSIGNED:     During validation, allow Unicode
 COMMENT                            code points that are not assigned.
 COMMENT
 COMMENT  On Exit:  OutputString - Unicode string containing the mapped
 COMMENT                            form of the input
 PROCEDURE NormalizeForIdna2008 (IN SourceString : Unicode String,
                                 IN Flags: 32 bit integer,
                                 OUT OutputString : Unicode String)
 COMMENT Mapping is done per the tables published by Unicode by following
 COMMENT RFC5892 as modified by UTS#46 section 2 "Unicode IDNA Compatibility Processing"
 COMMENT Appendix A of RFC5892 is NOT applied.
 COMMENT Effectively this mapping is merely applying the latest IdnaMappingTable.txt
 COMMENT mappings, including the "deviation" mappings from http://www.unicode.org/Public/idna/
 COMMENT 
 COMMENT Apply UTS#46 Section 4 steps 1 & 2 to the string with the "Transitional Processing"
 COMMENT option for the four "deviation" characters.  Steps 3 and 4 are done by the caller.
 COMMENT http://www.unicode.org/reports/tr46/#Processing 
 OPEN mapping FILE "http://www.unicode.org/Public/idna/6.3.0/IdnaMappingTable.txt"
 SET OutputString TO "" 
 FOREACH character IN SourceString
     FIND RECORD data IN mapping WHERE LINE CONTAINS character
     IF (data IS EMPTY) THEN
         IF (IDN_ALLOW_UNASSIGNED bit IS NOT ON in Flags) THEN
             RETURN ERROR
         ELSE
             APPEND character TO OutputString
         ENDIF
     ELSE
         SWITCH (data FIELD statusValue)
             CASE "valid"
             CASE "disallowed_STD3_valid"
                 BREAK
             CASE "ignored"
                 SET character TO ""
                 BREAK
             CASE "mapped"
             CASE "disallowed_STD3_mapped"
             CASE "deviation"
                 SET character TO data FIELD mappingValue
                 BREAK
         ENDSWITCH
         APPEND character TO OutputString
     ENDIF
 ENDFOREACH
 RETURN OutputString
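
The procedure above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation behind this specification. The names `parse_table`, `find_record`, and `normalize_for_idna` are hypothetical, the value given to `IDN_ALLOW_UNASSIGNED` is assumed, and `SAMPLE_TABLE` is a short hand-written excerpt in the layout of IdnaMappingTable.txt rather than the published file. The sketch handles `disallowed_STD3_mapped`, a status name used in IdnaMappingTable.txt, together with `mapped` and `deviation`. Statuses with no CASE in the pseudocode leave the character unchanged, as the pseudocode does.

```python
from typing import List, Optional, Tuple

IDN_ALLOW_UNASSIGNED = 0x01  # assumed flag value, for illustration only

# Hand-written sample in the layout of IdnaMappingTable.txt; not the full file.
SAMPLE_TABLE = """\
0041          ; mapped                 ; 0061          # LATIN CAPITAL LETTER A
005F          ; disallowed_STD3_valid                  # LOW LINE
0061..007A    ; valid                                  # LATIN SMALL LETTER A..Z
00AD          ; ignored                                # SOFT HYPHEN
00DF          ; deviation              ; 0073 0073     # LATIN SMALL LETTER SHARP S
200C..200D    ; deviation              ;               # ZWNJ..ZWJ
"""

# (first code point, last code point, status, mapping string)
Record = Tuple[int, int, str, str]


def parse_table(text: str) -> List[Record]:
    """Parse 'range ; status ; mapping # comment' lines into records."""
    records: List[Record] = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop the trailing comment
        if not line:
            continue
        fields = [f.strip() for f in line.split(";")]
        first, _, last = fields[0].partition("..")
        mapping = ""
        if len(fields) > 2:
            # The mapping field is a space-separated list of hex code points.
            mapping = "".join(chr(int(cp, 16)) for cp in fields[2].split())
        records.append((int(first, 16), int(last or first, 16), fields[1], mapping))
    return records


def find_record(records: List[Record], ch: str) -> Optional[Record]:
    """Return the record whose range contains ch, or None if there is none."""
    cp = ord(ch)
    for rec in records:
        if rec[0] <= cp <= rec[1]:
            return rec
    return None


def normalize_for_idna(source: str, flags: int, records: List[Record]) -> str:
    """Apply the per-character mapping loop from the pseudocode."""
    out: List[str] = []
    for ch in source:
        rec = find_record(records, ch)
        if rec is None:
            # No record: an error unless unassigned code points are allowed.
            if not flags & IDN_ALLOW_UNASSIGNED:
                raise ValueError(f"no mapping record for U+{ord(ch):04X}")
            out.append(ch)
            continue
        status, mapping = rec[2], rec[3]
        if status == "ignored":
            ch = ""
        elif status in ("mapped", "disallowed_STD3_mapped", "deviation"):
            ch = mapping
        # "valid", "disallowed_STD3_valid", and any other status: unchanged.
        out.append(ch)
    return "".join(out)
```

With the sample table, `normalize_for_idna("Aß", 0, parse_table(SAMPLE_TABLE))` lowercases the `A` and replaces the deviation character `ß` with `ss`, which corresponds to the "Transitional Processing" option named in the comments above. A linear scan is used only to keep the sketch short; a sorted table with binary search would be the more usual choice for the full file.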