2.2.78 Encoded-String
Encoded-String is a special data type that is the only means of representing strings.
Encoded-String = Encoded-String-Flag *Character Null
Encoded-String-Flag = OCTET
Character = AnsiCharacter / UnicodeCharacter
Null = Character
AnsiCharacter = OCTET
UnicodeCharacter = 2OCTET
An Encoded-String consists of a one-octet encoding flag, followed by a sequence of characters in one of two formats, followed by a null terminator in the same format as the characters.
The Encoded-String-Flag is set to 0x01 if the sequence of characters that follows consists of UTF-16 characters (as specified in [UNICODE]) followed by a UTF-16 null terminator.
For optimization reasons, the implementation MUST compress the UTF-16 encoding whenever the string qualifies. If all the characters in the string have values (as specified in [UNICODE]) that are from 0 to 255, the string MUST be compressed. The compression is done by representing each character as a single OCTET with its Unicode value. That is, for each Unicode character, only the low-order byte is included in the output. A terminating null character MUST be represented by a single OCTET. When the string is compressed, Encoded-String-Flag is set to 0x00. This is distinct from UTF-8, which might contain multiple-byte encodings for single characters.
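The encoding rules above can be sketched as a small Python function. This is a minimal illustration, not a normative implementation: the name `encode_string` is hypothetical, little-endian byte order for the UTF-16 form is an assumption (this section does not state the byte order), and rejecting embedded null characters is a design choice made here because the format is null-terminated.

```python
def encode_string(s: str) -> bytes:
    """Encode s as an Encoded-String (hypothetical helper, not from the spec)."""
    # The format is null-terminated, so an embedded U+0000 would truncate the
    # string on decode. Rejecting it is an assumption of this sketch.
    if "\x00" in s:
        raise ValueError("embedded null cannot be represented")

    # Compressed form: every character is in U+0000..U+00FF, so each one is
    # written as a single octet holding its Unicode value. Python's latin-1
    # codec performs exactly that mapping.
    if all(ord(ch) <= 0xFF for ch in s):
        return b"\x00" + s.encode("latin-1") + b"\x00"

    # Uncompressed form: UTF-16 code units (little-endian assumed) followed
    # by a two-octet null terminator.
    return b"\x01" + s.encode("utf-16-le") + b"\x00\x00"
```

For instance, under these assumptions `encode_string("K")` yields the three octets `00 4B 00`, while a string containing U+03A9 falls back to the UTF-16 form with a leading `01` flag.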
When the string contains characters (as specified in [UNICODE]) outside this range, this optimization MUST NOT be used. For example, the 16-bit UTF-16 value of the character K (which is U+004B) follows.
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 1 | 1 |
The upper 8 bits are all zero. If every character in a string has this property, the string MUST be reduced to its 8-bit equivalent on a character-by-character basis.
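The upper-byte test described here can be checked directly. The following is a small sketch in Python; the helper names `upper_byte_is_zero` and `can_compress` are hypothetical and exist only for illustration.

```python
def upper_byte_is_zero(ch: str) -> bool:
    """True if the character's Unicode value fits in the low 8 bits."""
    return (ord(ch) >> 8) == 0


def can_compress(s: str) -> bool:
    """True if every character in s qualifies for the 8-bit reduction."""
    return all(upper_byte_is_zero(ch) for ch in s)


# The 16-bit pattern for K (U+004B), most significant bit first.
k_bits = format(ord("K"), "016b")
```

Here `k_bits` is `0000000001001011`, so the upper 8 bits are zero and `can_compress("K")` is true. A single character above U+00FF anywhere in the string makes `can_compress` return false for the whole string.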
This compression technique applies to characters in U+0000 through U+00FF and MUST be accompanied by the appropriate Encoded-String-Flag value at the beginning of the encoding.
Within a single CIM object encoding, each string is evaluated independently: a given string might or might not use the optimization, depending solely on which characters are present in that string.
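Because each string carries its own flag, a reader has to branch on the first octet of every Encoded-String. A minimal decoding sketch in Python follows. The name `decode_string` and its `(text, next_offset)` return shape are hypothetical, little-endian byte order for the UTF-16 form is an assumption, and treating any flag other than 0x00 or 0x01 as an error is a choice made for this sketch.

```python
from typing import Tuple


def decode_string(buf: bytes, offset: int = 0) -> Tuple[str, int]:
    """Decode one Encoded-String starting at offset.

    Returns the decoded text and the offset of the first octet after the
    null terminator. Hypothetical helper, not taken from the spec.
    """
    flag = buf[offset]
    pos = offset + 1

    if flag == 0x00:
        # Compressed form: one octet per character, one-octet terminator.
        end = buf.index(b"\x00", pos)  # raises ValueError if unterminated
        return buf[pos:end].decode("latin-1"), end + 1

    if flag == 0x01:
        # UTF-16 form: scan in two-octet steps so that a zero octet inside
        # a code unit (for example 00 01 for U+0100) is not mistaken for
        # part of the terminator.
        end = pos
        while True:
            if end + 2 > len(buf):
                raise ValueError("unterminated UTF-16 Encoded-String")
            if buf[end:end + 2] == b"\x00\x00":
                break
            end += 2
        # Little-endian byte order is an assumption of this sketch.
        return buf[pos:end].decode("utf-16-le"), end + 2

    raise ValueError("unexpected Encoded-String-Flag: 0x%02X" % flag)
```

Returning the next offset lets a caller walk several consecutive strings in one buffer, each of which may use a different flag.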