2.2.78 Encoded-String

Encoded-String is a special data type that is the only means of representing strings.

   Encoded-String = Encoded-String-Flag *Character Null
   Encoded-String-Flag = OCTET
   Character = AnsiCharacter / UnicodeCharacter
   Null = Character
   AnsiCharacter = OCTET
   UnicodeCharacter = 2OCTET

An Encoded-String consists of a one-octet encoding flag, followed by a sequence of characters in one of two formats, followed by a null terminator in the same format.

The Encoded-String-Flag is set to 0x01 if the sequence of characters that follows consists of UTF-16 characters (as specified in [UNICODE]) followed by a UTF-16 null terminator.

For optimization reasons, the implementation MUST compress the UTF-16 encoding whenever every character in the string has a value (as specified in [UNICODE]) from 0 to 255. The compression represents each character as a single OCTET holding its Unicode value; that is, for each Unicode character, only the low-order byte is included in the output. The terminating null character MUST likewise be represented by a single OCTET. When the string is compressed, Encoded-String-Flag is set to 0x00. This format is distinct from UTF-8, which can use multiple-byte encodings for single characters.
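The selection between the two formats can be sketched as follows. This is a minimal illustration, not a definitive implementation: the function name `encode_encoded_string` is hypothetical, little-endian byte order for the UTF-16 form is an assumption (this section does not state a byte order), and the input is assumed to contain no embedded U+0000 character.

```python
def encode_encoded_string(text: str) -> bytes:
    """Encode text as flag octet + characters + null terminator (sketch)."""
    if all(ord(ch) <= 0xFF for ch in text):
        # Compressed form: flag 0x00, one octet per character, one-octet null.
        # Latin-1 maps U+0000..U+00FF to the identical single byte value.
        return b"\x00" + text.encode("latin-1") + b"\x00"
    # Uncompressed form: flag 0x01, UTF-16 code units, two-octet null.
    # ASSUMPTION: little-endian byte order; not stated in this section.
    return b"\x01" + text.encode("utf-16-le") + b"\x00\x00"
```

Under these assumptions, `encode_encoded_string("K")` yields the three octets 0x00 0x4B 0x00, while a string containing any character above U+00FF is emitted in the two-octet form in its entirety.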

When the string contains any character (as specified in [UNICODE]) outside this range, this optimization MUST NOT be used. For example, the 16-bit representation of the character K (U+004B), which lies within the range, follows.

   Bit:    0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
   Value:  0  0  0  0  0  0  0  0  0  1  0  0  1  0  1  1

The upper 8 bits are all zero bits. If all the characters in a string have this property, the string MUST be reduced to its 8-bit equivalent on a character-by-character basis.
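The property described above can be checked per character with a short sketch. The helper name `fits_in_one_octet` is hypothetical and is used here only for illustration.

```python
def fits_in_one_octet(ch: str) -> bool:
    """Return True if the upper 8 bits of the 16-bit value are all zero."""
    return (ord(ch) >> 8) == 0

# 16-bit pattern of the character K (U+004B), most significant bit first.
k_bits = format(ord("K"), "016b")

# The 8-bit equivalent is simply the low-order byte of the value.
k_low_octet = ord("K") & 0xFF
```

Here `k_bits` is `"0000000001001011"` and `k_low_octet` is 0x4B, so K qualifies for the one-octet form.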

This compression technique applies to characters in U+0000 through U+00FF and MUST be accompanied by the appropriate Encoded-String-Flag value at the beginning of the encoding.

For any specified CIM object encoding as a whole, the individual strings might or might not use the optimization, depending precisely on which characters are present in the string.
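Because each string carries its own flag, a reader has to inspect the flag of every string individually. A minimal decoding sketch follows; the function name `decode_encoded_string` and its offset-based interface are hypothetical, and little-endian byte order for the UTF-16 form is an assumption not stated in this section.

```python
from typing import Tuple

def decode_encoded_string(buf: bytes, offset: int = 0) -> Tuple[str, int]:
    """Decode one Encoded-String at offset; return (text, next_offset)."""
    flag = buf[offset]
    pos = offset + 1
    if flag == 0x00:
        # Compressed form: one octet per character, single-octet terminator.
        end = buf.index(b"\x00", pos)  # raises ValueError if missing
        return buf[pos:end].decode("latin-1"), end + 1
    if flag == 0x01:
        # UTF-16 form: scan two octets at a time for the 0x0000 terminator.
        end = pos
        while buf[end:end + 2] != b"\x00\x00":
            if end + 2 > len(buf):
                raise ValueError("missing UTF-16 null terminator")
            end += 2
        # ASSUMPTION: little-endian byte order; not stated in this section.
        return buf[pos:end].decode("utf-16-le"), end + 2
    raise ValueError(f"unknown Encoded-String-Flag: {flag:#04x}")
```

Returning the next offset lets a caller walk consecutive strings in a larger buffer, some compressed and some not, without any out-of-band knowledge of which form each one uses.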