Encoding Cascading Style Sheet Strings
RV here...
Cascading Style Sheets (CSS) give developers a way to change the look and feel of a website. That flexibility also gives malicious users an opportunity to alter the UI whenever an application places dynamic data inside style tags or HTML style attributes. In addition, keywords such as expression can be used to run JavaScript, resulting in cross-site scripting (XSS) attacks. Although IE8 blocks the use of the expression keyword, it is still allowed in IE7 mode.
To ensure that input is not executed or interpreted by the browser as style content, you as a developer need to escape or encode the input in the proper format. A CSS character escape sequence consists of a backslash character (\) followed by one to six hexadecimal digits that represent a character code from the ISO 10646 standard (which is equivalent to Unicode, with some exceptions). Any character other than a hexadecimal digit terminates the escape sequence. If the character that follows the escape sequence is itself a valid hexadecimal digit, the escape must either use all six digits or be terminated with a whitespace character. Unfortunately the CSS specification does not say much about this; there is only a vague mention in the appendix portion of the CSS 2.1 specification.
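The escape rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Web Protection Library implementation: the function name css_encode, the choice to pass ASCII letters and digits through unchanged, and the four-digit minimum padding are all assumptions made for the example.

```python
import string

# Characters that end up inside an escape if they directly follow one.
HEX_DIGITS = set(string.hexdigits)

# Assumption for this sketch: ASCII letters and digits pass through unencoded.
SAFE_CHARS = set(string.ascii_letters + string.digits)


def css_encode(text: str) -> str:
    """Escape every character outside SAFE_CHARS as a CSS backslash escape."""
    out = []
    for i, ch in enumerate(text):
        if ch in SAFE_CHARS:
            out.append(ch)
            continue
        # Backslash followed by the code point in hex, at least four digits.
        digits = "{:04x}".format(ord(ch))
        escape = "\\" + digits
        next_ch = text[i + 1] if i + 1 < len(text) else ""
        # With fewer than six digits, a following hex digit would be read as
        # part of the escape, so terminate the escape with a single space.
        if len(digits) < 6 and next_ch in HEX_DIGITS:
            escape += " "
        out.append(escape)
    return "".join(out)


if __name__ == "__main__":
    print(css_encode("<b>"))
    print(css_encode("a;x"))
```

In the first call the character after the escaped < is b, which is a hexadecimal digit, so the sketch adds a terminating space. In the second call the x that follows the escaped semicolon is not a hexadecimal digit and no terminator is needed.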
We also discovered in our tests that the UTF-8 byte sequence of a higher Unicode character does not produce a valid value for the escape, whereas UTF-16 big-endian encoding provides the bytes in the proper order. The next version of the Web Protection Library adds a new method called Encoder.CssEncode, which performs the CSS encoding described above using the same comprehensive whitelist as the Anti-XSS Library. The following are some examples of proper CSS encoding.
Input | Encoded Input |
A | \0041 |
(space) | \0020 |
ɐ | \0250 |
< | \003c |
> | \003e |
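The difference between UTF-8 and UTF-16 big-endian mentioned earlier can be checked with a short Python sketch. It prints, for a few sample characters, the code point as a four-digit escape next to the hexadecimal bytes of each encoding. The helper name hex_bytes is made up for this example. Note that the UTF-16 big-endian bytes match the code point only for characters in the Basic Multilingual Plane; characters above U+FFFF are stored as surrogate pairs.

```python
def hex_bytes(text: str, encoding: str) -> str:
    """Return the encoded bytes of text as a lowercase hex string."""
    return text.encode(encoding).hex()


if __name__ == "__main__":
    for ch in ["A", " ", "\u0250", "<", ">"]:
        escape = "\\{:04x}".format(ord(ch))  # backslash + code point in hex
        print(
            repr(ch),
            escape,
            "utf-8:", hex_bytes(ch, "utf-8"),
            "utf-16-be:", hex_bytes(ch, "utf-16-be"),
        )
```

For ASCII characters the UTF-8 byte already equals the code point, so the problem only shows up once a character needs more than one UTF-8 byte, as ɐ (U+0250) does.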
Keep checking the blog for more posts on new features in WPL.
Thanks
RV