To Trust, or Not to Trust?
Validating an XML document against an XML Schema guarantees that the structure and content of the document conform to the types defined in the schema. Does this mean that we should automatically elevate the trust level of a document that has passed schema validation? Can we use schema validation as a security layer for our application?
The general recommendation is that validating an XML document does not remove the need for secure coding practices in the application that consumes the validated data. Even so, we know of applications that use length facets to ensure that an input parameter is no longer than a specified length, and pattern facets to verify that the input does not pose a risk of SQL or command injection.
W3C XML Schema is a complicated specification that is open to a lot of interpretation, and we have not yet reached a stage where all schema processors are 100% compatible. Consider the case where the regular expression implementation in a particular schema processor differs from the one specified in the XSD specification. Suddenly, the pattern facet that is supposed to protect the application from injection attacks is no longer safe.
If you are among the people who answered yes to the questions at the beginning of this article, read on for ways to tighten the security of a validation episode using the XmlSchemaValidationFlags enumeration in the System.Xml.Schema namespace of the .NET Framework 2.0.
XmlSchemaValidationFlags Explained
XmlSchemaValidationFlags was introduced in .NET Framework 2.0 to mitigate security threats and improve interoperability when performing schema validation with the validating XmlReader or the XmlSchemaValidator.
The enumeration has the following values:
Enum Value | Description | XmlReaderSettings Default
None | Identity constraints, schema location hints, inline schemas and validation warnings will all be ignored |
ProcessIdentityConstraints | Perform validation for xs:ID, xs:IDREF, xs:key, xs:keyref, xs:unique | Yes
ProcessInlineSchema | Load any inline schemas in the XML instance being validated and add the schema for validation of subsequent XML nodes | No
ProcessSchemaLocation | Load schemas by following the location hints specified in xsi:schemaLocation and xsi:noNamespaceSchemaLocation attributes and use the schemas for validation of subsequent XML nodes | No
ReportValidationWarnings | Report any warnings encountered during the validation of the XML instance | No
AllowXmlAttributes | Allow xml:* attributes even if they are not defined in the schema. The attributes will be validated based on their data type | Yes
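As a rough illustration of how these values are used, the sketch below reads the flags from a new XmlReaderSettings and then sets and clears individual values with bitwise operators. The class and method names (ValidationFlagsBasics, DefaultFlags, IsSet) are made up for this example and are not part of the framework.

```csharp
using System;
using System.Xml;
using System.Xml.Schema;

public static class ValidationFlagsBasics
{
    // Returns the flags a freshly created XmlReaderSettings starts with.
    public static XmlSchemaValidationFlags DefaultFlags()
    {
        return new XmlReaderSettings().ValidationFlags;
    }

    // True when every bit of 'flag' is set in 'flags'.
    public static bool IsSet(XmlSchemaValidationFlags flags, XmlSchemaValidationFlags flag)
    {
        return (flags & flag) == flag;
    }

    public static void Main()
    {
        XmlReaderSettings settings = new XmlReaderSettings();
        settings.ValidationType = ValidationType.Schema; // the flags apply to XSD validation

        // Show the starting value (see the "XmlReaderSettings Default" column).
        Console.WriteLine(settings.ValidationFlags);

        // The enumeration is a bit field: OR a value in to set it,
        // AND with the complement to clear it.
        settings.ValidationFlags |= XmlSchemaValidationFlags.ReportValidationWarnings;
        settings.ValidationFlags &= ~XmlSchemaValidationFlags.AllowXmlAttributes;

        Console.WriteLine(IsSet(settings.ValidationFlags,
            XmlSchemaValidationFlags.ReportValidationWarnings));
    }
}
```

Which values to set or clear depends on the application; the combination above is only there to show the mechanics.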
Security Implication of XmlSchemaValidationFlags
· DO TURN ON the ReportValidationWarnings flag
By default, this flag is not turned on when a validating XmlReader is created through XmlReaderSettings. This was done so that users can perform partial validation of an XML instance without having to deal with a large number of warnings for the portions that don't have a schema, for example when validating a WordML document with user content against the user's schema. Another reason was performance, as every warning entails the creation of an exception object.
Consider the following example of an order schema and an instance, order.xml:
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" targetNamespace="https://tempuri/Orders.org" xmlns="https://tempuri/Orders.org" elementFormDefault="qualified">
  <xsd:element name="order">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="orderid" type="xsd:int"/>
        <xsd:element name="item" type="xsd:string" maxOccurs="unbounded"/>
        <xsd:element name="address" type="xsd:string"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>

<order xmlns="https://tempuri/Order.org">
  <orderid>A100</orderid>
  <item>umbrella</item>
  <address>1234 wallaby way</address>
</order>
The <orderid> element is invalid according to the schema (A100 is not a valid xsd:int), but validation using a validating XmlReader with the default settings returns without any errors. This happens because the namespace in the XML and the namespace in the schema do not match (https://tempuri/Order.org vs. https://tempuri/Orders.org), and strict schema validation occurs only after finding the schema definition for an element whose name AND namespace match a definition in the schema (see Schema-Validity Assessment (Element) in the XSD specification).
In this case, the validator raises only a warning that schema information could not be found because of the namespace mismatch, and since warnings are turned OFF by default, the user sees no evidence that validation did not happen. If the flag is turned ON, the user should see the following warning:
Could not find schema information for the element 'https://tempuri/Order.org:order'.
An Error Occurred at: file:///E:/bugrepro/order.xml, (1,2)
Note: Warnings are reported when this flag is turned ON AND a validation event handler is hooked up (to XmlReaderSettings, XmlDocument or XmlSchemaValidator).
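The scenario can be reproduced with a small program along the following lines. This is a sketch, not the original repro: the class name OrderValidationDemo and the Validate helper are invented for the example, the schema and instance are embedded as strings (with single-quoted attributes) rather than loaded from files, and the XSD namespace is written as http://www.w3.org/2001/XMLSchema.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Schema;

public static class OrderValidationDemo
{
    // Order schema; its target namespace ends in "Orders.org".
    const string Xsd =
        @"<xsd:schema xmlns:xsd='http://www.w3.org/2001/XMLSchema'
              targetNamespace='https://tempuri/Orders.org'
              xmlns='https://tempuri/Orders.org' elementFormDefault='qualified'>
            <xsd:element name='order'>
              <xsd:complexType>
                <xsd:sequence>
                  <xsd:element name='orderid' type='xsd:int'/>
                  <xsd:element name='item' type='xsd:string' maxOccurs='unbounded'/>
                  <xsd:element name='address' type='xsd:string'/>
                </xsd:sequence>
              </xsd:complexType>
            </xsd:element>
          </xsd:schema>";

    // Instance; its namespace ends in "Order.org" (no "s"), so it does not match.
    const string Xml =
        @"<order xmlns='https://tempuri/Order.org'>
            <orderid>A100</orderid>
            <item>umbrella</item>
            <address>1234 wallaby way</address>
          </order>";

    // Validates the instance and returns every message raised by the validator.
    public static List<string> Validate(bool reportWarnings)
    {
        List<string> messages = new List<string>();

        XmlReaderSettings settings = new XmlReaderSettings();
        settings.ValidationType = ValidationType.Schema;
        settings.Schemas.Add(null, XmlReader.Create(new StringReader(Xsd)));
        if (reportWarnings)
            settings.ValidationFlags |= XmlSchemaValidationFlags.ReportValidationWarnings;

        // Without a handler, warnings are not surfaced even when the flag is on.
        settings.ValidationEventHandler += delegate(object sender, ValidationEventArgs e)
        {
            messages.Add(e.Severity + ": " + e.Message);
        };

        using (XmlReader reader = XmlReader.Create(new StringReader(Xml), settings))
        {
            while (reader.Read()) { } // reading the document drives validation
        }
        return messages;
    }

    public static void Main()
    {
        Console.WriteLine("Default flags: {0} message(s)", Validate(false).Count);
        foreach (string message in Validate(true))
            Console.WriteLine(message);
    }
}
```

The only difference between the two runs is the ReportValidationWarnings flag; the schema set and the handler are identical.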
· DO NOT TURN ON the ProcessSchemaLocation flag
If this flag is turned ON, the validation engine follows the schema location hints in the XML document (the xsi:schemaLocation and xsi:noNamespaceSchemaLocation attributes) using the default XmlUrlResolver, unless the XmlResolver property is explicitly set to null or to a secure resolver, in which case that setting takes precedence.
The default resolver does not protect against cross-zone redirection, and schemas added at validation time by the instance document might change the validation outcome by adding new types, redefining existing types, and so on.
· DO NOT TURN ON the ProcessInlineSchema flag
In addition to allowing an inline schema in the XML document to add new types and redefine existing ones, this flag poses a threat to users who depend on strict validation to map their XML into objects: any element can now contain a whole schema as its child node, which might cause unexpected errors in the XML-to-object (X-O) mapping.
· MAY TURN ON the AllowXmlAttributes flag
Turn this flag on only if allowing attributes from the xml namespace (on any or all elements in the instance, even though the schema does not specifically allow them) does not pose a risk to your application.
· MAY TURN ON the ProcessIdentityConstraints flag
Turn this flag on if processing of xs:key, xs:keyref, xs:ID and xs:IDREF is important to your application and you have determined that the scope of the key/keyref is not large enough to enable a Denial of Service attack.
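Taken together, the guidelines in this list could translate into settings roughly like the sketch below. The names HardenedValidation, CreateSettings and IsClean are hypothetical. Treating any warning as a failure, keeping ProcessIdentityConstraints on and leaving AllowXmlAttributes off are assumptions made for this example, not requirements; adjust them to your application.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Schema;

public static class HardenedValidation
{
    // Builds settings that validate only against schemas the application supplies.
    public static XmlReaderSettings CreateSettings(XmlSchemaSet trustedSchemas)
    {
        XmlReaderSettings settings = new XmlReaderSettings();
        settings.ValidationType = ValidationType.Schema;
        settings.Schemas = trustedSchemas;
        settings.XmlResolver = null; // do not resolve external resources

        // ProcessSchemaLocation and ProcessInlineSchema are deliberately left out.
        settings.ValidationFlags =
            XmlSchemaValidationFlags.ReportValidationWarnings |
            XmlSchemaValidationFlags.ProcessIdentityConstraints;
        return settings;
    }

    // Returns true only if validation raised neither errors nor warnings.
    public static bool IsClean(string xml, XmlSchemaSet trustedSchemas)
    {
        bool clean = true;
        XmlReaderSettings settings = CreateSettings(trustedSchemas);
        settings.ValidationEventHandler += delegate(object sender, ValidationEventArgs e)
        {
            clean = false; // assumption: a warning is as disqualifying as an error
        };

        using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings))
        {
            while (reader.Read()) { }
        }
        return clean;
    }

    public static void Main()
    {
        // Tiny illustrative schema: a single <a> element holding an integer.
        XmlSchemaSet schemas = new XmlSchemaSet();
        schemas.Add(null, XmlReader.Create(new StringReader(
            "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>" +
            "<xs:element name='a' type='xs:int'/></xs:schema>")));

        Console.WriteLine(IsClean("<a>1</a>", schemas));
        Console.WriteLine(IsClean("<a>x</a>", schemas));
        Console.WriteLine(IsClean("<b/>", schemas));
    }
}
```

Failing on warnings means that a document whose elements have no matching schema definition is rejected rather than silently accepted, which is the situation described for order.xml earlier.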
All of the above are merely guidelines for a secure validation episode using the System.Xml.Schema namespace. The flags you choose depend heavily on your application's needs, but keep the security implications of validated data in mind the next time you are tempted to trust such data completely.
Hope you enjoyed reading this and I greatly appreciate your feedback.
-Priya Lakshminarayanan
Comments
Anonymous
April 17, 2007
hi i ran into an issue where i had created a small, but beloved program to log onto a site, get some data, and then log off. how hard can that be, right? well, it seemed like it took me a long time to get it right but i finally got it working - so long as (and this is the point of my post), apparently, the OS is XP SP1 or lower and IE 6 SP1 or lower. the exact particulars of the code are proprietary, but here is a good replica to give you an idea of the issue:

Dim objHTTP As Object
Set objHTTP = CreateObject("MSXML2.XMLHTTP.6.0")
usrnme = "guest"
pwrd = "guest"
'get login form contents ===========
logfl = "https://thelinkfromthewebform.com"
objHTTP.Open "POST", logfl, False
objHTTP.setRequestHeader "Keep-Alive", "300"
objHTTP.setRequestHeader "Connection", "Keep-Alive"
objHTTP.setRequestHeader "Content-Type", "application/x-www-form-urlencoded"
submurl = "formdetails=etcectect&usrname=" & usrnme & "&Passwd=" & pwrd & "null=Sign+in"
objHTTP.send (submurl)
strRequest = objHTTP.responseText
objHTTP.Open "GET", "http://thenextpageatthe siteafterlogin.com", False
'and this is where it keeps bugging out since i installed XP SP2 - note: all of my clients who have XP SP2 and higher report the same issue
objHTTP.send

this code works perfectly with XP SP1 or lower, but for some d*** reason it bugs out after the log in.

i have uploaded and installed all the msxml stuff (my download folder is overflowing), including msxml6. i have tried:

CreateObject("Microsoft.XMLHTTP")
CreateObject("MSXML2.XMLHTTP")
CreateObject("MSXML2.XMLHTTP.2.0")
CreateObject("MSXML2.XMLHTTP.2.6")
CreateObject("MSXML2.XMLHTTP.4.0")
CreateObject("MSXML2.XMLHTTP.5.0")
CreateObject("MSXML2.XMLHTTP.6.0")

and i get the same result. i tried all the various combinations of SSL2, SSL3 and TLS1 in the Advanced section of IE6 (now SP2) so my previously working program with the exact same code can get back to working like it did before, but the thing is in the program sans XP SP2 (and higher) i never had to make any adjustments at all to any of the IE6 settings - it just worked.

i am not a code guru, but i do recall i worked slavishly in that program to get it just right, and it was very sweet to have it working so well for so long. not being a code guru, my thing is not getting into the deep end on code. i build stuff to serve a purpose, but once it is working i need to get on with other things. so it is REALLY a d-r-a-g when something that was working FINE before i upgraded (haha) to SP2 would now be broken for the sake of SP2, leaving me with the dreaded and now ongoing task of getting the very simple code to work again.

i am posting here because you seem like you know all there is to know about this stuff and i would be really grateful if you happened to know right what the new syntax is, or what the tweak is so my code can get back to doing what it was doing before, i.e. simply logging onto a site, moving to a page in the secure area, getting some data, and then logging off. why o why can't MS leave a good thing alone?!?!? this is how it looks from someone in the outside world. we know enough to get something to work, but when uncle bill comes along and toys with the foundation, well, that is what brings 'us' out of the woodwork - unfortunately - to post to a place like this.
i thought my day were over having to ask questions! anyway, if you could tell me how to get the code to fire using XP SP2 - in other words IE SP2 and higher, i'd be grateful