Message Versioning
Message Versioning focuses on the data that comprises the messages created and consumed by a service. For the purposes of this paper, Message Versioning will include both versioning (forwards and backwards compatibility) and extensibility. Versioning permanently expands the base vocabulary for future implementations while extensions temporarily extend the base vocabulary for a specific implementation.
Most message versioning requirements will fall into one of the following categories (assuming the use of doc-literal services):
· New Documents – Most industry-standard XML vocabularies define a set of “core” document types such as Purchase Orders, Advanced Ship Notices and Invoices. Since it is impossible to define every possible type of document a given organization may need, the vocabulary must be extensible, enabling new documents and data structures to be added. For example, RosettaNet has been steadily adding support for new Partner Interface Processes (PIPs) since 1999.
· Extending Existing Data Constructs – Existing constructs are extended to better reflect the needs of the organization. For example, a Common Alerting Protocol Amber Alert can add jurisdictional information for alerts spanning multiple states.
· Message Enhancements – New versions of industry-standard vocabularies are frequently issued to add new messages and schemas or change the underlying extensibility model.
Message Versioning builds upon general XML versioning and extensibility guidelines. Several common techniques are available for XML versioning and extensibility:
· Namespaces
· Extension Elements
· Custom version attributes
Namespaces and Extension Elements
XML Namespaces serve two main purposes:
· Ensure uniqueness of XML element and attribute names (eliminate name collisions)
· Provide a URI used to specify the language associated with a given XML element or attribute.
XML Namespaces are frequently used to communicate versioning information for the associated vocabulary. For example, the snippet of markup below illustrates how the Global Justice XML Data Model appends versioning information to the end of the namespace URI, indicating that we are working with version 3.0 of the GJXDM schema:
<?xml version="1.0" encoding="UTF-8"?>
<!-- The trailing "3.0" in the jxdm namespace URI identifies the GJXDM version -->
<xsd:schema
    targetNamespace="https://www.it.ojp.gov/jxdm/reg/1.1"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:reg="https://www.it.ojp.gov/jxdm/reg/1.1"
    xmlns:jxdm="https://www.it.ojp.gov/jxdm/3.0">
  <xsd:import
      namespace="https://www.it.ojp.gov/jxdm/3.0"
      schemaLocation="../jxdm/3.0/jxdm.xsd"/>
  <xsd:element name="Registration">
  . . .
</xsd:schema>
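A consumer can recover the version from a namespace URI that follows this convention. The sketch below is illustrative only: it uses Python's standard library, the Person and PersonName elements in the sample instance are hypothetical, and it assumes the version is always the final path segment of the namespace URI.

```python
import xml.etree.ElementTree as ET

# Hypothetical instance; only the namespace URI is taken from the schema above.
INSTANCE = """<jxdm:Person xmlns:jxdm="https://www.it.ojp.gov/jxdm/3.0">
  <jxdm:PersonName>Jane Doe</jxdm:PersonName>
</jxdm:Person>"""


def split_qualified_tag(tag):
    """Split ElementTree's '{namespace}local' form into (namespace, local)."""
    if tag.startswith("{"):
        namespace, _, local = tag[1:].partition("}")
        return namespace, local
    return "", tag


def version_from_namespace(namespace):
    """Assumption: the version is the last path segment of the namespace URI."""
    return namespace.rstrip("/").rsplit("/", 1)[-1]


root = ET.fromstring(INSTANCE)
namespace, local_name = split_qualified_tag(root.tag)
version = version_from_namespace(namespace)
print(local_name, version)
```

Parsing the version out of the URI is a convenience. A consumer that supports only a fixed set of versions could simply compare the whole namespace URI against the values it knows.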
There are several options for versioning or extending a schema using XML Namespaces:
1. Use a new XML Namespace for major version releases – The GJXDM example above illustrates this option. Changing the targetNamespace is a breaking change: existing XML instances will not validate successfully until they are changed to use the new targetNamespace value. Because it is a breaking change, it should be reserved for major versioning changes in the underlying vocabulary. Note also that if you are using XML serialization, the serializer will generate the appropriate types for you based on the namespace.
2. Keep XML Namespace values constant and add an XML Schema version attribute - The XML Schema specification allows an optional version attribute on the schema declaration. The advantage of this approach is that it is easy to implement and is fully supported by the XML Schema standard. The impact upon XML instances is fairly minimal since the namespace remains unchanged. There are two disadvantages with this approach:
o XML schema validation tools are not required to validate instances using the version attribute – the attribute is provided purely for documentation purposes and is not enforceable by XML parsers.
o Since XML parsers are not required to validate using the version attribute, additional custom processing (over and above parsing and validation) is required to ensure that the expected schema version(s) are being referenced by the instance.
3. Keep XML Namespace values constant and add a special element for grouping custom extensions – This approach wraps extensions to the underlying vocabulary within a special “extension” element. This technique is favored by several industry-standard schemas. For example, the Open Applications Group’s Business Object Documents (OAG BODs) include a <UserArea> element for custom information that is not part of the base vocabulary. This approach enables us to reuse the same namespace, minimizing the impact upon users of the schema while maximizing the extensibility of the schema constructs (schemas can be both forward and backward compatible). There are two disadvantages to this approach:
o This approach introduces significantly higher levels of complexity into the schema (anyone who has worked with OAGIS can attest to the complexity of the schemas).
o This approach is fairly limited – there is no way to implement multiple extensions across different portions of the XML instance since all extensions must be grouped within the extension “wrapper”.
This third approach may be a bit confusing since it blurs the line between versioning and extensions. Additional guidelines for designing extensible schemas are presented below (see the “Designing for Extensibility” section).
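To make the extension-wrapper option more concrete, the following sketch shows how a consumer might read the core fields of a message and treat everything inside the wrapper as optional. All names here are assumptions made for illustration: the Order and OrderId elements, both namespaces, and the LoyaltyTier extension are hypothetical, and the wrapper is only loosely modeled on the OAG user area.

```python
import xml.etree.ElementTree as ET

BASE_NS = "urn:example:orders"      # hypothetical base vocabulary namespace
EXT_NS = "urn:example:partner-ext"  # hypothetical namespace owned by the extender

INSTANCE = f"""<Order xmlns="{BASE_NS}">
  <OrderId>PO-1001</OrderId>
  <UserArea>
    <ext:LoyaltyTier xmlns:ext="{EXT_NS}">gold</ext:LoyaltyTier>
  </UserArea>
</Order>"""


def read_order(xml_text):
    """Return the core order id plus any extensions found in the wrapper."""
    root = ET.fromstring(xml_text)
    order_id = root.findtext(f"{{{BASE_NS}}}OrderId")
    extensions = {}
    user_area = root.find(f"{{{BASE_NS}}}UserArea")
    if user_area is not None:
        # Extensions are keyed by their namespace-qualified tag name.
        for child in user_area:
            extensions[child.tag] = (child.text or "").strip()
    return order_id, extensions


order_id, extensions = read_order(INSTANCE)
print(order_id)
print(extensions)
```

A consumer that does not understand a given extension can ignore the wrapper entirely, which is what keeps the base namespace stable under this option.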
Custom Version attributes
Since XML parsers are not required to validate instances using the schema version attribute, developers may decide to implement their own representation of version, enabling the parser to include it in the validation process. Developers adopting this technique typically make their versioning attribute a fixed, required value that identifies a specific schema version. In the simple schema below, instances must set the xsdVersion attribute on the root Demo element to “1.0” or they will be invalid.
<xs:schema xmlns="https://www.contoso.org"
    targetNamespace="https://www.contoso.org"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    elementFormDefault="qualified"
    attributeFormDefault="unqualified">
  <xs:element name="Demo">
    <xs:complexType>
      ...
      <!-- fixed + required: validated instances must carry xsdVersion="1.0" -->
      <xs:attribute name="xsdVersion"
          type="xs:decimal" use="required"
          fixed="1.0"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
The advantage of this approach is that it is relatively easy to implement and enables us to work around the fact that the schema version attribute is not included in the validation process. There are three disadvantages to this approach:
· An XML instance will be unable to use multiple versions of a schema representation because versioning occurs at the schema’s root.
· This approach assumes that all instances will be validated, since validation is what enforces the fixed xsdVersion value of “1.0”. This is an invalid assumption: some organizations may decide to turn off schema validation for performance reasons. Organizations that do so may experience unexpected issues (the least of which may be accepting invalid XML instances) and must be fully aware of the consequences, especially since schemas are capable of far more than simple structure and content validation.
· The XML serialization process in the .NET Framework (1.x) is highly optimized and will not serialize attribute values unless they differ from the default value specified in the schema. This means that the value of xsdVersion will not appear in the PSVI (post schema validation infoset) unless a service consumer sets it to a value other than “1.0” (which defeats the purpose of the attribute in the first place).
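When schema validation is turned off, the version check has to be performed in application code. The sketch below shows one way this might be done with Python's standard library. It assumes the Demo element and xsdVersion attribute from the schema above; the set of supported versions and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal, InvalidOperation

EXPECTED_ROOT = "{https://www.contoso.org}Demo"  # namespace and element from the schema above
SUPPORTED_VERSIONS = {Decimal("1.0")}            # assumption: versions this consumer accepts


def check_xsd_version(xml_text):
    """Return the declared xsdVersion, or raise ValueError if missing or unsupported."""
    root = ET.fromstring(xml_text)
    if root.tag != EXPECTED_ROOT:
        raise ValueError(f"unexpected root element: {root.tag}")
    raw = root.get("xsdVersion")  # unqualified attribute, so no namespace in the key
    if raw is None:
        raise ValueError("xsdVersion attribute is missing")
    try:
        version = Decimal(raw.strip())  # compare as decimals so "1.00" equals "1.0"
    except InvalidOperation:
        raise ValueError(f"xsdVersion is not a decimal: {raw!r}")
    if version not in SUPPORTED_VERSIONS:
        raise ValueError(f"unsupported xsdVersion: {raw}")
    return version


def is_supported(xml_text):
    """Convenience wrapper that reports the check as a boolean."""
    try:
        check_xsd_version(xml_text)
        return True
    except ValueError:
        return False


print(is_supported('<Demo xmlns="https://www.contoso.org" xsdVersion="1.0"/>'))
print(is_supported('<Demo xmlns="https://www.contoso.org" xsdVersion="2.0"/>'))
print(is_supported('<Demo xmlns="https://www.contoso.org"/>'))
```

A check like this covers only the version attribute. It does not replace the structural and content checks that schema validation would otherwise perform.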
Given the options presented above, the best choice for communicating major version releases is to use the targetNamespace of the schema.
Contract-First to Design a Flexible Service Interface
As noted above, there is a significant difference between versioning and extending an XML schema. Versioning implies that a permanent change has been implemented while extensions modify the vocabulary for one or more specific implementations.
Service contracts should be designed with the assumption that once published, they cannot be modified - this approach forces developers to build flexibility into their schema designs. One way to ensure that the service contract remains flexible is to adopt a Contract-First approach to service development. With Contract-First you define your service contract before developing the service itself. The contract can then be used to generate the actual service code. Contract-First is a proven approach for reducing the barriers to interoperability since it builds upon decades of experience (CORBA, COM and DCE all used interface languages and encouraged a Contract-First approach to development). Many development environments have added simple support for Contract-First while tools such as thinktecture’s WSCF and the GotDotNet XSD Object Code Generator help to further automate this process. Regardless of the development approach you utilize for service development there is no question that service contracts must be designed in an extensible manner to minimize disruptive versioning changes.
Judicious use of <xsd:any>
The XML Schema standard introduces <xsd:any> as a wildcard element. <xsd:any> enables schemas to be extended in a well-defined manner. <xsd:any> includes a namespace attribute that either constrains or extends the range of elements that might appear within the wildcard. Commonly used values for the namespace attribute include the following:
· ##any enables the use of elements from any Namespace to extend the schema
· ##targetNamespace restricts wildcards to the elements that appear within the targetNamespace
· ##other restricts wildcards to elements from namespaces other than the targetNamespace, making it illegal to extend the schema using elements from the targetNamespace
The processContents attribute dictates how schema extensions should be validated by the parser:
· strict requires the parser to validate all schema extensions
· skip turns off validation for schema extensions
· lax validates elements for which a schema declaration is available and ignores unknown or unexpected elements (most Web services specifications use lax)
The example below illustrates the use of <xsd:any> to enable an extensible definition of name:
<xs:complexType name="name">
  <xs:sequence>
    <xs:element name="first" type="xs:string"/>
    <xs:element name="last" type="xs:string"/>
    <xs:any namespace="##any"
        processContents="lax"
        minOccurs="0"
        maxOccurs="unbounded"/>
  </xs:sequence>
</xs:complexType>
The example above would enable the XML instance to add additional constructs after the last name (e.g. John Smith III, John Smith PhD, etc) while remaining valid based on the schema definition.
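On the consuming side, a service can read the elements it knows and carry along whatever the wildcard admitted. The sketch below illustrates this with Python's standard library. It assumes a name element of the type defined above, declared in a qualified namespace; the namespaces and the suffix extension element are hypothetical.

```python
import xml.etree.ElementTree as ET

BASE_NS = "https://www.contoso.org"  # assumed namespace for the name type
EXT_NS = "urn:example:name-ext"      # hypothetical extension namespace

INSTANCE = f"""<name xmlns="{BASE_NS}" xmlns:ext="{EXT_NS}">
  <first>John</first>
  <last>Smith</last>
  <ext:suffix>III</ext:suffix>
</name>"""

# Namespace-qualified tags the consumer understands.
KNOWN_TAGS = {f"{{{BASE_NS}}}first", f"{{{BASE_NS}}}last"}


def parse_name(xml_text):
    """Return (first, last, extras); extras holds elements admitted by the wildcard."""
    root = ET.fromstring(xml_text)
    first = root.findtext(f"{{{BASE_NS}}}first")
    last = root.findtext(f"{{{BASE_NS}}}last")
    extras = [(child.tag, child.text) for child in root if child.tag not in KNOWN_TAGS]
    return first, last, extras


first, last, extras = parse_name(INSTANCE)
print(first, last)
print(extras)
```

Keeping the unknown elements, instead of discarding them, lets an intermediary pass extensions through to consumers that do understand them.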
While <xsd:any> can be used to extend the schema, there are some general guidelines to keep in mind when designing extensible schemas with <xsd:any>:
· Abuse of <xsd:any> – Overuse of <xsd:any> may negate the value of using XML Schema: if everything is a wildcard you might as well revert to using DTDs. Add <xsd:any> to data structures that will require extensibility (e.g. name, address, others). Avoid simply adding <xsd:any> to the end of your schemas to ensure extensibility. <xsd:any> can help delay versioning but should not be treated as a substitute for it. Schemas should be designed for extensibility, not to avoid versioning.
· Non-determinism – XML was designed to reject ambiguous content models. Non-determinism occurs when a parser is unable to determine, without looking ahead, which part of the content model an element matches. For example, “((a, b) | (a, c))” is an ambiguous content model because, on reading “a”, the parser cannot tell which branch applies until it sees whether “b” or “c” follows. The XML Schema specification also places determinism constraints (the Unique Particle Attribution rule) upon the use of <xsd:any>. For example, a wildcard that admits the target namespace cannot immediately follow an optional element from that namespace, which is why <xsd:any> is typically placed after a required element. Readers interested in learning more about non-determinism’s impact upon XML are encouraged to review Appendix E of the W3C XML 1.0 Third Edition Technical Recommendation.
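The ambiguity in “((a, b) | (a, c))” can be seen by checking whether two branches of a choice begin with the same element. The sketch below is a deliberately simplified illustration, not a full determinism test: it handles only a flat choice whose branches are sequences of required elements, and the function name is hypothetical.

```python
def shared_first_symbols(alternatives):
    """Return the symbols that begin more than one branch of a choice.

    Each branch is a non-empty tuple of required element names. A shared
    first symbol means a parser cannot pick a branch from one element alone.
    """
    seen, shared = set(), set()
    for branch in alternatives:
        head = branch[0]
        if head in seen:
            shared.add(head)
        seen.add(head)
    return sorted(shared)


ambiguous_choice = [("a", "b"), ("a", "c")]  # ((a, b) | (a, c))
rewritten_choice = [("b",), ("c",)]          # the choice inside (a, (b | c))

print(shared_first_symbols(ambiguous_choice))  # ['a']
print(shared_first_symbols(rewritten_choice))  # []
```

Rewriting the model as “(a, (b | c))” accepts the same content while letting the parser decide on each element as it arrives.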
Regardless of the extension mechanism chosen, developers must ensure that all schema extensions utilize unique namespaces since the extensions are not part of the base vocabulary.