Typed XML programmer -- Where do you want to go tomorrow?
This post starts a series (of blog posts) on what I would like to call “Typed XML programming”. The overall goal of the series is to engage in a discussion on requirements, scenarios and priorities around typed XML programming. This first post sets up some real basics, poses some questions, and hopefully whets your appetite for coming back to this thread.
What is typed XML programming anyway?
In elevator-speech terms, I mean by that “XML programming in mainstream OO languages like C#, Java and VB while leveraging XML schemas as part of the XML programming model”. I am leaving XSLT, XQuery and other DSLs out of scope for the present series, if you don’t mind. Otherwise, I would like to go for a broad definition of XML programming including scenarios such as (i) consuming XML as input for an application; (ii) producing XML as output of an application; (iii) operating on in-memory representations of XML; (iv) streaming over XML; (v) accessing XML in the database, and what have you.
Let’s start with ‘untyped’ XML programming. Here is an archetypal C# function that takes an (in-memory) XML tree with purchase orders and calculates the total over all items of the orders with a given zip code (i.e., it sums up price times quantity for those items):
// Use your favorite XML API (such as DOM or … XLinq in my case)
public static double GetTotalByZip(XElement os, int zip)
{
    double total = 0.0;
    foreach (XElement o in os.Elements("order"))
        if ((int)o.Attribute("zip") == zip)
            foreach (XElement i in o.Elements("item"))
                total += (double)i.Element("price")
                       * (int)i.Element("quantity");
    return total;
}
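To make the function concrete, here is a minimal, self-contained sketch that runs it against a small purchase-order document. The sample XML, the class name UntypedDemo and the misspelled names are made up for illustration; only the element and attribute names (orders, order, zip, item, price, quantity) come from the function above. The sketch uses System.Xml.Linq (the released name of XLinq).

```csharp
using System;
using System.Xml.Linq;

public static class UntypedDemo
{
    // Hypothetical sample data; the shape is inferred from the function, not from a real schema.
    public const string SampleXml = @"
<orders>
  <order zip='98052'>
    <item><price>10.5</price><quantity>2</quantity></item>
    <item><price>4</price><quantity>3</quantity></item>
  </order>
  <order zip='10001'>
    <item><price>99</price><quantity>1</quantity></item>
  </order>
</orders>";

    public static double GetTotalByZip(XElement os, int zip)
    {
        double total = 0.0;
        foreach (XElement o in os.Elements("order"))
            if ((int)o.Attribute("zip") == zip)
                foreach (XElement i in o.Elements("item"))
                    total += (double)i.Element("price")
                           * (int)i.Element("quantity");
        return total;
    }

    // Counts the child elements with the given name; a misspelled name simply matches nothing.
    public static int CountChildren(XElement parent, string name)
    {
        int n = 0;
        foreach (XElement e in parent.Elements(name)) n++;
        return n;
    }

    public static void Main()
    {
        XElement os = XElement.Parse(SampleXml);

        // 10.5 * 2 + 4 * 3 = 33
        Console.WriteLine(GetTotalByZip(os, 98052));

        // A typo in an element name compiles fine and silently matches no elements.
        Console.WriteLine(CountChildren(os, "ordr")); // 0

        // A typo in an attribute name is only detected when the code runs:
        // Attribute("zipp") returns null, and the cast of null to int throws.
        try
        {
            int zip = (int)os.Element("order").Attribute("zipp");
            Console.WriteLine(zip);
        }
        catch (ArgumentNullException)
        {
            Console.WriteLine("missing attribute detected at run time");
        }
    }
}
```

The compiler checks the use of XElement and the casts, but it knows nothing about the names in the string literals; those are checked, if at all, only when the code runs.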
It is somewhat discriminatory to label the above code as ‘untyped’ since the mere use of the XML API is still subject to static type checking; also, the look-up of elements and attributes is sort of dynamically checked. Likewise, I would like to avoid restricting ‘typed’ XML programming to a narrow notion of static typing. Instead, XML types (aka XML schemas) may contribute to the XML programming model in various ways such as validation protocols, precondition checking, exception handling, intellisense, tool tips and others. For now, let me just do the most obvious thing -- assume a C# object model for the kinds of elements in the purchase-order example. (The object model may have been derived from an XML schema by a code generator like xsd.exe.) Based on such an object model, the above ‘untyped’ XLinq code is transcribed to a ‘typed’ C# function as follows:
// We presume object types for order collections, orders and order items.
public static double GetTotalByZip(orders os, int zip)
{
    double total = 0.0;
    foreach (order o in os.order)
        if (o.zip == zip)
            foreach (item i in o.item)
                total += i.price * i.quantity;
    return total;
}
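The presumed object model is not shown in this post, so here is one way it might look. This is a hand-written sketch in the spirit of what a generator like xsd.exe produces, not actual generated output: the field layout, the TypedDemo class, the Load helper and the sample XML are assumptions for illustration. The mapping between XML and objects is done with the standard XmlSerializer.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hand-written approximation of a generated object model (an assumption, not real xsd.exe output).
[XmlRoot("orders")]
public class orders
{
    [XmlElement("order")]
    public order[] order;
}

public class order
{
    [XmlAttribute("zip")]
    public int zip;

    [XmlElement("item")]
    public item[] item;
}

public class item
{
    [XmlElement("price")]
    public double price;

    [XmlElement("quantity")]
    public int quantity;
}

public static class TypedDemo
{
    // Hypothetical sample data matching the object model above.
    public const string SampleXml = @"
<orders>
  <order zip='98052'>
    <item><price>10.5</price><quantity>2</quantity></item>
    <item><price>4</price><quantity>3</quantity></item>
  </order>
  <order zip='10001'>
    <item><price>99</price><quantity>1</quantity></item>
  </order>
</orders>";

    // The typed function from the post, unchanged.
    // Note: it assumes every order has at least one item, so that o.item is not null.
    public static double GetTotalByZip(orders os, int zip)
    {
        double total = 0.0;
        foreach (order o in os.order)
            if (o.zip == zip)
                foreach (item i in o.item)
                    total += i.price * i.quantity;
        return total;
    }

    // Hypothetical helper: deserializes an XML string into the object model.
    public static orders Load(string xml)
    {
        var serializer = new XmlSerializer(typeof(orders));
        using (var reader = new StringReader(xml))
            return (orders)serializer.Deserialize(reader);
    }

    public static void Main()
    {
        orders os = Load(SampleXml);
        // 10.5 * 2 + 4 * 3 = 33
        Console.WriteLine(GetTotalByZip(os, 98052));
    }
}
```

With this model in place, a misspelling such as o.zipp or i.pric is rejected by the compiler rather than surfacing at run time.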
For clarity, let’s show the diff on the untyped vs. typed versions.
I strike through ‘untyped slack’ (struck-through text is shown as [-...-]):
public static double GetTotalByZip([-XElement-] orders os, int zip)
{
    double total = 0.0;
    foreach ([-XElement-] order o in os.[-Elements("-]order[-")-])
        if ([-(int)-]o.[-Attribute("-]zip[-")-] == zip)
            foreach ([-XElement-] item i in o.[-Elements("-]item[-")-])
                total += [-(double)-]i.[-Element("-]price[-")-]
                       * [-(int)-]i.[-Element("-]quantity[-")-];
    return total;
}
So in this instance of typed XML programming, we managed to get rid of all casts and all string-encoded element and attribute names, and we might have enjoyed intellisense and tool tips as we typed in the code. Furthermore, type checking protected us from several kinds of typos, and we had to type considerably less code as well. Finally, we also enjoy the object types at run-time, which help us in debugging and dispatching efforts. It sounds like typed XML programming is a good idea, but I am of course aware of contrary opinions (and I promise to get back to them later in the series). Let me say that typed XML programming gets a lot of attention. For instance, check out the sheer number of technologies for XML data binding and research efforts on programming languages for typed XML programming (cf. Comega, XJ, Xtatic, etc.).
Requirements? Scenarios? Priorities?
I haven’t provided much context yet for a deep discussion, but let’s assume that readers of this blog have a certain understanding of “Typed XML programming -- today”. So what I would like to do now is pose some questions, which can be summarized as follows: “Typed XML programmer -- Where do you want to go tomorrow?”
- Do we expect OO developers to understand XML types?
- Is XML Schema the right basis for typed XML programming?
- What are the MoSCoW requirements for typed XML programming?
- What are the key weaknesses of current XML data-binding technologies?
- What are the expectations or reservations regarding XML/OO ‘language cocktails’?
- How much do we care about X/O mapping when compared to O/R mapping?
- How do we (programmatically or otherwise) mediate between given XML and OO types?
- What other questions should have been posed here?
I will get back to you in a few days.
My plan is to mumble a bit about “Typed XML programming -- today”.
Comments
- Anonymous
July 20, 2006
Looking forward to your future posts on the subject. While I'm mostly an o/r guy I'm also very interested in what the typed programming story will be for XML in the future.
- Anonymous
July 20, 2006
You've seen Ralf Lämmel's post starting a series about our research and prototyping efforts...
- Anonymous
July 21, 2006
Ralf,
I am also interested in this topic. I've been building adaptable applications based solely on XML using a native XML db. I often take XML structures and wrap them in an object to get the "typed" effect. Using a native XML db for storage has saved me lots of "XML/DB mapping" time and might be another topic of discussion.
Dan
- Anonymous
August 05, 2006
This post continues the series on “Typed XML programmer -- Where do you want to go tomorrow?”. This time,...
- Anonymous
November 12, 2006
As I am working on the 3rd post, I thought I should offer an index as follows: Typed XML programmer