What LINQ to XML will NOT do

One of the worst pitfalls a design team can fall into is trying to do too much. The principle is captured by the well-known quote:

Perfection is achieved, not when there is nothing more to add, but when there is nothing left to take away. - Antoine de Saint-Exupéry

So, what has been taken away from LINQ to XML (aka XLinq) in the pursuit of simplicity (if not perfection)? I'm in the process of documenting the "non-goals" for XLinq, and thought it would be good to share them and get some feedback.

In a previous post I discussed two non-goals: replacing XSLT as a tool for processing unstructured documents, and replacing XQuery as a database query language. Some other non-goals include:

  • Guaranteeing that an XLinq tree in memory meets the well-formedness constraints is a non-goal; this job is delegated to the XmlReader and XmlWriter. We plan to add more in-memory well-formedness checking than is present in the May CTP release, but will not go so far as XOM does to "make no compromises on correctness". Basically, we can't meet the goal of making XLinq as fast as or faster than DOM for most cases if we perform the extensive character-by-character checking needed to guarantee well-formedness after every successful XLinq operation. If you need that guarantee, you can serialize an XLinq subtree to force the well-formedness check (and choose to bear the performance cost).
  • Support for XML 1.1 is a non-goal, but so is forbidding the use of XLinq with an XML 1.1 reader/writer. Microsoft doesn't support XML 1.1 for reasons Michael Rys noted in mourning the day it became a recommendation, but the XLinq team doesn't necessarily think it is an abomination. If you like XLinq and need XML 1.1 support, you can write (or support a vendor or open source project that writes) a .NET XmlReader and XmlWriter implementation; the flip side of XLinq not enforcing the XML 1.0 well-formedness rules is that it won't reject XML 1.1 content if some non-default reader/writer does not.
  • Anything beyond the barest minimum support for DTDs is a non-goal. XLinq (actually the XmlReader underneath) will read the DTD internal subset in an XML instance, expand any references to entities declared in the DTD, and round-trip the DTD internal subset. There is no object model for the DTD information; it is just saved as a string property. You can change that string, but you are on your own as far as ensuring that the XML well-formedness constraints are preserved. Again, you can explicitly parse the value to check for well-formedness.
  • Thus, it is a non-goal to preserve the syntactic fidelity of XML documents loaded / saved by XLinq. For example, a character entity reference defined in the DTD internal subset will NOT be re-entitized on save because XLinq (following the Infoset) has no “memory” of the XML entity that defined a particular Unicode character.
  • XSLT allows non-well-formed results to be generated (e.g. HTML or text); XLinq does not offer this capability.
  • There is no guarantee that XLinq classes can be subclassed effectively, although there are currently no plans to seal them. The recommended way for applications to add functionality to XLinq is by using the annotation feature to add application-defined objects to XLinq tree objects. In other words, internal experience with building on top of XLinq has shown that the aggregation design pattern works better than inheritance to extend its functionality. This is not firm guidance, just advice: we have a real goal of supporting extensibility via annotations and a non-goal of supporting extensibility via inheritance. This is, however, an area that is very much in flux and we would be particularly interested in hearing your use cases and experiences, e.g. in writing XLinq extensions that support one or more of these non-goals.
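
Three of the points above (forcing a well-formedness check by serializing, the DTD internal subset surviving only as a string while entities are expanded, and extension through annotations rather than subclassing) can be sketched in C#. This is only a rough illustration, and it assumes the type and member names of System.Xml.Linq as it eventually shipped (XElement, XDocument, XDocumentType.InternalSubset, Annotation<T>); the CTP discussed here may differ. The class NonGoalsSketch, its method names, and the ReviewInfo annotation type are hypothetical names invented for the example.

```csharp
using System;
using System.Xml;
using System.Xml.Linq;

// Hypothetical application-defined object to hang on a tree node.
public sealed class ReviewInfo
{
    public string Reviewer { get; set; }
}

// Hypothetical helper class; none of these methods are part of XLinq itself.
public static class NonGoalsSketch
{
    // The in-memory tree accepts text that XML 1.0 forbids. Serializing the
    // subtree hands it to an XmlWriter, which performs the character check.
    public static bool SerializesCleanly(XElement subtree)
    {
        try
        {
            subtree.ToString();   // serializes through an XmlWriter
            return true;
        }
        catch (Exception ex) when (ex is ArgumentException || ex is XmlException)
        {
            return false;         // the writer rejected the content
        }
    }

    // Entities declared in the internal subset are expanded on load and are
    // not re-entitized on save; the subset itself is kept only as a string.
    public static (string Subset, string Saved) RoundTrip(string xml)
    {
        XDocument doc = XDocument.Parse(xml);
        string subset = doc.DocumentType?.InternalSubset;
        return (subset, doc.ToString());
    }

    // Extension by aggregation: read back an application object that was
    // attached with AddAnnotation, instead of subclassing XElement.
    public static string ReviewerOf(XElement element)
    {
        return element.Annotation<ReviewInfo>()?.Reviewer;
    }
}
```

As a usage sketch: an element built as `new XElement("note", "bell\u0007")` is accepted in memory, yet SerializesCleanly should report false for it because the writer refuses the control character. Attaching data is a matter of calling `element.AddAnnotation(new ReviewInfo { Reviewer = "..." })` before ReviewerOf is asked for it.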

If these non-goals of XLinq don't meet your requirements, Microsoft offers alternatives that can be used separately or in conjunction with XLinq. For example, DOM has an object model that supports entities and entity references. The XmlReader and XmlWriter APIs are available to work with XML text in its raw form. In the Orcas release of LINQ to XML, there are plans to add "bridge" classes to allow XLinq to work in better harmony with the rest of System.Xml, e.g. to invoke XSLT over an XLinq tree to work around the non-goal of creating non-well-formed output.
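
To make the XSLT workaround concrete, here is a minimal sketch of running a stylesheet over an XLinq tree. It assumes the XmlReader bridge that later shipped as XNode.CreateReader, together with the existing XslCompiledTransform class; whether that matches the "bridge" classes planned at the time of writing is an assumption. The class XsltBridgeSketch and its Transform method are hypothetical names.

```csharp
using System.IO;
using System.Xml.Linq;
using System.Xml.Xsl;

// Hypothetical helper; not part of XLinq or System.Xml.
public static class XsltBridgeSketch
{
    // Runs a stylesheet held in one XLinq tree over another XLinq tree and
    // returns whatever the stylesheet emits, which need not be XML.
    public static string Transform(XDocument stylesheet, XDocument input)
    {
        var xslt = new XslCompiledTransform();

        // Expose the stylesheet tree as an XmlReader so XSLT can load it.
        using (var styleReader = stylesheet.CreateReader())
        {
            xslt.Load(styleReader);
        }

        using (var output = new StringWriter())
        using (var inputReader = input.CreateReader())
        {
            // No stylesheet parameters are passed, hence the null argument.
            xslt.Transform(inputReader, null, output);
            return output.ToString();
        }
    }
}
```

With a stylesheet that declares `<xsl:output method="text"/>`, the returned string is plain text rather than well-formed XML, which is exactly the kind of output XLinq alone will not produce.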

I'll update this post as other non-goals are identified. We would, as usual, very much like to hear from prospective XLinq users as to whether these non-goals will clash with your goals or not. It's possible we have taken away things that shouldn't have been taken away ... or maybe there are still things we can remove in the pursuit of API perfection.

[Updated 7/1 with the XML 1.1 point and an elaboration on the extensibility point]

Comments

  • Anonymous
    July 01, 2006
    What about typed XLinq - is it still a goal?
  • Anonymous
    July 07, 2006
    I think laser focus beats out breadth of features every time.
  • Anonymous
    July 18, 2006
    The comment has been removed
  • Anonymous
    July 20, 2006
    On typed XLinq, it is definitely a goal but no commitments have been made.  Keep an eye on blogs.msdn.com/ralflammel ... he's the PM for the Typed XLinq prototyping work and will be discussing the rationale, requirements, etc.

  • Anonymous
    July 22, 2006
    Are you still using XLinq when you refer to the new Xml programming API that shipped with the Linq CTP because you see it as separate from Linq to Xml or is it just because you haven't gotten used to the new name?

    Personally it seems like something different since the new Xml API can be used independently of Linq.

    By the way I like the Non-goals. :)

    Cheers,
    Steve
  • Anonymous
    August 17, 2006
    In a previous post I wrote:  There is no guarantee that XLinq classes can be subclassed effectively,...
  • Anonymous
    May 15, 2007
    A team within Microsoft ran an "app week" recently to build applications that implement customer scenarios