SQL Server XML: Sorting Data in XML Fragments
Working with data sets made us all aware of the fact that a set has no order. XML documents are not data sets, thus they always have a natural order.
So what should we do when we have XML fragments in the wrong order?
Problem Definition
The original problem was to sort a XML document with two levels: Get double-sorted xml document from xml-document
Approaches
A - Using T-SQL
Using T-SQL means, that we need to deconstruct our data by parsing the necessary values. We use the nodes() method to extract the elements on the level we want to sort it. And we extract the order criteria with the value() method to sort it with the ORDER BY clause. Finally we can reconstruct the XML fragment by using FOR XML with the PATH mode.
Here is the trivial case. We are completely deconstructing a flat hierarchy and use only the data in T-SQL to reconstruct the XML:
DECLARE @Data XML = N'
<element name="c" />
<element name="b" />
<element name="a" />
';
WITH Deconstructed AS (
SELECT Element.value('@name', 'NVARCHAR(255)') AS ElementName
FROM @Data.nodes('/element') [Elements] ( Element )
)
SELECT ElementName AS [@name]
FROM Deconstructed
ORDER BY ElementName
FOR XML PATH('element');
And this is the result:
<element name="a" />
<element name="b" />
<element name="c" />
A more complex case is the following. We are still working with a flat hierarchy, but now our elements are no longer trivial. The deconstruction process must now provide us with the sort criteria and the rest of the XML fragments per element:
DECLARE @Data XML = N'
<element name="c" >
<subelement name="t" />
</element>
<element name="b">
<subelement name="s" />
</element>
<element name="a" />
';
WITH Deconstructed AS (
SELECT Element.value('@name', 'NVARCHAR(255)') AS ElementName,
Element.query('.') AS ElementContent
FROM @Data.nodes('/element') [Elements] ( Element )
)
SELECT ElementContent AS '*'
FROM Deconstructed
ORDER BY ElementName
FOR XML PATH('');
Here is the result:
<element name="a" />
<element name="b">
<subelement name="s" />
</element>
<element name="c">
<subelement name="t" />
</element>
B - Using XQuery
Using XQuery means that we use the order clause of a FLWOR statement.
Here is the trivial case again:
DECLARE @Data XML = N'
<element name="c" />
<element name="b" />
<element name="a" />
';
SELECT Fragment.query('
for $element in /element
order by $element/@name ascending
return $element
')
FROM @Data.nodes('.') Fragment ( Fragment );
And the expected result:
<element name="a" />
<element name="b" />
<element name="c" />
As the XQuery FLWOR statement already works on nodes, we already have a solution for the more complex case:
DECLARE @Data XML = N'
<element name="c" >
<subelement name="t" />
</element>
<element name="b">
<subelement name="s" />
</element>
<element name="a" />
';
SELECT Fragment.query('
for $element in /element
order by $element/@name ascending
return $element
')
FROM @Data.nodes('.') Fragment ( Fragment );
And here is the result:
<element name="a" />
<element name="b">
<subelement name="s" />
</element>
<element name="c">
<subelement name="t" />
</element>
Problem Solution
Sorting the first level of the list:
DECLARE @Data XML = N'
<level1 name="3">
<level2 name="f" />
<level2 name="e" />
<level2 name="d" />
</level1>
<level1 name="2">
<level2 name="c" />
<level2 name="b" />
</level1>
<level1 name="1">
<level2 name="a" />
</level1>
';
SELECT Levels.query('
for $level1 in /level1
order by $level1/@name ascending
return $level1
')
FROM @Data.nodes('.') Levels ( Levels );
Here is the result, the list is only sorted on the top level:
<level1 name="1">
<level2 name="a" />
</level1>
<level1 name="2">
<level2 name="c" />
<level2 name="b" />
</level1>
<level1 name="3">
<level2 name="f" />
<level2 name="e" />
<level2 name="d" />
</level1>
Here we already see that we need a kind of nested sort, because we have only sorted the outer levels. In a FLWOR statement we can use complex return expressions, especially we can use further FLWOR statements:
DECLARE @Data XML = N'
<level1 name="3">
<level2 name="f" />
<level2 name="e" />
<level2 name="d" />
</level1>
<level1 name="2">
<level2 name="c" />
<level2 name="b" />
</level1>
<level1 name="1">
<level2 name="a" />
</level1>
';
SELECT Levels.query('
for $level1 in /level1
order by $level1/@name ascending
return
<level1 name="{$level1/@name}">{
for $level2 in $level1/level2
order by $level2/@name ascending
return $level2
}</level1>
')
FROM @Data.nodes('.') Levels ( Levels );
Now we have our double-sorted list:
<level1 name="1">
<level2 name="a" />
</level1>
<level1 name="2">
<level2 name="b" />
<level2 name="c" />
</level1>
<level1 name="3">
<level2 name="d" />
<level2 name="e" />
<level2 name="f" />
</level1>
Conclusion
Using the T-SQL approach means that we need to handle the conversion from and to XML to overcome the barrier between XML and T-SQL. While this is only a small step, it simply means more code. And more code is more complex per se.
The XQuery FLWOR expression on the other hand allows us to use a more compact notation. And this kind of XQuery processing was exactly built for these kinds of manipulation. It is the better choice in our case.
Terminology
Fragment: Part of an XML document
A fragment is not a document, thus it is not well-formed
FLWOR: FOR, LET, WHERE, ORDER BY, RETURN (XQuery)
See Also
- [[SQL Server 2012]]
- Transact-SQL Portal