How to Query Trees Using LINQ
http://c.statcounter.com/8988981/0/1070c577/1/A (directed) tree is one of the most common data types. But how do you use trees in LINQ? This article introduces a generic LINQ-Extension which solves the problem, gives an example and discusses alternatives.
Motivation
The starting point of this article was a simple question in the LINQ forum which has been unanswered for four years...
How can I select tree nodes independently from their nesting level?
So it might be worth to have a closer look at the task. Let's start with an example. A simple TreeNode class may look like this:
public class TreeNode
{
public string Id { get; set; }
public int Data { get; set; }
public List<TreeNode> Children { get; private set; }
public TreeNode(string id, int data)
: this(id, data, new TreeNode[0])
{ }
public TreeNode(string node, int data, params TreeNode[] children)
{
Id = node;
Data = data;
Children = new List<TreeNode>(children);
}
}
The test data are:
TreeNode rootNode =
new TreeNode("node-0", 35,
new TreeNode("node-1", 17,
new TreeNode("node-2", 20)),
new TreeNode("node-3", 19),
new TreeNode("node-4", 5,
new TreeNode("node-5", 25),
new TreeNode("node-6", 40)));
Solution
Implement a generic extension class which can traverse over arbitrary trees. The only precondition is that the node's children are stored as an IEnumerable. The selector lambda expression avoids hard-coding of the access path and makes the extension reusable.
public static class LinqTreeExtension
{
public static IEnumerable<T> SelectNestedChildren<T>
(this IEnumerable<T> source, Func<T, IEnumerable<T>> selector)
{
foreach (T item in source)
{
yield return item;
foreach (T subItem in SelectNestedChildren(selector(item), selector))
{
yield return subItem;
}
}
}
}
The extension method SelectNestedChildren traverses the tree depth-first, left to right.
Usage Example
**Example 1: **Flattening the tree structure (without the root node)
var flattenedTreeWithoutRoot
= rootNode.Children
.SelectNestedChildren(t => t.Children).ToList();
The resulting list enumerates the nodes with IDs "note-1" to "note-6". Please note: The root node is not contained in the resulting list. We'll fix this in the next example.
**Example 2: **Flattening the tree structure (including the root node)
var flattenedTreeIncludingRoot
= new TreeNode[] {rootNode }
.SelectNestedChildren(t => t.Children).ToList();
The query is executed on an array, which contains only the root node. The resulting list enumerates the nodes with IDs "note-0" to "note-6" as expected.
Example 3: Performing a LINQ query on all nested children
var result
= new TreeNode[] { rootNode }
.SelectNestedChildren(t => t.Children)
.Where(t => t.Data > 30).ToList();
The resulting list contains two nodes with the IDs "node-0" and "node-6"
Contradictions and alternatives
The proposed LINQ extension has significant pros:
- The extension is applicable to all tree-structures which use an IEnumerable to hold the children.
- It fits seamlessly into other LINQ operations.
However, in some cases you may want to consider an alternative:
Cyclic graph instead of a tree
**Problem: **The extension will loop forever if it runs into a cycle of the graph.
Solution: Modify the extension:
- Add a HashSet<T> to the extension which keeps track of the already visited notes.
- Your class T should override GetHashCode() and Equals()
- The extensions have to skip already visited nodes.
Persisted tree in a database
Problem: If the tree is stored in a database (and not in memory), the extension produces unnecessary overhead.
Solution: All tree nodes reside in a database table which can be queried directly. The effect that they represent a logical tree or even a cyclic graph can be ignored. Using the Entity Framework the query looks like this:
var databaseResult = context.TreeNodes
.Where(t => t.Data > 30).ToList();
However, at least two small changes should be applied to the sample class TreeNode:
- The property Children should be changed to a virtual property to enable lazy loading.
- The Entity Framework requires an additional parameterless constructor.
XML data structures
An XML data structure is a special case of a directed tree.
Problem: LINQ to XML has specialized on XML Documents.
Solution: Use LINQ to XML. It reduces your coding effort. You stick to a standard. You may also use other benefits from LINQ to XML. Example:
var xmlElements = XDocument.Load("myfile.xml").Descendants();
var result = xmlElements.Where(x => ...);
XDocument.Descendents() returns an IEnumerable<XElement> which can be used in a query directly.
Conclusion:
Using trees with LINQ is simple. However, there is no "one size fits all" solution.
Links
101 LINQ Samples
Covers all essential aspects of LINQ with examples.LINQ (Language Integrated Query)
Documentation of LINQ in the MSDN Library.LINQ to XML
Documentation of LINQ to XML in the MSDN Library.
See Also
User Page: Carsten Siemens
Information about other TechNet Wiki articles written by Carsten Siemens.
Another important place to find a huge amount of Visual C# related articles is the TechNet Wiki itself. The best entry point is Visual C# Resources on the TechNet Wiki
Other languages
This article is also available in other languages: