H is for… HierarchyID
hierarchyid
is a new system data type in SQL Server 2008 used to represent a position within a hierarchy. The value of an hierarchyid
column is an encoded representation of the path from the root to the given node (row) – kind of like a directory/file path, only using numbers instead of names.
To get a better idea of how all this works, I’m going to use one of the more common scenarios where a hierarchy is found, that of an organizational chart (Another manifestation is a bill-of-materials, where one component recursively comprises a number of other components below it). The AdventureWorks2008 database contains a HumanResources.Employee
table with the following (partial) definition, including the OrganizationNode
column of type hierarchyid
. Note that there’s an OrganizationLevel
as well, a computed value based on the OrganizationNode
. We’ll look at that in a moment, but for now, let’s focus on OrganizationNode
.
CREATE TABLE [HumanResources].[Employee](
[BusinessEntityID] [int] NOT NULL,
[NationalIDNumber] [nvarchar](15) NOT NULL,
[LoginID] [nvarchar](256) NOT NULL,
[OrganizationNode] [hierarchyid] NULL,
[OrganizationLevel] AS ([OrganizationNode].[GetLevel]()),
[JobTitle] [nvarchar](50) NOT NULL,
[BirthDate] [date] NOT NULL,
...
Here are some results from the following query on this table:
SELECT LoginID, JobTitle, OrganizationNode, OrganizationNode.ToString()FROM HumanResources.Employee;
You can see the native encoded value in the third column and a text representation in the last column (generated by using the ToString()
method of the underlying CLR type that implements this new data type).
The root node has a representation of /, and its children are numbered sequentially, and so on. So at the fourth level, we have the employee with login rob0 reporting to roberto0 along with his colleagues gail0, jossef0, and dylan0. Since a picture is worth a thousand rows, here’s a schematic of how the relationships would be laid out. [N.B., this is not the complete picture of the data in the Employee table, only that shown in the first nine lines of the query results above].
It turns out that there is actually an ordering within a level of the hierarchy as well, although in this scenario it may not be relevant. For example, the OrganizationNode
value of rob0 is “less” than that of gail0, which could imply seniority in terms of length of service or perhaps an ordering by department number or last name.
All this seems logical enough, but how do we actually get data into the table and how do we write queries to return the subsets of the data that we’re interested in? The answer lies in the fact that hierarchyid
is a CLR type, complete with methods to provide additional functionality (like the ToString()
above). The table below provides an overview of the methods available. With the exception of Parse
and GetRoot
, these are each instance methods on a given hierarchyid
value. [Astute readers may note there are Read
and Write
methods as well, but they are used internally by SQL Server, and it’s unlikely you’d interact with them directly.]
Method | Purpose |
GetAncestor(n) |
returns a value representing the nth ancestor |
GetDescendant(m, n) |
returns a value representing a new child node between other children nodes, m and n |
GetLevel(n) |
returns level at which node n is located |
GetReparentedValue(m, n) |
returns node whose ancestor is modified from m to n |
GetRoot() |
returns the root of the hierarchy |
IsDescendantOf(n) |
returns true if the node is a descendant of n |
Parse(s) |
converts canonical string format, s, to a hierarchyid value (opposite of ToString ) |
ToString() |
returns canonical string format |
Let’s take a look at each of these methods in turn with some simple samples focused on the subset of the organization chart data presented above.
GetAncestor
Return diane1’s skip-level manager (roberto0):
select loginid from HumanResources.Employee where OrganizationNode = (select e.OrganizationNode.GetAncestor(2) from HumanResources.Employee e where LoginID = 'adventure-works\diane1');
GetDescendant
You’d use
GetDescendant
to generate the right value for a new record you want to insert into the existing hierarchy. Here, for instance, we’ll return a new employee (node) between gail0 and jossef0.declare @emp1 hierarchyid declare @emp2 hierarchyid select @emp1 = OrganizationNode from HumanResources.Employee where LoginID = 'adventure-works\gail0' select @emp2 = OrganizationNode from HumanResources.Employee where LoginID = 'adventure-works\jossef0' select @emp1.GetAncestor(1).GetDescendant(@emp1, @emp2), @emp1.GetAncestor(1).GetDescendant(@emp1, @emp2).ToString();
This query returns the following:
Note that the notation for the new node uses the period to open up a new spot between the existing values. The encoding of the
hierarchyid
type is limited to 892 bytes; however, the storage scheme is efficient. For an organization of 100,000 with an average fanout of six (meaning each manager has on average six direct reports), only 5 bytes of storage are needed for each value.
GetLevel
This method returns the level of the given node in the hierarchy. It’s the first method we ran into, since it was part of the table definition itself. It’s included as part of the table definition so that the values can be indexed for more efficient breadth-first queries (along the lines of “who are all of the direct reports of roberto0”). This complements an index on the
hierarchyid
column itself, which organizes the nodes in a depth-first manner.
GetReparentedValue
GetReparentedValue
is useful for moving nodes around in the hierarchy. For instance, assume there was a corporate reorganization, and dylan0 now reports into terri0 (dylan0’s previous skip-level manager). You could set up this modification via the following syntax.select @oldboss = OrganizationNode from HumanResources.Employee where LoginID = 'adventure-works\roberto0' select @newboss = OrganizationNode from HumanResources.Employee where LoginID = 'adventure-works\terri0' update HumanResources.Employee set OrganizationNode = OrganizationNode.GetReparentedValue(@oldboss, @newboss) where LoginId = 'adventure-works\dylan0'
Here’s the result of this update, which you can compare with the original query results at the top of this post.
Look carefully though; only Dylan has moved, and we’ve essentially orphaned his previous direct reports, diane1 and gigi0. This brings up the point that the hierarchy relationship is not enforced like, say, a primary/foreign key pair. It’s up to you to makes sure that the relationships are kept in a consistent and meaningful state in terms of how your applications deal with the hierarchies.
Now, moving the entire subtree around takes a little more effort, but there are several posts providing insight on how to do so.
N.B., you will find a lot of references to a
Reparent
method in the documentation and other postings.Reparent
was changed toGetReparentedValue
in the RTM version of SQ LServer 2008. Likewise,IsDescendant
was renamedIsDescendantOf
.
GetRoot
GetRoot
provides a quick way to get to the top level of the hierarchy. For instance, I can access the CEO directly from gigi0 with a query like:
select loginid from HumanResources.Employee where OrganizationNode = hierarchyid::GetRoot();
Note that
GetRoot
is a static method.
IsDescendantOf
This method returns a boolean indicating whether or not the
hierarchyid
argument of the method is an ancestor of the instance on which the method is called. Here’s an example returning all those employees in the management chain of dylan0.declare @emp1 hierarchyid select @emp1 = OrganizationNode from HumanResources.Employee where LoginID = 'adventure-works\dylan0' select e.LoginId from HumanResources.Employee e where e.OrganizationNode.IsDescendantOf(@emp1) = 1;
Note that
IsDescendantOf
traverses the entire hierarchy (and it includes the given node as a descendant of itself), so if I wanted just the direct reports of dylan0, I should include an additional criterion:select e.LoginId from HumanResources.Employee e where e.OrganizationNode.IsDescendantOf(@emp1) = 1 and e.OrganizationNode.GetLevel() = @emp1.GetLevel() + 1;
Parse and ToString
Parse
, likeGetRoot
, is a static method, and it is essentially the inverse ofToString
. So, for instance, the following query should (and does!) return the number of rows in the Employee table.
select count(*) from HumanResources.Employee where OrganizationNode = hierarchyid::Parse(OrganizationNode.ToString());
All of the samples I’ve included so far just show T-SQL from SQL Management Studio, so how do you incorporate this new datatype in your applications? It’s not much different, actually. The hierarchyid
type, being a CLR type itself, is exposed in the Microsoft.SQLServer.Types
assembly as SqlHierarchyId
.
Following is a excerpt from a console program that outputs the skip-level manager of diane1 (similar to the T-SQL sample above for GetAncestor
).
String empLogin = @"adventure-works\diane1";
SqlHierarchyId bossId;
String bossLogin;
SqlConnection conn = new SqlConnection(
"Data Source=.;Initial Catalog=AdventureWorks2008;Integrated Security=True");
conn.Open();
SqlCommand cmd = new SqlCommand("select OrganizationNode.GetAncestor(2) " +
"from HumanResources.Employee where LoginID=@login", conn);
cmd.Parameters.AddWithValue("login", empLogin);
bossId = (SqlHierarchyId) cmd.ExecuteScalar();
cmd = new SqlCommand("select LoginId from HumanResources.Employee " +
"where OrganizationNode=hierarchyid::Parse(@node)", conn);
cmd.Parameters.AddWithValue("node", bossId.ToString());
bossLogin = (String)cmd.ExecuteScalar();
conn.Close();
Console.WriteLine("{0} is the skip-level manager of {1}", bossLogin, empLogin);
Console.ReadLine();
Unfortunately though, this new data type and others introduced with SQL Server 2008 aren’t yet supported in the newer data access technologies like LINQ and the Entity Framework, so if you are mainpulating hierarchyid
values in your code, you’ll likely be using the tried-and-true, classic ADO.NET mechanisms – just like I did above.
I'll leave you with some additional resources on hierarchyid
and other new features in SQL Server 2008, so you can read more about it:
SQL Server 2008: Enabling Developers to go Beyond Relational
SQL Server 2008 for Developers
HierarchyID: Use a new data type to maintain and query hierarchical data
Comments
Anonymous
April 29, 2009
Fantastic Article. Just the right level of information. Excellent examples from the new adventureworks2008 DB that I can make use of in my training.Anonymous
April 29, 2009
Thanks for the feedback, Jane. I really appreciate it and glad it was helpful. Keep your eye out as well for a SQL Server 2008 Training Kit that should be out pretty soon, it looks to have some great content as well.Anonymous
July 17, 2009
The comment has been removedAnonymous
July 17, 2009
TBone, it's case-sensitive. Try select hid.GetLevel() from x.Anonymous
October 06, 2009
The comment has been removedAnonymous
October 26, 2009
'GetAncestor' ... See the method reference... it's all case sensitive as the hierarchyid is effectively a CLR: http://technet.microsoft.com/en-us/library/bb677193.aspxAnonymous
March 27, 2010
This is a great article. I am very excited with this new feature of SQL as it has tremendous potential, especially in our environment. Cheers!Anonymous
July 08, 2010
How can I tell if a hierarchyid has any descendants (Am I a Leaf or a Node)Anonymous
July 09, 2010
Sam, basically select using IsDescendant and see if the count is > 1 (a node is considered to be a descendant of itself). For example, ken0 is the CEO and ben0 is a buyer (leaf node): // this returns 290 (not a leaf) SELECT COUNT() FROM HumanResources.Employee WHERE OrganizationNode.IsDescendantOf((SELECT OrganizationNode FROM HumanResources.Employee WHERE LoginID = 'adventure-worksken0')) = 1 // this returns 1 (leaf node) SELECT COUNT() FROM HumanResources.Employee WHERE OrganizationNode.IsDescendantOf((SELECT OrganizationNode FROM HumanResources.Employee WHERE LoginID = 'adventure-worksben0')) = 1 HTH, JimAnonymous
November 03, 2010
Great article! Is there a way to generate the nice pictures from your data?Anonymous
November 04, 2010
For the post, I 'cheated'. I used PowerPoint SmartArt (one of the hierarchy formats) to generate the pictures from the values I entered manually. That said, PowerPoint does have object model you can programmatically access via VSTO (Visual Studio Tools for Office), so there may be a way to generate them.Anonymous
July 23, 2012
Why would this ever be needed? Entity Framework does not support it, LINQ2SQL does not support it...I fail to understand why all of these unsupported bells and whistles keep appearing when they are pretty much useless to anyone developing a business application.