Returning child rows formatted as JSON in SQL Server queries
In this post I will talk about one annoying thing – how to return a result set containing one to many relationships between tables (i.e. parent'child relationships such as company-products, person-address, sales order-items, etc.)?
Problem
I have relational structure with several one to many relationships (e.g. Person may have several phones, or several email addresses). In order to read Person and related information; In need to run several queries (e.g. one to return person info, another to return person phones, third to return person email addresses), or join tables and process one flat result set on the client side.
It would be nice if I could retrieve person information and all related information with a single SQL query.
Solution
Related information can be formatted as JSON arrays and returned as arrays in parent Person row. this way we need a single query to return all Person related information.
Scenario - Exporting complex relational data structures as JSON
In practice you will have one to many relationships in many cases, e.g. person can have many email addresses, many phones, etc. If you want to store these information in relational database, you would need to use something like a structure on the following figure:
Even for the simple structures such as phone numbers or emails that will be represented as arrays in C# or Java, you would need to create separate tables with foreign key relationships. Now if you try to join these tables you would need to use something like a following query:
SELECT Person.Person.BusinessEntityID, Person.Person.FirstName,
Person.Person.LastName, Person.EmailAddress.EmailAddress,
Person.PersonPhone.PhoneNumber, Person.PhoneNumberType.Name AS PhoneNumberType
FROM Person.Person INNER JOIN
Person.EmailAddress ON Person.Person.BusinessEntityID = Person.EmailAddress.BusinessEntityID INNER JOIN
Person.PersonPhone ON Person.Person.BusinessEntityID = Person.PersonPhone.BusinessEntityID INNER JOIN
Person.PhoneNumberType ON Person.PersonPhone.PhoneNumberTypeID = Person.PhoneNumberType.PhoneNumberTypeID
WHERE Person.Person.BusinessEntityID = 274
If you run this query in AdventureWorks database, you will get information about person with email address and phone number:
BusinessEntityID |
FirstName |
LastName |
EmailAddress |
PhoneNumber |
PhoneNumberType |
274 |
Stephen |
Jiang |
stephen0@adventure-works.com |
112-555-6207 |
Work |
One row is returned because in AW database there is only one telephone number and email per person. Similar results will be returned if you query Sales.vSalesPerson view.
Now, since database schema allows us to add more than one address what would happen if you add another email address to Stephen?
INSERT INTO Person.EmailAddress(BusinessEntityID,EmailAddress)
VALUES (274,'stephen.jiang@outlook.com')
If you execute this query again, you will get the following result:
BusinessEntityID |
FirstName |
LastName |
EmailAddress |
PhoneNumber |
PhoneNumberType |
274 |
Stephen |
Jiang |
stephen0@adventure-works.com |
112-555-6207 |
Work |
274 |
Stephen |
Jiang |
stephen.jiang@outlook.com |
112-555-6207 |
Work |
Now let’s assume that someone adds Stephen’s home and cell phone numbers in the system. Query will now return the following results:
BusinessEntityID |
FirstName |
LastName |
EmailAddress |
PhoneNumber |
PhoneNumberType |
274 |
Stephen |
Jiang |
stephen0@adventure-works.com |
112-555-6207 |
Work |
274 |
Stephen |
Jiang |
stephen.jiang@outlook.com |
112-555-6207 |
Work |
274 |
Stephen |
Jiang |
stephen0@adventure-works.com |
238-555-0197 |
Cell |
274 |
Stephen |
Jiang |
stephen.jiang@outlook.com |
238-555-0197 |
Cell |
274 |
Stephen |
Jiang |
stephen0@adventure-works.com |
817-555-1797 |
Home |
274 |
Stephen |
Jiang |
stephen.jiang@outlook.com |
817-555-1797 |
Home |
This is a messy because now instead of a single row you are getting 6 rows because two email addressed are cross combined with three phone numbers.
Similar results will be returned if you query Sales.vSalesPerson view:
SELECT BusinessEntityID AS ID, FirstName, LastName, EmailAddress, PhoneNumber, PhoneNumberType
FROM Sales.vSalesPerson
WHERE BusinessEntityID = 274
Now you have two choices – handle these duplicates in client-side or try to return related data in single row. In this post I will show you how to use the second option with FOR JSON clause in SQL Server 2016.
Use Case 1: Formatting set of related rows as JSON array
Instead of joining related tables we can just attach related information as an array of records formatted as JSON array. We can select data from Person table, and add related email addresses as subquery formatted as JSON text:
SELECT Person.Person.BusinessEntityID, Person.Person.FirstName, Person.Person.LastName,
(SELECT Person.EmailAddress.EmailAddress
FROM Person.EmailAddress
WHERE Person.Person.BusinessEntityID = Person.EmailAddress.BusinessEntityID
FOR JSON PATH) AS Emails
FROM Person.Person
WHERE Person.Person.BusinessEntityID = 274
Since FOR JSON clause returns a single text field you can put it in any column:
ID |
FirstName |
LastName |
Emails |
274 |
Stephen |
Jiang |
[{"EmailAddress":"stephen0@adventure-works.com"}, {"EmailAddress":"stephen.jiang@outlook.com"}] |
In this case we are returning hybrid result set - relational data from Person table are returned as columns, while related information from Person.emailAddress table are returned as an array of JSON objects.
You might notice that FOR JSON returns an array of key:value pairs even if we might like to have something simpler e.g. plain array of values instead of array of objects. In this case we can write simple T-SQL user defined function that removes keys from the array and return plain array:
SELECT Person.Person.BusinessEntityID AS ID, Person.Person.FirstName, Person.Person.LastName,
dbo.ufnToRawJsonArray((SELECT Person.EmailAddress.EmailAddress
FROM Person.EmailAddress
WHERE Person.Person.BusinessEntityID = Person.EmailAddress.BusinessEntityID
FOR JSON PATH), 'EmailAddress') AS Emails
FROM Person.Person
WHERE Person.Person.BusinessEntityID = 274
Result of this query might look like:
ID |
FirstName |
LastName |
Emails |
274 |
Stephen |
Jiang |
["stephen0@adventure-works.com","stephen.jiang@outlook.com"] |
This is more readable format that array of key value pairs.
Use Case 2: Formatting related information from multiple child tables as JSON arrays
In the previous use case, I have returned content of one related child table as JSON array. We can also return related information both from emails and phone tables. Second set of related information would be returned as column level JSON expression:
SELECT Person.Person.BusinessEntityID AS ID, Person.Person.FirstName, Person.Person.LastName,
(SELECT Person.EmailAddress.EmailAddress
FROM Person.EmailAddress
WHERE Person.Person.BusinessEntityID = Person.EmailAddress.BusinessEntityID
FOR JSON PATH) AS Emails,
(SELECT Person.PersonPhone.PhoneNumber, Person.PhoneNumberType.Name AS PhoneNumberType
FROM Person.PersonPhone INNER JOIN
Person.PhoneNumberType ON Person.PersonPhone.PhoneNumberTypeID = Person.PhoneNumberType.PhoneNumberTypeID
WHERE Person.Person.BusinessEntityID = Person.PersonPhone.BusinessEntityID
FOR JSON PATH) AS Phones
FROM Person.Person
WHERE Person.Person.BusinessEntityID = 274
In this case, we have returned a single row for a particular person and all related information are returned as an array of JSON object in the single line.
ID |
FirstName |
LastName |
Emails |
Phones |
274 |
Stephen |
Jiang |
[ {"EmailAddress":"stephen0@adventure-works.com"}, {"EmailAddress":"stephen.jiang@outlook.com"} ] |
[ {"PhoneNumber":"112-555-6207","PhoneNumberType":"Work"}, {"PhoneNumber":"238-555-0197","PhoneNumberType":"Cell"}, {"PhoneNumber":"817-555-1797","PhoneNumberType":"Home"} ] |
You can easily unpack these arrays on the client side using some JSON deserializer such as JSON.NET. You might notice that we still have one array. The fact that we are adding new child items do not affects number of returned results.
Use Case 3: Formatting parent and child rows as JSON
Instead of hybrid result set, we can format all results as JSON (i.e. both parent Person columns and child rows). If you want to return all results as JSON there is even simpler syntax. FOR JSON AUTO will automatically indent related email addresses as it is shown in the following script:
SELECT Person.Person.BusinessEntityID, Person.Person.FirstName, Person.Person.LastName, Person.EmailAddress.EmailAddress
FROM Person.Person INNER JOIN
Person.EmailAddress ON Person.Person.BusinessEntityID = Person.EmailAddress.BusinessEntityID
WHERE Person.Person.BusinessEntityID = 274
FOR JSON AUTO
Results might look like:
[ { "ID":274,"FirstName":"Stephen","LastName":"Jiang", "Person.EmailAddress":[ {"EmailAddress":"stephen0@adventure-works.com"}, {"EmailAddress":"stephen.jiang@outlook.com"} ] } ] |
This format is pure JSON and it can be used as a response of JSON web service.
Use Case 4: Returning complex JSON hierarchies
Finally we can format entire result-set as JSON by using FOR JSON cause on the main query:
SELECT Person.Person.BusinessEntityID AS ID, Person.Person.FirstName, Person.Person.LastName,
(SELECT Person.EmailAddress.EmailAddress
FROM Person.EmailAddress
WHERE Person.Person.BusinessEntityID = Person.EmailAddress.BusinessEntityID
FOR JSON PATH) AS Emails,
(SELECT Person.PersonPhone.PhoneNumber, Person.PhoneNumberType.Name AS PhoneNumberType
FROM Person.PersonPhone INNER JOIN
Person.PhoneNumberType ON Person.PersonPhone.PhoneNumberTypeID = Person.PhoneNumberType.PhoneNumberTypeID
WHERE Person.Person.BusinessEntityID = Person.PersonPhone.BusinessEntityID
FOR JSON PATH) AS Phones
FROM Person.Person
WHERE Person.Person.BusinessEntityID = 274
FOR JSON PATH
Results will look like:
[ { "ID":274,"FirstName":"Stephen","LastName":"Jiang", "Emails":[ {"EmailAddress":"stephen0@adventure-works.com"}, {"EmailAddress":"stephen.jiang@outlook.com"}], "Phones":[ {"PhoneNumber":"112-555-6207","PhoneNumberType":"Work"}, {"PhoneNumber":"238-555-0197","PhoneNumberType":"Cell"}, {"PhoneNumber":"817-555-1797","PhoneNumberType":"Home"} ] } ] |
FOR JSON clause enables you to format complex results and return them in more convenient format to the client. Note that you will need SQL Server 2016 CTP2 or higher for these code examples.
Comments
Anonymous
October 12, 2015
In terms of performance will this perform better than reading the data and converting to JSON in C#. If this is not tested with lot of data I can work on it but if there are any metrics on this it would be useful to knowAnonymous
October 12, 2015
There are always corner cases where you can show that one approach is better than the another. If you need to join parent table (Person) with several 1:N child tables (e.g. emails, phones, pets, tags, etc.) you will multiply the same information about person in each row (see example where 274|Stephen|Jiang cells are repeated 6 times due to 2 emails and 3 phones). If you want to avoid this, you would need to open MARS connection where you will in one result set read person rows, and in another batch read phones, emails, pets using the separate queries? Would this be faster that returning everything as one row - I don't know , but this is easier especially if you need to return JSON as some AJAX response. I have compared performance of OData services implemented using WCF/EF and directly executed FOR JSON. In my case, WCF/EF was better for smaller/flat JSON responses, but FOR JSON was better for JSONs with some hierarchy. Again, in other cases it might be different.Anonymous
October 20, 2015
Hi, Is there any idea when it will be available to MS Azure SQL Server? TksAnonymous
September 28, 2016
The comment has been removed- Anonymous
October 11, 2016
We had the same problem recently. The solution was to first store the JSON string in an nvarchar(max) variable and then return the variable.
- Anonymous