Share via


How to Extract Data in XML to Meet the Requirements of a Schema

Introduction

This article is based on a question posted on the TechNet Forum Brazil for SQL Server - XML - Mode EXPLICIT, CDATA and will provide you a solution to a common problem; the formatting of a data query in T-SQL into XML that adequately meets the conditions of a XML Schema (XSD) or a Document Type Definition (DTD)

This is one of the possible solutions related to this problem. If you know other options in T-SQL that meet the needs of this problem, feel free to add your content to this article.

Problem

During my reading of the threads in the SQL forum, I found the following question that was discussed. 

The question was: "I'm trying to generate XML using EXPLICIT mode because I need to use CDATA in some fields. The problem is that an XML Schema requires that my XML have some AttributeName, such as "g: AttributeName". And WITH XMLNAMESPACES is not compatible with the EXPLICIT mode of T-SQL."

It is clear that the person who asked the question, even with some difficulty to ask, explains that his need is to get the XML data in the following format:

  • All contents of the structure must belong to the namespace "g"
  • Each record in the XML must be under the "item" tag
  • The root tag must remain as "xml"

The XML expected by the poster should result in something similar to this content:


<?xml version="1.0" encoding="utf-8"?>
<xml>
  <item>
    <g:id>1</g:id>
    <g:Name>test 1</g:Name>
  </item>
  <item>
    <g:id>2</g:id>
    <g:Name>test 2</g:Name>
  </item>
</xml>


Causes

Typically, the use of XML Schema or DTD aims to verify and validate the change and/or receiving information between two companies or departments that use different systems platforms. All these validation criteria are to maintain the data integrity between the system supplier and the receiver system.

This also occurs in environments with similar platforms, but to less extent. This need for data integration between companies is very old. Even different departments/branches need to ensure that their shared data is always updated. Today, SQL Server 2012 has the resources to handle this kinds of data processing that we will present, but these same features can be obtained with greater depth through BizTalk Server.

Diagnostic Steps

Once you diagnose the cause of the problem, we go to their proper resolution. There may be other solutions as an alternative, but the one indicated at the end of this article answers the question posted in the Forum in the simplest and most practical way possible.

Building the Scenario of the Problem

So that we can accurately simulate the problem and propose its solution, we build a table with little data, but similar to the situation shown in Threads Forum (Figure 1):


CREATE TABLE dbo.test (
Id INT IDENTITY,
CD_Product INT,
NM_Product VARCHAR(50)
)
GO

INSERT INTO dbo.test ( CD_Product , NM_Product )
VALUES ( 1,'test 1'),(2,'test 2')
GO



Figure 1 - Creating the table to demonstrate the solution

Solution

To structure the solution of the problem, one must be clear about all conditions of the XML Schema proponent of the question, even though it has not been submitted. 

Despite the proponent of the question to be trying to get the desired XML format via a query T-SQL in the EXPLICIT mode, this mode does not allow the condition of the XML Schema to the predecessor of the "g" namespace in this way we will be presenting the solution with T-SQL using the RAW mode query.

So to set the precedence of the "g" namespace URI, we set the table fields with this predecessor using the standard XML Schema separator character.

Each line must have a tag called "item", set in RAW mode, so the new name for the "row" tag should be "item".

To complete all the requirements stipulated by the proponent of the question, we define the ROOT function that the root of the all XML has the defined name "xml" tag.

The code of the proposed solution is the following:


WITH XMLNAMESPACES ('uri' as g)

    SELECT 

    CD_Product as 'g:ID', 
    NM_Product as 'g:NAME' 
  FROM dbo.test 

FOR XML RAW('item'), ROOT('xml'), ELEMENTS XSINIL


The result is displayed as expected by the person asking the question (Figure 2):


Figure 2 - XML Structured as defined in XML Schema

Additional Information

If you want to know how to consume and validate the contents of an  XML through  XSD or DTD, using the in VB.Net or C # programming language,
I recommend reading of Knowledge Bases(KB): 315533 and 318504.

Credits

This article was inspired by writing articles:

Thanks SandroNaomi, and Peter for the constant guidance in your articles. This motivated me to create this article!

To strengthen your knowledge about XMLXSD and DTD, I recommend reading of the following:


References

Read some advanced Libraries:


TechNet Library

Read some advanced Topics:


See Also

Read some advanced Topics:

Another important place to find a huge amount of articles related to XML and T-SQL is the same TechNet Wiki. 


Other Languages

This article can also be found in the following languages​​:       


This article was awarded the silver medal in the TechNet Guru of December/2013