Jaa


Embracing SQLCLR and the XML Data Type

Introduction – If I Can Use These Things, You Can

Microsoft SQL Server 2005 has been out for about a year and a half now, but I bet many solution developers have not jumped to leverage its major new features. Two of these features are CLR (.NET Common Language Runtime) stored procedures, or SQLCLR, and the new XML data type. I recently delivered a solution that features SQLCLR as the primary “platform”, if you will, and an XML column as major data storage. I learned a fair amount while fielding this application, some of which I thought worthy of sharing.

Part One - CLR Integration in SQL Server 2005

First, why bother with CLR stored procedures? What can you do with them that you can’t do with Transact-SQL (T-SQL)? Is there a downside to SQLCLR?

Using classic T-SQL stored procedures, you probably can accomplish much of what you can with SQLCLR. But how readable and maintainable is the code? I certainly like working with C# much better than T-SQL. Unlike T-SQL, C# is a modern, full-featured, object-oriented language. Leveraging the huge .NET Framework is bound to simplify the coding required to deliver complex data processing. But T-SQL is said to execute faster for the sheer reading and writing of database data. So I use both: SQLCLR procedures for the complex logic, and T-SQL procedures invoked from the C# for raw data access. “Why Build Database Code with .NET?” is a useful article discussing the pros and cons of SQLCLR, as compared to T-SQL. And so are “Overview of CLR Integration” and “Using CLR Integration in SQL Server 2005”.

From a bigger perspective, the advent of SQLCLR offers an exciting new opportunity for simplifying application architectures, by eliminating the need to build some middle tier components. You can now build full-blown data-driven processing right next to the data. For example, I replaced a Windows service, running on a separate server, with a SQLCLR stored procedure on the SQL Server that the Windows service worked with. And given the fact that you can invoke your SQLCLR procs via “Native XML Web Services”, you can avoid developing and fielding XML Web Service components. Then the Native XML Web Services could support AJAX-style web user interfaces, with access via XMLHTTP, or even Internet Explorer v7 Native XMLHTTP!

Whew! I will give you a moment to ponder the possibilities. Got a little excited there and went astray. Back to business now.

Okay, so maybe I sold you on SQLCLR. What next? Make sure your Microsoft SQL Server is set for running and debugging CLR by executing this T-SQL against it:

exec sp_enable_sql_debug

GO

exec sp_configure 'clr enabled', '1'

GO

Reconfigure

GO

More about enabling CLR can be read at “Enabling CLR Integration”.

Then, assuming you have Visual Studio 2005 handy, start a new database project. File, New, Project, Project type Visual C#, Database. You will set your database connection to your SQL 2005 server. “How to: Create a SQL Server Project” explains more about getting started with Visual Studio database projects. That article also leads you to information for “How to: Create and Run a CLR SQL Server Stored Procedure”. And there is even a walkthrough for writing your first SQLCLR stored procedure at “Walkthrough: Creating a Stored Procedure in Managed Code”.

After you have written a SQLCLR stored procedure in Visual Studio, you build it using the familiar IDE menu selections of Build, Build. But, in order to run your stored procedure, you get to do something a little new: Build, Deploy. Deploy, builds your project, and then sends it to your SQL Server as an Assembly. At this point your stored procedure should be found in a list seen with SQL Server Management Studio. But it should appear with a slight difference: a little padlock on the treeview icon next to the procedure name. This little lock indicates that you cannot modify this stored procedure using SQL Server Management Studio, like you can with T-SQL procedures. To modify it you need to go back the database project in Visual Studio, change the code, then re-deploy.

So you can’t change the code there, but you sure can execute it just like a T-SQL procedure. The input parameters for the stored procedure are defined by the parameters on the method in your C# code. For example a C# method signature:

[Microsoft.SqlServer.Server.SqlProcedure]

public static void up_Decide(string DecisionId, string GoalName)

yields a stored procedure definition of:

CREATE PROCEDURE [dbo].[up_Decide]

            @DecisionId [nvarchar](4000),

            @GoalName [nvarchar](4000)

Now, the C# method parameters mapping to the stored procedure parameters is straightforward to follow. So you readily see how you can execute your SQLCLR stored procedure, giving it the inputs it needs. But what comes back from it?

Not much, unless you are careful! If your SQLCLR stored procedure needs to return data, you can’t just use the normal “return” C# syntax. To make your procedure return the customary record set, readable using the .NET SqlDataReader class, you have to employ some special syntax, such as:

SqlPipe sp = SqlContext.Pipe;

SqlDataRecord sdr = new SqlDataRecord(new SqlMetaData("FinalDecision", SqlDbType.NVarChar, 100));

sp.SendResultsStart(sdr);

sdr.SetString(0, ReturnThisDataPlease);

sp.SendResultsRow(sdr);

sp.SendResultsEnd();

The above sample code will return a single-column, single-row recordset containing whatever the “ReturnThisDataPlease” variable holds. It took me some determined digging to come up with that code the first time I had to! Other options for returning data from your SQLCLR code can be found in “CLR Stored Procedures”.

So you have coded your SQLCLR stored procedure, built it, deployed it, and executed it. And let’s say it runs cleanly, with no exceptions, and returns a record set. But what if the data in the recordset is not what is expected? Maybe you have a logic error in your code! <GASP> Time to debug your logic. I bet you would like to step through the code! But how?

Well, once again I will rely on some fine MSDN content to help address the topic of SQLCLR debugging. “How to: Enable CLR Debugging For a Connection” should get you started in fine fashion. Once CLR debugging is enabled, you should be able to right-click on the procedure name, listed within Visual Studio’s Server Explorer, and choose “Step Into Stored Procedure”. Doing this should start the Visual Studio debugger, then place your yellow execution point at the first executable line in your C# code. Then you can step through the code line by line, examining variables, adding watch variables, executing commands in the immediate window, and all of the other wonderful debugging goodness that Visual Studio has to offer. This surely is some new-age stored procedure development.

An alternative to launching the debugger from the Server Explorer is to add a test script to your Visual Studio project – launching the debugger from this test project. “How to: Debug a SQL CLR Stored Procedure” explains how to do that.


Part Two – The XML Data Type

At the risk of sounding quite nerdy, I am excited about the XML data type in Microsoft SQL Server 2005! I am excited because the XML data type provides freedom – freedom from data modeling.

Using XML fields, you can add and remove elements at will, on the fly, no questions asked. No more trying to imagine all of the data elements you will need beforehand, and carefully adding each column to a table definition. And sparse data is no problem at all! Each row need not contain the same elements.

Okay, there must be limits to this free thinking. There must be plenty of cause for using regular fixed table columns, and adding them when needed, and being happy about it.

But I don’t want to think about that right now. I want to be freeform! J Let us quickly see how far we can get with some basic usage of the XML data type.

Imagine you create a SQL Server table, to record weather observations, with an XML column, such as:

CREATE TABLE [dbo].[tblWeatherObservations](

            [RecordId] [uniqueidentifier] NOT NULL,

            [CreatedDate] [datetime] NULL,

            [Observations] [xml] NULL)

Inserting a row into this table might look like:

DECLARE @RecordId uniqueidentifier

SET @RecordId = NEWID()

INSERT INTO tblWeatherObservations

(RecordId, CreatedDate, Observations)

VALUES

(@RecordId, GETUTCDATE(), '<Observations/>')

Ending up with a row of data:

4D18DD6D-5475-40E1-95F7-358DD6752667 2007-03-01 06:31:10.923 <Observations />

Now let’s imagine you want to expand the data in the StateVariables column to record a current temperature reading of 30:

UPDATE tblWeatherObservations

SET Observations.modify('insert <Temperature>30</Temperature> as last into (/Observations)[1]')

WHERE RecordId = @RecordId

Ending up with a row of data:

4D18DD6D-5475-40E1-95F7-358DD6752667 2007-03-01 06:31:10.923 <Observations><Temperature>30</Temperature></Observations>

That Observations.modify syntax is some XML Data Modification Language (XML DML). More examples of XML DML.

Now let’s imagine we had several weather observations, which got inserted into the table:

9676B8A4-AFEC-4F6B-AF92-97B2B4802F52 2007-03-01 07:05:08.847 <Observations><Temperature>0</Temperature><WindSpeed>5</WindSpeed></Observations>

3A7ACE4A-AC5C-46F6-8A17-5C1228F873C8 2007-03-01 07:05:08.863 <Observations><Temperature>10</Temperature><WindSpeed>15</WindSpeed></Observations>

9DD914F0-EF38-468F-9305-59ACD1BDB34A 2007-03-01 07:05:08.863 <Observations><Temperature>20</Temperature><WindSpeed>5</WindSpeed></Observations>

BD41D564-7B89-4F06-80DC-14CDA06EE5FE 2007-03-01 07:05:08.863 <Observations><Temperature>30</Temperature><WindSpeed>15</WindSpeed></Observations>

D56FB9B4-5C78-44A3-ADA2-A5174BA9751D 2007-03-01 07:05:08.863 <Observations><Temperature>40</Temperature><WindSpeed>5</WindSpeed></Observations>

Well, that data looks interesting, but how can we select just the Temperature data, like you could with a standard table column, when it is buried within that XML? I’ll tell you how:

SELECT

Observations.value('(data(//Temperature))[1]', 'int') AS Temperature

FROM tblWeatherObservations

which returns:

Temperature

-----------

0

10

20

30

40

That Observations.value syntax uses the value() method of the xml Data Type, and XQuery data selection. More xml Data Type methods are discussed at “xml Data Type Methods”.

Going a little further, you can employ a WHERE clause:

SELECT

Observations.value('(data(//Temperature))[1]', 'int') AS Temperature

from tblWeatherObservations

WHERE Observations.value('(data(//WindSpeed))[1]', 'int') > 5

or even run an function to find the average Temperature:

SELECT

AVG(Observations.value('(data(//Temperature))[1]', 'float')) AS AvgTemperature

from tblWeatherObservations

In this article, I have shown loose, freeform, untyped XML. You might consider being not so freeform, and tightly control your XML data, and read “Typed vs. Untyped XML”. And you might explore applying indexes to assure your XML selections are speedy.

Conclusion

I hope I shed some light on these two topics. There is little reason not to find applications for SQLCLR and the XML data type. These two Microsoft SQL Server 2005 features have much to offer, and can drastically streamline application designs. Together they have opened up new horizons when I imagine software application solutions.

Comments

  • Anonymous
    March 07, 2007
    The comment has been removed
  • Anonymous
    March 07, 2007
    Your article, intended to be informative and fun natured avoided addressing the relative controversy around SQLCLR and the tug and pull of who owns the DB, i.e. the dev or the DBA.  While there is certainly a range, DBA roles in many orgs more often than not are in charge of or at least contribute heavily to the DB design, data integrity, and overall architecture which includes protecting the db from dev’s who may not know best practices regarding db coding and design.  Giving the dev’s the ability to code in a .net language directly on the db essentially by-passes the DBA and potentially makes for headaches with code that can only be altered inside of VS2005.  The worst being that it allows the dev to not have to worry about the best tools/methods for the job.  Don't get me wrong, I like dev's however I have met plenty that don't understand the efficiencies of set based queries, appropriate normalization and indexing. They also aren't the ones that get called in the middle of the night when the jobs are failing because a dev decided to deploy a new validation process to check each row one at a time with their fancy new .net business logic.  In a mature org, hopefully the testers would have caught that - but you know where I am going.  What does anyone else think, time for DBA's to learn C# or assert their authority :-P and mitigate the risk by containing .net code to very specialized edge cases?  Or both?