Udostępnij za pośrednictwem


TSQL Tips: Writing more efficient DDL (Data Definition Language) with system functions

 

The internet is swimming with examples of really inefficient TSQL code. Just because everyone does it that way doesn’t make it optimal. I’ve already blogged about the issues you will find with NOLOCK hints, this post show that using system functions correctly can result in your Data Definition Language (DDL) running 3 times faster.

Example 1: Faster way to check existence of a table (object)

Common but Slowest

If you Right Click on a Table in SQL”s Management Studio you may enjoy the benefits of “Script Table As…”. Among other options it lets you create a script to “DROP AND CREATE”. It will generate a script similar to

 -- Common Style of DML Check, then Drop & Create a table.
IF NOT EXISTS ( SELECT * 
                FROM sys.objects
                WHERE object_id = OBJECT_ID(N'dbo.MyTable')
                AND type IN (N'U')
               ) 
CREATE TABLE dbo.MyTable(Rep_ID INT Null) 

For this discussion I’d draw your attention to the TSQL that searches for existence of the user table ie:

 -- Popular Approach
    SELECT * 
    FROM sys.objects 
    WHERE object_id = OBJECT_ID(N'dbo.MyTable') AND type IN (N'U')

Better use of System function & faster

A quick read of SQL’s Books Online will make you aware that the SQL Function OBJECT_ID() has a 2nd parameter that lets you specify the type of the object you are searching for. If you use it, you will find that your query runs twice as fast. Why? There was no need to test the condition on the Type column separately.

 -- Better: Use the 2nd Parameter instead of an AND   - Same results but neater
    SELECT * 
    FROM sys.objects 
    WHERE object_id = OBJECT_ID(N'dbo.MyTable', N'U')

You may have read that “SELECT *” was inefficient & you should always specify the columns you need. That is very true, in most circumstances. The exception to the rule is when the select is part of the “EXISTS” clause as it is here. In this case, the SQL Optimiser is smart enough to know that it only needs to know if a row is returned & doesn’t care about any of the columns in the select list. So performance in the following alternative is exactly the same. Use whatever you prefer, either alternative is slower than the last alternative.  

 -- Better: Use the 2nd Parameter instead of an AND   - Same speed as SELECT *
    SELECT 1 
    FROM sys.objects 
    WHERE object_id = OBJECT_ID(N'dbo.MyTable', N'U')

Fastest

As the OBJECT_ID() function can be used in all TSQL, there is no need to wrap it up in a SELECT statement. It returns NULL if the object is not found so just compare its output to NULL. e.g:

 -- Faster: just use the function & 2nd parameter - 3 Times faster
    IF OBJECT_ID(N'dbo.MyTable', N'U') IS NOT NULL
        CREATE TABLE dbo.MyTable(Rep_ID INT Null)
--Only save millisecs here but same applies with any IF EXISTS which could be way more expensive. (Use an index instead of table scan etc.)

Not only is this is 3 times faster. It means less code, so less chance of making errors. Most importantly it avoids the bad (& unsupported) practice if querying the system tables directly (in this case sys.objects).

In case you haven’t heard, you should be using the Informational Schema Views & System Functions instead of the system tables because Microsoft may change the system tables between releases which will break all your code. Yes I know “everyone” does it. Yes I am a hypocrite as many of my tuning tools are very vulnerable to System Table changes. But Microsoft was saying this from Version 1.1 (in 1990) they never changed & DBA’s got blasé about it. But 15 years later with the release of SQL 2005 we found sysobjects became sys.objects. While they did a great job of backward compatibility it does highlight the risk you take using System Tables in the heart of your application.

Post thought

You might wonder “why didn’t the SQL Server Management Studio doesn’t produce a better example of code. Surely they know how”. The tools team were likely to have a different goal to you. They were writing a context menu function that worked anywhere in the Object Explorer tree; Tables, Views, Functions Indexes etc. While many of these objects can be handled by OBJECT_ID(), some can’t. So it is more expedient to write a generic routine what worked everywhere than to produce optimal code samples with heaps of special case code. Especially when in this case the % improvement my be huge, but the actual time saved is still measured in millisecs, so who cares.

This blog is about the concept of system functions. And for some queries it makes a huge difference.

Example 2: Faster way find indexes on User Tables.

Perhaps you want to see all the Indexes in your system but don’t want the results cluttered with indexes on the System Catalog. Unfortunately when you query the Sys.Indexes table it doesn’t tell you if the Index is on a User Table, a System Table or an Indexed View. So you find yourself frequently joining on the Sys.Objects table & filtering with Type = N’U’ eg:

 -- 2 table join to filter on User Tables.
SELECT  OBJECT_SCHEMA_NAME(I.object_id) AS [Schema],
        OBJECT_NAME(I.object_id) AS [Table],
        I.* 
FROM sys.indexes AS I
JOIN sys.objects AS O ON O.object_id = I.object_id
WHERE O.type = 'U'

Often I’ve found that I needed to think about indexes on Indexes views too. So it was better just to filter out all system objects than to limit my query to User Tables. So the query changes to :-

 -- 2 table join to filter out System Objects.
SELECT  OBJECT_SCHEMA_NAME(I.object_id) AS [Schema],
        OBJECT_NAME(I.object_id) AS [Table],
        I.* 
FROM sys.indexes AS I
JOIN sys.objects AS O ON O.object_id = I.object_id
WHERE O.schema_id <> 4

But as I only wanted to filter the System Objects out of the Sys.Indexes Table I didn’t need the join with sys.objects at all. As I have the Object_ID I could use OBJECT_ID() or OBJECT_SCHEMA_NAME() to remove rows I didn’t like. eg:

 -- Single table scan, Filtered by System Function, nearly 4 times faster.
SELECT  OBJECT_SCHEMA_NAME(I.object_id) AS [Schema],
        OBJECT_NAME(I.object_id) AS [Table],
        I.* 
FROM sys.indexes AS I
WHERE OBJECT_SCHEMA_NAME(I.object_id) <> 'sys'
go

Its nearly 4 times faster, & less code to write. Of course if you still want to filter for User Tables the OBJECT_ID() function will work as I showed in Example 1.

This approach is especially handy when working with Dynamic Management Views which I’ll show in a future post.

Hope this encourages you to read up on some of the other SQL System functions.

Dave

 

Thought for the Post

OMG (Oh! My God) is an expression that seems to be everywhere these days. But I wonder if I could read Arabic, do they type OMA (Oh! My Allah) or is the term God sufficiently generic? Will we start seeing OMB for those who follow Buddha. OMJ (Oh! My Jehovah), OMI (Oh! My Infinity) for the Scientologists. etc

If someone says “Oh! My” are they an Atheist or just too lazy to mention the omnipotent super-being of their choice.

I assume that an Antagonist would type OYG (Oh! your God!) or perhaps OME (Oh! My Ex-god).

It was so much easier when people used four letter words to express themselves.

So if you ever find yourself texting someone who you can see from where you are sitting. Or you say “LOL” instead of actually laughing. Be aware that most of the people around you including those your own age, probably think you are a <insert expletive here>.

Comments

  • Anonymous
    December 05, 2010
    Thanks for the post especially for the mentioning about abbreviations! :)

  • Anonymous
    December 06, 2010
    Isn't your last code part of example 1 the wrong way around? That is, if the return is NOT NULL then the table already exists and your CREATE TABLE statement will fail? Or have you described the return the wrong way so that NOT NULL means it does exist? (doesn't seem likely).

  • Anonymous
    February 14, 2011
    Hi Dave, a very interesting post, and very relevant to the work I'm doing now. In my scenario, I'm automatically scripting all stored procedures, UDFs, triggers and views. I have currently been using the system tables, a la SSMS. I like your more concise example. Could you post how I could use the Object_ID function for stored procs, triggers, UDFs and views?

  • Anonymous
    February 17, 2011
    Nick: Well spotted. Clearly I fat fingered the code example which should of course read    IF OBJECT_ID(N'dbo.MyTable', N'U') IS NULL           CREATE TABLE dbo.MyTable(Rep_ID INT Null)    -- If the Object does NOT exist then Create it. Thanks Joshua        1. The 2nd parameter (ie: Object_type) is optional. I prefer to specify it, because it clearly indicates my intention possibly making my code easier to maintain. But if you really pushed I'd have a hard time justifying it. The sys.objects table is usually small, & I believe the most of the system tables are maintained in memory. So I have no evidence to say it is faster than leaving it blank.       2. Yes the OBJECT_ID() function works with all the letters listed in the TYPE column of the SYS.OBJECTS table. As described in the Books Online article ms-help://MS.SQLCC.v10/MS.SQLSVR.v10.en/s10de_6tsql/html/f89286db-440f-4218-a828-30881ce3077a.htm     3. Some examples: USE MSDB GO SELECT OBJECT_ID('sp_addtask','P') AS [A Stored Proc] SELECT OBJECT_ID('get_principal_id','FN') AS [A Scalar Valued Function] SELECT OBJECT_ID('sysjobs_view','V') AS [A View] SELECT OBJECT_ID('syspolicy_execution_trigger','TR') AS [A Trigger] Hope this helps