다음을 통해 공유


.NET: Defensive data programming (Part 2)

Series

Introduction

This article will walk through the process required to prevent inserting duplicate records or update a record. A simple example would be a country table having one record per country. There would be no reason to have a country name entered more than once.

One culprit for adding a duplicate record is inserting directly into a table from SQL-Server Management Studio while another would be providing an administrator the ability to add new records without any form of assertion in an application.

When adding a new record in SQL-Server Management Studio the best way to prevent duplicate records is to create a unique constraint to prevent a duplicate record. The constraint can be used in an application also which will be explored done incorrectly and done correctly.

Incorrectly handling constraint violations with INSERT

Many developers will place a try/catch surrounding the operation which would be to create a connection and command object suitable for inserting a new record and returning false from a method if an exception is thrown. Doing so will still increment the sequence e.g. if the last primary key value was 2 and the insert failed upon the next successful insert the new primary key value would not be 3 but instead 3.

Other developers will think that the best course of action is to write a lot of code to determine why the exception happened even though the message of the thrown exception indicates exactly what the problem is. Their code may follow the pattern below to get information back to the user attempting to add a new record.

C# Example

public bool  InsertCountry(string  pCountryName, ref  int pIdentifier, ref string  pError)
{
    using (var cn = new SqlConnection() {ConnectionString = ConnectionString})
    {
        using (var cmd = new SqlCommand() {Connection = cn})
        {
            var insertStatement = "INSERT INTO dbo.Country (Name)  VALUES (@Name);" +
                                    "SELECT CAST(scope_identity() AS int);";
 
            try
            {
                cmd.CommandText = insertStatement;
                cmd.Parameters.AddWithValue("@Name", pCountryName);
 
                cn.Open();
 
                pIdentifier = Convert.ToInt32(cmd.ExecuteScalar());
                return true;
            }
            catch (SqlException ex)
            {
                string message = null;
                string tableName = "";
                string indexName = "";
 
                /*
                    * We already know the value but if you want to get
                    * into some regx this shows how to parse the value.
                    */
                DuplicateCountryValue = Regex.Match(ex.Message, "\\(([^)]*)\\)").Groups[1].Value;
 
                /*
                    * Get the table name 'country' which we have in the INSERT INTO
                    */
                var match = Regex.Match(ex.Message, @"'([^']*)");
                if (match.Success)
                {
                    tableName = match.Groups[1].Value;
                }
 
 
                if (ex.Number == 2601)
                {
 
                    pError = $"Can not add '{DuplicateCountryValue}' into '{tableName}' since it already exists.";
                    // if you needed the index involved with the error
                    indexName = GetIndexKeys(cmd, ex.Message, "Country");
                }
 
                mHasException = true;
                mLastException = ex;
 
                return false;
            }
        }
    }
}

VB.NET Example

Public Function  InsertCountry(ByVal  pCountryName As  String, ByRef pIdentifier As Integer, ByRef  pError As  String) As Boolean
    Using cn = New  SqlConnection() With  {.ConnectionString = ConnectionString}
        Using cmd = New  SqlCommand() With  {.Connection = cn}
            Dim insertStatement = "INSERT INTO dbo.Country (Name)  VALUES (@Name);" & "SELECT CAST(scope_identity() AS int);"
 
            Try
                cmd.CommandText = insertStatement
                cmd.Parameters.AddWithValue("@Name", pCountryName)
 
                cn.Open()
 
                pIdentifier = Convert.ToInt32(cmd.ExecuteScalar())
                Return True
            Catch ex As SqlException
                Dim message As String  = Nothing
                Dim tableName As String  = ""
                Dim indexName As String  = ""
 
                '                        
                ' * We already know the value but if you want to get
                ' * into some regx this shows how to parse the value.
                '                         
                DuplicateCountryValue = Regex.Match(ex.Message, "\(([^)]*)\)").Groups(1).Value
 
                '                        
                ' * Get the table name 'country' which we have in the INSERT INTO
                '                         
                Dim match = Regex.Match(ex.Message, "'([^']*)")
                If match.Success Then
                    tableName = match.Groups(1).Value
                End If
 
 
                If ex.Number = 2601 Then
 
                    pError = $"Can not add '{DuplicateCountryValue}' into '{tableName}' since it already exists."
                    ' if you needed the index involved with the error
                    indexName = GetIndexKeys(cmd, ex.Message, "Country")
                End If
 
                mHasException = True
                mLastException = ex
 
                Return False
            End Try
        End Using
    End Using
End Function

Another exception to watch for is:

Violation of UNIQUE KEY constraint 'UQ_Person1FirstLastName'. Cannot insert duplicate key in object 'dbo.Persons1'. The duplicate key value is (Karen, Payne).
The statement has been terminated.

A developer will tend to wrap this into a Try/Catch and show the error message to the user.  While another developer would go to extremes to figure out what want wrong as shown here.

C# example

public bool  Update(string  pFirstName, string  pLastName, int  pIdentifier)
{
    using (var cn = new SqlConnection() { ConnectionString = SqlConnectionString })
    {
        var statement = "UPDATE dbo.Persons1 SET FirstName = @FirstName,LastName = @LastName  WHERE id = @Id";
        using (var cmd = new SqlCommand() { Connection = cn, CommandText = statement })
        {
 
            cmd.Parameters.AddWithValue("@FirstName", pFirstName);
            cmd.Parameters.AddWithValue("@LastName", pLastName);
            cmd.Parameters.AddWithValue("@id", pIdentifier);
 
            try
            {
                cn.Open();
                cmd.ExecuteNonQuery();
                return true;
 
            }
            catch (SqlException ex)
            {
                string message = null;
                int pos = 0;
 
                ConstraintValue = Regex.Match(ex.Message, "\\(([^)]*)\\)").Groups[1].Value;
 
                pos = ex.Message.IndexOf(".", StringComparison.Ordinal);
                message = ex.Message.Substring(0, pos);
 
                if (message.Contains("Violation of UNIQUE KEY constraint"))
                {
                    message = message.Replace("Violation of UNIQUE KEY constraint", "").Replace("'", "");
                    pos = message.LastIndexOf("_", StringComparison.Ordinal) + 1;
                    ConstraintColumnName = message.Substring(pos);
                }
 
                HasException = true;
                Exception = ex;
                return false;
            }
        }
    }
}

Summary of issues

There are several things wrong with this approach, first off the code presented returns the table name and values which violated the constraint and secondly this code overly complicates the method with no true value in return. With all this code on failure, the auto-incrementing sequence is advanced as done in the first idea to wrap the operation in a simple try-catch.

The correct method to handle constraint violations

The proper way to handle preventing duplications against a table constraint is to simply check if the constraint may be validated prior that a brute force insert by querying the table e.g.

SELECT 1 FROM dbo.Country WHERE Name  = @Name

If the user entered 'Brazil' and 'Brazil' already existed in table 1 is the returned value and if the country name does not exist the return value is null. If null is returned then proceed with the INSERT, otherwise report the issue back to the caller which may be a button click. If there is a duplicate and recognized as one the INSERT statement is never invoked thus the primary key sequence is not incremented.

C# Example

/// <summary>
/// Insert new record the right way by first determing if the country name
/// is present in the database.
/// </summary>
/// <param name="pCountryName">Country name to insert</param>
/// <param name="pIdentifier">New primary key</param>
/// <param name="pError">Error message on failure</param>
/// <returns>Success</returns>
public bool  InsertCountry1(string pCountryName, ref int  pIdentifier, ref  string pError)
{
    using (var cn = new SqlConnection() { ConnectionString = ConnectionString })
    {
        using (var cmd = new SqlCommand() { Connection = cn })
        {
            var selectStatement = "SELECT 1 FROM dbo.Country WHERE Name = @Name";
 
            var insertStatement = "INSERT INTO dbo.Country (Name)  VALUES (@Name);" +
                                    "SELECT CAST(scope_identity() AS int);";
 
            try
            {
                cmd.CommandText = selectStatement;
                cmd.Parameters.AddWithValue("@Name", pCountryName);
                cn.Open();
 
                if (cmd.ExecuteScalar() != null)
                {
                    pError = $"Country '{pCountryName}' already in table";
                    mHasException = false;
                    return false;
                }
 
                cmd.CommandText = insertStatement;
 
 
                pIdentifier = Convert.ToInt32(cmd.ExecuteScalar());
                return true;
            }
            catch (Exception ex)
            {
                mHasException = true;
                mLastException = ex;
 
                return false;
            }
        }
    }
}

VB.NET example

''' <summary>
''' Insert new record the right way by first determing if the country name
''' is present in the database and will not increment the primary key sequence.
''' </summary>
''' <param name="pCountryName">Country name to insert</param>
''' <param name="pIdentifier">New primary key</param>
''' <param name="pError">Error message on failure</param>
''' <returns>Success</returns>
Public Function  InsertCountry1(ByVal  pCountryName As  String, ByRef pIdentifier As Integer, ByRef  pError As  String) As Boolean
    Using cn = New  SqlConnection() With  {.ConnectionString = ConnectionString}
        Using cmd = New  SqlCommand() With  {.Connection = cn}
            Dim selectStatement = "SELECT 1 FROM ForumExample.dbo.Country WHERE Name = @Name"
 
            Dim insertStatement = "INSERT INTO dbo.Country (Name)  VALUES (@Name);" & "SELECT CAST(scope_identity() AS int);"
 
            Try
                cmd.CommandText = selectStatement
                cmd.Parameters.AddWithValue("@Name", pCountryName)
                cn.Open()
 
                If cmd.ExecuteScalar() IsNot Nothing Then
                    pError = $"Country '{pCountryName}' already in table"
                    mHasException = False
                    Return False
                End If
 
                cmd.CommandText = insertStatement
 
 
                pIdentifier = Convert.ToInt32(cmd.ExecuteScalar())
 
                Return True
 
            Catch ex As Exception
                mHasException = True
                mLastException = ex
 
                Return False
            End Try
        End Using
    End Using
End Function

Incorrectly handling constraint violations with UPDATE

What applies for inserting records applies for updating records too but generally will be with multiple columns for the constraint such as first and last name. In the script below for storing data form people a constraint UQ_Person1FirstLastName enforces against a record having first and last name that already exists in the table Person1.

USE [ForumExample]
GO
SET ANSI_NULLS ON
GO
 
SET QUOTED_IDENTIFIER ON
GO
 
CREATE TABLE  [dbo].[Persons1](
    [id] [INT] NOT NULL,
    [FirstName] [NVARCHAR](50) NULL,
    [LastName] [NVARCHAR](50) NULL,
    [GenderIdentifier] [INT] NULL,
    [IsDeleted] [BIT] NULL,
 CONSTRAINT [UQ_Person1FirstLastName] UNIQUE NONCLUSTERED 
(
    [FirstName] ASC,
    [LastName] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, 
IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON  [PRIMARY]
) ON  [PRIMARY]

Rather than show incorrect methods to handle constraint violations (they are included in the source code) let's look at the proper method to handle constraint issues which use the exact same pattern as done with INSERT.

C# example

/// <summary>
/// Update person the right way by first determing if FirstName and LastName
/// will not produce a duplicate record or increment the next primary key sequence.
/// </summary>
/// <param name="pFirstName">First name to update</param>
/// <param name="pLastName">Last name to update</param>        
/// <param name="pIdentifier">Identifying key for person</param>
/// <returns>Success</returns>
public bool  PersonUpdate1(string  pFirstName, string  pLastName, int  pIdentifier)
{
 
    using (var cn = new SqlConnection() { ConnectionString = ConnectionString })
    {
        // see note 1 in information.txt
        var statement = "SELECT 1 FROM dbo.Persons1 AS p WHERE p.FirstName = @FirstName AND p.LastName = @LastName ";
 
        using (var cmd = new SqlCommand() { Connection = cn, CommandText = statement })
        {
 
            cmd.Parameters.AddWithValue("@FirstName", pFirstName);
            cmd.Parameters.AddWithValue("@LastName", pLastName);
 
 
            try
            {
 
                cn.Open();
 
                if (cmd.ExecuteScalar() == null) 
                {
                    cmd.Parameters.AddWithValue("@id", pIdentifier);
                    cmd.ExecuteNonQuery();
                    return true;
                }
                else
                {
                    return false;
                }
 
                         
 
            }
            catch (SqlException ex)
            {
 
                //
                // Proposed values for update causing the exception
                //
                ConstraintValue = Regex.Match(ex.Message, "\\(([^)]*)\\)").Groups[1].Value;
 
                /*
                    * See note 2 in Information.txt
                    */
                if (ex.Number == 2601)
                {
                    ConstraintColumns = GetIndexKeys(cmd, ex.Message,"Persons1");
                }
 
                mHasException = true;
                mLastException = ex;
 
                return false;
 
            }
            catch (Exception ex)
            {
 
                mHasException = true;
                mLastException = ex;
 
                return false;
            }
        }
    }
}

VB.NET example

''' <summary>
''' Update person the right way by first determing if FirstName and LastName
''' will not produce a duplicate record or increment the next primary key sequence.
''' </summary>
''' <param name="pFirstName">First name to update</param>
''' <param name="pLastName">Last name to update</param>
''' <param name="pIdentifier">Identifying key for person</param>
''' <returns></returns>
Public Function  Update1(pFirstName As String, pLastName As  String, pIdentifier As Integer) As  Boolean
 
    Using cn As  New SqlConnection() With {.ConnectionString = ConnectionString}
 
        Dim statement = "SELECT 1 FROM  dbo.Persons1 AS p WHERE  p.FirstName = @FirstName AND p.LastName = @LastName "
 
        Using cmd As  New SqlCommand() With {.Connection = cn, .CommandText = statement}
 
            cmd.Parameters.AddWithValue("@FirstName", pFirstName)
            cmd.Parameters.AddWithValue("@LastName", pLastName)
 
            Try
 
                cn.Open()
 
                If cmd.ExecuteScalar() Is Nothing  Then
                    cmd.Parameters.AddWithValue("@id", pIdentifier)
                    cmd.ExecuteNonQuery()
                    Return True
                Else
                    Return False
                End If
 
            Catch ex As SqlException
 
                '
                ' Proposed values for update causing the exception
                '
                ConstraintValue = Regex.Match(ex.Message, "\(([^)]*)\)").Groups(1).Value
 
 
                If ex.Number = 2601 Then
                    ConstraintColumns = GetIndexKeys(cmd, ex.Message, "Persons1")
                End If
 
                mHasException = True
                mLastException = ex
 
                Return False
 
            Catch ex As Exception
 
                mHasException = True
                mLastException = ex
 
                Return False
            End Try
        End Using
    End Using
End Function

Error Number

Note that in the examples above there is a check for ex.Number - 2601, where does this comes from? The error number comes from master..sysmessages.

SELECT  error ,
        description
FROM    master..sysmessages
WHERE   msglangid = 1033 /* eng */
        AND description LIKE '%insert%duplicate%key%'
ORDER BY  error;

Results from the query

error       description


2601        Cannot insert duplicate key row in object '%.*ls' with unique index '%.*ls'. The duplicate key value is %ls.
2627        Violation of %ls constraint '%.*ls'. Cannot insert duplicate key in object '%.*ls'. The duplicate key value is %ls.

Summary

To implement constraints is easy while there are different views on how to implement assertion within an application. Usually when there is a great deal of code to handle this type of assertion the developer does not have a clear understanding of SQL capabilities and/or rather write lots of code just because they can. This thought process is done in an environment with a single developer or a team of developers will eventually bite them sooner or later.

When working with constraints or any type of assertions with SQL it’s best to first examine all possibilities to best resolve the problem. In this article, you are now equipped with a sound pattern to handle prevention of duplicate records with no side effects. 

Source code

Following this link to GitHub project repo.

Building the solution

  • Make sure SQL-Server, Express edition or better is installed.
  • Run SqlContraintViolations/DataScripts/scriptCurrent.sql
  • Change the property DatabaseServer in BaseServerConnections class.

See also

Unique Constraints and Check Constraints  
SQL-Server Express Edition