SOUNDEX (Transact-SQL)
Applies to: SQL Server Azure SQL Database Azure SQL Managed Instance Azure Synapse Analytics Analytics Platform System (PDW) SQL analytics endpoint in Microsoft Fabric Warehouse in Microsoft Fabric
Returns a four-character (SOUNDEX
) code to evaluate the similarity of two strings.
Transact-SQL syntax conventions
Syntax
SOUNDEX ( character_expression )
Arguments
character_expression
An alphanumeric expression of character data. character_expression can be a constant, variable, or column.
Return types
varchar
Remarks
SOUNDEX
converts an alphanumeric string to a four-character code that is based on how the string sounds when spoken in English. The first character of the code is the first character of character_expression, converted to upper case. The second through fourth characters of the code are numbers that represent the letters in the expression. The letters A
, E
, I
, O
, U
, H
, W
, and Y
are ignored unless they're the first letter of the string. Zeroes are added at the end if necessary to produce a four-character code. For more information about the SOUNDEX
code, see The Soundex Indexing System.
SOUNDEX
codes from different strings can be compared to see how similar the strings sound when spoken. The DIFFERENCE() function performs a SOUNDEX
on two strings, and returns an integer that represents how similar the SOUNDEX
codes are for those strings.
SOUNDEX
is collation sensitive. String functions can be nested.
SOUNDEX compatibility
In previous versions of SQL Server, the SOUNDEX
function applied a subset of the SOUNDEX
rules. Under database compatibility level 110 or higher, SQL Server applies a more complete set of the rules.
After upgrading to compatibility level 110 or higher, you might need to rebuild the indexes, heaps, or CHECK
constraints that use the SOUNDEX
function.
A heap that contains a persisted computed column defined with
SOUNDEX
can't be queried until the heap is rebuilt by running the following statement:ALTER TABLE <table> REBUILD;
CHECK
constraints defined withSOUNDEX
are disabled upon upgrade. To enable the constraint, run the following statement:ALTER TABLE <table> WITH CHECK CHECK CONSTRAINT ALL;
Indexes (including indexed views) that contain a persisted computed column defined with
SOUNDEX
can't be queried until the index is rebuilt by running the following statement:ALTER INDEX ALL ON <object> REBUILD;
Examples
A. Use SOUNDEX
The following example shows the SOUNDEX
function and the related DIFFERENCE
function. In the first example, the standard SOUNDEX
values are returned for all consonants. Returning the SOUNDEX
for Smith
and Smythe
returns the same SOUNDEX
result because all vowels, the letter y
, doubled letters, and the letter h
, aren't included.
SELECT SOUNDEX('Smith'),
SOUNDEX('Smythe');
Here's the result set. Valid for a Latin1_General
collation.
S530 S530
B. Use DIFFERENCE
The DIFFERENCE
function compares the difference of the SOUNDEX
pattern results. The following example shows two strings that differ only in vowels. The difference returned is 4
, the lowest possible difference.
SELECT DIFFERENCE('Smithers', 'Smythers');
GO
Here's the result set. Valid for a Latin1_General
collation.
4
In the following example, the strings differ in consonants; therefore, the difference returned is 2
, the greater difference.
SELECT DIFFERENCE('Anothers', 'Brothers');
GO
Here's the result set. Valid for a Latin1_General
collation.
2