REGEXP_SUBSTR (Transact-SQL)
Applies to:
Azure SQL Database
SQL database in Microsoft Fabric
Note
As a preview feature, the technology presented in this article is subject to Supplemental Terms of Use for Microsoft Azure Previews.
Returns one occurrence of a substring of a string that matches the regular expression pattern. If no match is found, it returns NULL.
REGEXP_SUBSTR
(
string_expression,
pattern_expression [, start [, occurrence [, flags [, group ] ] ] ]
)
Arguments
string_expression
An expression of a character string.
Can be a constant, variable, or column of character string.
Data types: char, nchar, varchar, or nvarchar.
pattern_expression
Regular expression pattern to match. Usually a text literal.
Data types: char, nchar, varchar, or nvarchar. pattern_expression supports a maximum length of 8,000 bytes.
start
Specifies the starting position for the search within string_expression. Optional. Type is int or bigint.
The numbering is 1-based: the first character in the expression is position 1, and the value must be >= 1. If the start expression is less than 1, the function returns an error. If the start expression is greater than the length of string_expression, the function returns NULL. The default is 1.
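As a minimal sketch of how start narrows the search, the following query uses a hypothetical string literal (not taken from this article). The search begins at character 7, so only text from that position onward is eligible to match.

```sql
-- Hypothetical input: 'abc123def456'. Character 7 is 'd', so the first
-- run of digits ('123', characters 4-6) lies before the search window.
SELECT REGEXP_SUBSTR('abc123def456', '\d+', 7) AS digits_from_position_7;
```

With start omitted, the same call would search from character 1.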
occurrence
An expression (positive integer) that specifies which occurrence of the pattern expression within the source string to search for. The default is 1, which finds the first occurrence, searching from the first character of string_expression. For a positive integer n, it searches for the nth occurrence: the search for the second occurrence begins at the first character following the first occurrence of pattern_expression, and so forth.
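A short sketch of the occurrence argument, again with a hypothetical literal: the query asks for the second run of digits instead of the first.

```sql
-- Hypothetical input with three runs of digits. start = 1, occurrence = 2
-- requests the second match of '\d+', counting from the start of the string.
SELECT REGEXP_SUBSTR('abc123def456ghi789', '\d+', 1, 2) AS second_number;
```

Because occurrence is a positional argument, start must be supplied (here as 1) before it.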
flags
One or more characters that specify the modifiers used for searching for matches. Type is varchar or char, with a maximum of 30 characters.
For example, ims. The default is c. If an empty string (' ') is provided, it is treated as the default value ('c'). Supply c or any combination of the supported flag characters. If flags contains multiple contradictory characters, SQL Server uses the last character.
For example, if you specify ic, the regex uses case-sensitive matching.
If the value contains a character other than those listed at Supported flag values, the query returns an error like the following example:
Invalid flag provided. '<invalid character>' are not valid flags. Only {c,i,s,m} flags are valid.
Supported flag values
Flag | Description |
---|---|
i | Case-insensitive (default false) |
m | Multi-line mode: ^ and $ match begin/end line in addition to begin/end text (default false) |
s | Let . match \n (default false) |
c | Case-sensitive (default true) |
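The effect of the flags argument can be sketched by running the same pattern with and without the i flag. The string literal here is hypothetical and chosen only to differ in letter case from the pattern.

```sql
-- The default flag is 'c' (case-sensitive), so 'world' is not expected to
-- match 'World' in the first column, and the function returns NULL when no
-- match is found. The second column passes 'i' to ignore case.
SELECT
    REGEXP_SUBSTR('Hello World', 'world') AS default_flags,
    REGEXP_SUBSTR('Hello World', 'world', 1, 1, 'i') AS case_insensitive;
```

Because flags is the fifth positional argument, start and occurrence are supplied with their default values (1, 1).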
group
Specifies which capture group (subexpression) of pattern_expression determines the position within string_expression to return. A capture group (subexpression) is a fragment of the pattern enclosed in parentheses, and capture groups can be nested. Capture groups are numbered in the order in which their left parentheses appear. The data type of group is an integer, and the value must be greater than or equal to 0 and must not be greater than the number of capture groups (subexpressions) in pattern_expression. The default group value is 0, which indicates that the position is based on the string that matches the entire pattern.
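To illustrate group numbering, the following sketch applies one pattern with three capture groups to a hypothetical date string and selects first the entire match (group 0) and then the second capture group.

```sql
-- Groups are numbered by the order of their left parentheses:
-- 1 = (\d{4}), 2 = the first (\d{2}), 3 = the second (\d{2}).
SELECT
    REGEXP_SUBSTR('2024-05-17', '(\d{4})-(\d{2})-(\d{2})', 1, 1, 'c', 0) AS whole_match,
    REGEXP_SUBSTR('2024-05-17', '(\d{4})-(\d{2})-(\d{2})', 1, 1, 'c', 2) AS second_group;
```

A group value greater than 3 would exceed the number of capture groups in this pattern, which the description above does not allow.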
Return value
String.
Examples
Extract the domain name from an email address.
SELECT REGEXP_SUBSTR (EMAIL, '@(.+)$', 1, 1, 'i', 1) AS DOMAIN FROM CUSTOMERS;
Find the first word in a sentence that starts with a vowel.
SELECT REGEXP_SUBSTR (COMMENT, '\b[aeiou]\w*', 1, 1, 'i') AS WORD FROM FEEDBACK;
Get the last four digits of a credit card number.
SELECT REGEXP_SUBSTR (CARD_NUMBER, '\d{4}$') AS LAST_FOUR FROM PAYMENTS;