Strip non numeric characters from string

RJ 206 Reputation points
2025-01-03T17:20:48.4066667+00:00

Hi there, Happy New Year!

I'm creating a new table (billions of records). I need to strip non numeric values from a column.

Input '.I82807- R%'
Output 82807

i created a function based on online however ITS SLOOOOOWWWWWW....

Wanted to check if there is any faster way to extract only numbers - row level operation.

CREATE FUNCTION [dbo].[StripNonNumerics]
(
  @Temp varchar(255)
)
RETURNS varchar(255)
AS
Begin

    Declare @KeepValues as varchar(50)
    Set @KeepValues = '%[^0-9]%'
    While PatIndex(@KeepValues, @Temp) > 0
        Set @Temp = Stuff(@Temp, PatIndex(@KeepValues, @Temp), 1, '')

    Return @Temp
End

Thanks,

Azure SQL Database
Transact-SQL
Transact-SQL
A Microsoft extension to the ANSI SQL language that includes procedural programming, local variables, and various support functions.
4,675 questions
{count} votes

3 answers

Sort by: Most helpful
  1. Bruce (SqlWork.com) 69,121 Reputation points
    2025-01-03T17:29:45.1+00:00

    It is probably updating billions of rows in a single set operation rather than the function itself. Often you can improve performance but updating subsets of rows in a driver loop. This allows the update to be done in memory. But as an update is a delete and insert with logging, it will not be quick.


  2. Viorel 118.9K Reputation points
    2025-01-03T17:35:00.8766667+00:00

    Check if the next combination of functions is also too slow:

    declare @sampleTable table ( input varchar(max))
    
    insert @sampleTable values
    ( '.I82807- R%' ),
    ( 'another test 123 456.' ),
    ( 'abc'),
    ( '1234567890')
    
    
    select input, replace( translate( input, t.s, replicate( '*', len( t.s))), '*', '') as [output]
    from @sampleTable
    cross apply ( values ( replace( translate( input, '0123456789', '**********'), '*', ''))) t(s)
    

  3. Erland Sommarskog 115.1K Reputation points MVP
    2025-01-03T22:16:59.53+00:00

    Here is another solution, using Viorel's test data. I'm forcing a binary collation to speed things up a little bit.

    declare @sampleTable table ( input varchar(max))
    
    insert @sampleTable values
    ( '.I82807- R%' ),
    ( 'another test 123 456.' ),
    ( 'abc'),
    ( '1234567890')
    
    
    ; WITH nospace AS (
       SELECT nospace = replace(input COLLATE Latin1_General_BIN2, ' ', '')
       FROM   @sampleTable
    ), stripfirst AS (
       SELECT nospace, stripfirst = reverse(substring(nospace, patindex('%[0-9]%', nospace), len(nospace)))
       FROM   nospace
    )
    SELECT nospace, stripfirst, reverse(substring(stripfirst, patindex('%[0-9]%', stripfirst), len(stripfirst)))
    FROM   stripfirst
    
    

Your answer

Answers can be marked as Accepted Answers by the question author, which helps users to know the answer solved the author's problem.