python to extract/manipulate text data DIRECTLY with MS SQL

Thai Huynh 21 Reputation points
2021-01-12T21:50:45.667+00:00

Currently I use basic SQL's string functions (PATINDEX and SUSTR) to extract text data.

From what I understand, MS SQL does not have full implementation of regex which makes it very difficult to extract data from string.

Some people suggested Python because it's more suited for text string manipulation.

So my question has two parts:

  1. Is there SQL capabilities that I am not aware of? I would prefer to stay native with SQL.
  2. Is there SQL capabilities that I am not aware of? I would prefer to stay native with SQL.
    How easy or difficult is it to go with Python route? And can I read/write directly to SQL?
SQL Server
SQL Server
A family of Microsoft relational database management and analysis systems for e-commerce, line-of-business, and data warehousing solutions.
14,490 questions
0 comments No comments
{count} votes

Accepted answer
  1. Erland Sommarskog 117.4K Reputation points MVP
    2021-01-12T22:33:01.597+00:00

    If you are on SQL 2017, you can use Python from within SQL Server. It is an optional install, though, so you may not have it. But you can re-run Setup to add it. Once you have it place, you need to run

    sp_configure 'external scripts enabled', 1
    RECONFIGURE
    

    Here is a sample script where I use Python to remove all non-digits from a script:

    USE tempdb
    go
    CREATE TABLE testtabell(id       int           NOT NULL PRIMARY KEY, 
                            litetext varchar(20)   NOT NULL)
    
    INSERT testtabell(id, litetext)
       VALUES(1, 'ABC7890DEF'), (2, '123 00 00'), (3, 'Ingasiffror'), (4, '1234567890')
    
    EXEC sp_execute_external_script @language = N'Python', 
         @script = N'import re, pandas
    Ret = InputDataSet
    Ret["litetext"] = pandas.Series([re.sub("[^0-9]", "", i) for i in Ret["litetext"]])
    ', 
         @input_data_1 = N'SELECT id, litetext FROM testtabell', 
         @output_data_1_name = N'Ret'
         WITH RESULT SETS ((id       int, 
                            litetext varchar(10)))
    go
    DROP TABLE testtabell
    
    0 comments No comments

0 additional answers

Sort by: Most helpful

Your answer

Answers can be marked as Accepted Answers by the question author, which helps users to know the answer solved the author's problem.