Share via


T-SQL: Split String with a Twist

Introduction

 
String splitting is one of the most common problems in the Transact-SQL and programming in general. There are many blog posts discussing splitting a comma-delimited string in SQL. My favorites could be found at T-SQL Useful Links / String Split. Yet recent MSDN Transact-SQL forum post SQL-Split String brought an interesting twist to this problem and thus I want to discuss it in this article.

Problem Definition

The topic starter was kind enough to provide a table's DDL along with some data.

CREATE TABLE  #ParseString ( id INT, string VARCHAR(128) )
INSERT #ParseString VALUES (1,'A6782839 A1768983A3A6A362727323AD6A9E8BD-0D15-405C-AF10-9884BDE572E3A6133366')
INSERT #ParseString VALUES (2,'A6783834A2748983A3A4A362723313AD6A64302717-D354-48D4-9E20-ADC11EF59A1332222')

as well as the desired output

SELECT id,string,6782839 AS Storeid,1768983 AS ProductID,3 AS ProductTypeLookup,6 AS LineOfBusiness, 362727323 AS ManagerKey, 'D6A9E8BD-0D15-405C-AF10-9884BDE572E3A6133366' Tuid
FROM #ParseString
WHERE id=1
UNION
SELECT id,string,6783834 AS Storeid,2748983 AS ProductID,3 AS ProductTypeLookup,4 AS LineOfBusiness, 362723313  AS ManagerKey, 'D6A64302717-D354-48D4-9E20-ADC11EF59A1332222' Tuid
FROM #ParseString
WHERE id=2

The string needed to be split using 'A' as a splitting symbol into 6 parts. Last part also included A as part of the string.

Solution

The first idea that came to mind was to use any of the splitting functions available to split using 'A' character as a splitting character. However, I decided against that idea because of the last part where I would have to concatenate the rest of the string. Therefore I decided to use recursive common table expression to do the splitting that would include both the word and the rest of the string. After that I used CASE based pivot idea to get the desired result:

;
 
WITH cte
AS (
    SELECT id
        ,string
        ,substring(string, 2, charindex('A', substring(string, 2, len(string) - 1)) - 1) AS  Word
        ,substring(string, charindex('A', substring(string, 2, len(string) - 1)) + 2, len(string)) AS  Rest
        ,0 AS  [Level]
    FROM #ParseString
     
    UNION ALL
     
    SELECT id
        ,string
        ,substring(Rest, 1, charindex('A', rest) - 1) AS  word
        ,substring(Rest, charindex('A', rest) + 1, len(rest)) AS Rest
        ,[Level] + 1
    FROM cte
    WHERE Rest LIKE  '%A%'
    )
SELECT id
    ,string
    ,max(CASE
            WHEN [Level] = 0
                THEN Word
            END) AS  StoreID
    ,max(CASE
            WHEN [Level] = 1
                THEN Word
            END) AS  ProductID
    ,max(CASE
            WHEN [Level] = 2
                THEN Word
            END) AS  ProductTypeLookup
    ,max(CASE
            WHEN [Level] = 3
                THEN Word
            END) AS  LineOfBusiness
    ,max(CASE
            WHEN [Level] = 4
                THEN Word
            END) AS  ManagerKey
    ,max(CASE
            WHEN [Level] = 4
                THEN Rest
            END) AS  tUID
FROM cte
GROUP BY  id
    ,String

which produces our desired result:

 

 In this solution we are using CHARINDEX and SUBSTRING functions continuously to get each word and the rest of the string. That rest of the string is used for the tUID column.

Conclusion

As we saw in that problem, sometimes it makes sense to apply recursive solution directly rather than using the existing splitting solution with concatenation at the end. It will be interesting to hear from the topic starter which of the suggested solutions performed better.
 

See Also

This article participated in the TechNet Guru Contributions for March, 2014 and won Bronze medal.