다음을 통해 공유


T-SQL: Splitting a String into multiple columns

Introduction

When you have an easy to use T-SQL script that separate data fields received from an application or external data source as a Web Service or a data file (TXT or .cvs) delimited by a tab character are very useful.

This article presents a scalar user-defined function to handle data appropriately for subsequent storage in one or more tables in your database. 

It's also possible to separate and isolate data from one same text column (for datatypes: char, nchar, varchar or nvarchar) stored in your database for different tasks and identify information, eg: identifying a book edition or other ISBN publications.

Creating Function

To create this function, we use separator character position in relation to string through the CHARINDEX and SUBSTRING methods.

This also helps to identify the value contained in a column that's specified by the user.

See image below

See below T-SQL script


CREATE FUNCTION dbo.UFN_SEPARATES_COLUMNS(
 @TEXT      varchar(8000)
,@COLUMN    tinyint
,@SEPARATOR char(1)
)RETURNS varchar(8000)
AS
  BEGIN
       DECLARE @POS_START  int = 1
       DECLARE @POS_END    int = CHARINDEX(@SEPARATOR, @TEXT, @POS_START)
 
       WHILE (@COLUMN >1 AND @POS_END> 0)
         BEGIN
             SET @POS_START = @POS_END + 1
             SET @POS_END = CHARINDEX(@SEPARATOR, @TEXT, @POS_START)
             SET @COLUMN = @COLUMN - 1
         END 
 
       IF @COLUMN > 1  SET @POS_START = LEN(@TEXT) + 1
       IF @POS_END = 0 SET @POS_END = LEN(@TEXT) + 1 
 
       RETURN SUBSTRING (@TEXT, @POS_START, @POS_END - @POS_START)
  END
GO

There are many ways to check if data is being properly collected on import data, verify data and split as needed by structure table.

It's not necessary to collect, separate and format all data that was received, thus become the whole process of getting data fast and clean, only what really matters to be stored.

See below two samples using this function:

  • Using data imported from a file, and;
  • Using data from a variable or column table

Importing Data

In this sample, we can identify data from a row separated by a character, but you get only values for fields as indicated in the table below:

 Position   Column Name   Description
 #1  CD_PERSON  Person Code
 #2  NM_MAIL   Email at Work
 #3  DT_CREATED  Date Creation on Table

See below for the sample. A variable with several data separated using "|" character.

See image below

See below sample where a row is separated into columns


DECLARE @IMPORTROW         varchar(500)
 
DECLARE @CD_PERSON         int
DECLARE @NM_MAIL           varchar(25)
DECLARE @DT_CREATED        date 
 
SET @IMPORTROW = '154198|2014-01-08|durval@test.com|Comments|123456|2015-10-30|'
 
SELECT
  @CD_PERSON = dbo.UFN_SEPARATES_COLUMNS(@IMPORTROW, 1, '|'),
  @DT_CREATED = CAST(dbo.UFN_SEPARATES_COLUMNS(@IMPORTROW, 2, '|') AS DATE),
  @NM_MAIL = dbo.UFN_SEPARATES_COLUMNS(@IMPORTROW, 3, '|');
 
--THROUGH THESE VARIABLE, YOU CAN CHANGE
--YOUR DATA AFTER BEING SEPARATED
SELECT @CD_PERSON AS CD_PERSON, @NM_MAIL AS NM_MAIL, FORMAT(@DT_CREATED, 'MM/dd/yyyy') AS DT_CREATED;
GO

Splitting a Data Structure

Another way to use this function is to get elements of a structure known from a variable or a column table.

Easy step to a complete understanding, you can use an ISBN code (International Standard Book Number) and separate its elements according to their documentation.

See below this sample separating elements of an ISBN code by "-" character.

See image below

See below sample separating elements of an ISBN code


--GET ELEMENTS OF "ISBN" STRUCTURE
DECLARE @ISBN varchar(20) = '978-0-571-08989-5'
SELECT
  dbo.UFN_SEPARATES_COLUMNS(@ISBN, 1, '-') AS PREFIX,
  dbo.UFN_SEPARATES_COLUMNS(@ISBN, 2, '-') AS REGISTRATION_GROUP,
  dbo.UFN_SEPARATES_COLUMNS(@ISBN, 3, '-') AS REGISTRANT,
  dbo.UFN_SEPARATES_COLUMNS(@ISBN, 4, '-') AS PUBLICATION,
  dbo.UFN_SEPARATES_COLUMNS(@ISBN, 5, '-') AS [CHECK]
GO

Conclusion

It's important quickly identify and convert data to improve conditions of data storage and also data view.

For a more generic format that's a code composed of letters and/or numbers, it's possible to split your content and then obtain more precise information about a data row.

References

See Also

Other Languages