T-SQL: Random String
Introduction
In this article we are going to show several logistics on how to build a random string. This is very useful for maintenance tasks like testing (Populate large tables with random values), generate random password and so on...
If you have any other way of doing it, then you are most welcome to edit this article and give us your insight :-)
Solutions
Let's examine several solutions. Those solutions came from forum's users, and we will try to put them into perspective of advantages & disadvantages. We will close the article with conclusions and recommendations. If you are just looking for the best solution then you can jump to the end.
1. Using NEWID as base string & NEWID to generate a random length
Basic Idea
- Create random string using the function NEWID (), this will give us a random 36 characters string.
- Create a random number using the function NEWID, as the string length.
- Cut the string using the function LEFT
Code
DECLARE @StringMaxLen int = 12
SELECT TOP (1000)
LEFT (CAST (NEWID () AS NVARCHAR(MAX)) , ABS (CHECKSUM (NEWID ())) % @StringMaxLen + 1)
FROM SYS.OBJECTS A
CROSS JOIN SYS.OBJECTS B
Advantages & Disadvantages
- A: Very fast executing
- A: Very fast writing the code
- A: No need to create UDF (if someone care about this)
- D: NOT a Random solution!
- The converting of NEWID to NVARCHAR generate a string with a specific format: XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX.
There for we are going to get in the string always the same format. For example Ninth chain character will always be a dash...
- The converting of NEWID to NVARCHAR generate a string with a specific format: XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX.
- D: NOT the same odds!
- As explained above, we get more chance of getting a dash sign.
- D: Limited to a maximum length of 36 characters!
- We can't chose length more the the NEWID length (36 characters), and we probably want to build 4000 characters sometimes.
- Theoretically, we can always join chains in order to get the length we need. This will fix the problem of "limited maximum length", but at the same time will create a problem of Inflexible, as we will need to build the query according to the maximum length.
- D: Limit of characters that can get in results!
- We can get only characters that might be in NEWID value. We can't get any other character! The only characters that we can get are: uppercase English, numbers, and the sign "-"
2. Using Clean NEWID as base string & NEWID to generate a random length.
Basic Idea
- Create random string using the function NEWID, this will give us a random 36 characters string.
- Clean the dash character, this will give us a random 32 characters string.
- Create a random number using the function NEWID, as the string length.
- Cut the string using the function LEFT
Code
DECLARE @StringMaxLen int = 12
SELECT TOP (1000)
LEFT (REPLACE(CAST (NEWID () AS NVARCHAR(MAX)),'-','') , ABS (CHECKSUM (NEWID ())) % @StringMaxLen + 1)
FROM SYS.OBJECTS A
CROSS JOIN SYS.OBJECTS B
GO
Advantages & Disadvantages
- advantages same as before.
- disadvantage same as before basically, except the dash character problem but the max length is now only 32 characters.
3. Performing data manipulation on an existing data
Basic Idea
We can use an existing data, which is not random as a base string. Then we use text manipulation like "data scrambling", "data parsing", "random sorting" and so on, in order to get a "look like random data".
* This idea can be improved significantly scale by using an existing random data table!
Code
;WITH cte_1 AS(
SELECT ROW_NUMBER() OVER (ORDER BY NEWID() ASC) AS RN, t.name
FROM sys.tables AS t
CROSS JOIN sys.tables AS tt
),
cte_2 AS(
SELECT ROW_NUMBER() OVER (ORDER BY NEWID() ASC) AS RN, t.name
FROM sys.columns AS t
CROSS JOIN sys.columns AS tt
)
SELECT
cte_1.name + cte_2.name AS RandomString1,
REPLICATE(cte_1.name + cte_2.name,CASE WHEN ABS (CHECKSUM (NEWID ())) % 4 = 0 THEN 1 ELSE ABS (CHECKSUM (NEWID ())) % 4 + 1 END) AS RandomString2
FROM cte_1
INNER JOIN cte_2
ON cte_1.RN = cte_2.RN
In the example above we just used the tables name in the system as a base strings for manipulation. This is only an example as this idea (using existing data) can be done in any way and on any tables that we want.
Advantages & Disadvantages
- D: NOT a Random solution & NOT the same odds!
- The solution is based on existing data with the more manipulation we do, the more we can make this more "look like" a random data.
- D: Limited to a maximum length of the existing data!
- D: Limit of characters that can get in results!
- We can get only characters that might be in the existing data.
- D: Slow and inefficient as the number of manipulations on the text exceeds.
4. Using Random CHAR-> Using Loop to build a flexible string length
Basic Idea
We are using a UDF to create a single random string. the function get 2 parameters: (A) the maximum length of the String (B) Do we need to create a string as long as the maximum or randomly length.
Code
/******************************
* Version("2.0.0.0")
* FileVersion("2.0.0.0")
* WrittenBy("Ronen Ariely")
* WebSite("http://ariely.info/")
* Blog("http://ariely.info/Blog/tabid/83/language/en-US/Default.aspx")
******************************/
CREATE function [dbo].[ArielyRandomStringFunc_Ver2.0.0](@NumberOfChar int, @IsFixedLength bit = true)
returns nvarchar(MAX)
WITH EXECUTE AS CALLER
AS
begin
DECLARE @TotalNumberOfCharToReturn int
IF (@IsFixedLength = 1)
SET @TotalNumberOfCharToReturn = @NumberOfChar
ELSE
-- I am using my own random function
-- you can read more about the reason here:
-- Using side-effecting build in functions inside a UDF (your function)
-- http://ariely.info/Blog/tabid/83/EntryId/121/Using-side-effecting-build-in-functions-inside-a-UDF-your-function.aspx
SET @TotalNumberOfCharToReturn = CONVERT(int,(AccessoriesDB.dbo.ArielyRandFunc() * @NumberOfChar) +1)
declare @Out as nvarchar(MAX) = ''
declare @QQ01 as int = 0
while @QQ01 < @TotalNumberOfCharToReturn begin
set @QQ01 += 1
-- This is in-secure Function as we chose any unicode character without filtering!
-- I preferred to choose from secured characters list as I show in preview function (ver1.0.0)
--
-- 65535: Maximum UNICODE character value
-- You can limit this value to your own language's values or your needs
-- Numbers: 48 - 58
-- English uppercase: 65 - 91
-- English lowercase: 97 - 123
-- Hebrew: 1488 - 1515
select @Out += ISNULL(NCHAR(CAST(65535 * AccessoriesDB.dbo.ArielyRandFunc() AS INT)),'')
end
--print @Out
RETURN @Out
end
Advantages & Disadvantages
- A: Relatively fast.
- A: Full Random
- A: No length limit
- A: No characters limit
- D: No filtering option for security
5. Selecting from characters list-> Using Loop to build a flexible string length
Basic Idea
The basic idea is same as above, with the option for filtering characters, as we chose from a list. We are choosing a random number in order to chose the character in that position on our list. We use a loop to build the entire string.
Code
/******************************
* Version("1.0.0.0")
* FileVersion("1.0.0.0")
* WrittenBy("Ronen Ariely")
* WebSite("http://ariely.info/")
* Blog("http://ariely.info/Blog/tabid/83/language/en-US/Default.aspx")
******************************/
CREATE function [dbo].[ArielyRandomStringFunc_Ver1.0.0](@NumberOfChar int, @IsFixedLength bit = true)
returns nvarchar(MAX)
WITH EXECUTE AS CALLER
AS
begin
DECLARE @TotalNumberOfCharToReturn int
IF (@IsFixedLength = 1)
SET @TotalNumberOfCharToReturn = @NumberOfChar
ELSE
-- I am using my own random function
-- you can read more about the reasons here:
-- Using side-effecting build in functions inside a UDF (your function)
-- http://ariely.info/Blog/tabid/83/EntryId/121/Using-side-effecting-build-in-functions-inside-a-UDF-your-function.aspx
SET @TotalNumberOfCharToReturn = CONVERT(int,(AccessoriesDB.dbo.ArielyRandFunc() * @NumberOfChar) +1)
DECLARE @AllChar as nvarchar(MAX) = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890אבגדהוזחטיכלמסעפצקרשת'
-- it is faster to use a table with ID column and char column,
-- index by ID column
-- order the table in random order
-- get the top @NumberOfChar records
-- but in this case we will get a select without repeating a value!
-- if we want "with repeating" then we have to re-chose each character separately
-- We can just use select number and chose char(random number) to get random char
-- It is faster
-- But this is in-secure and I preferred to chose from secured characters list here
declare @MyRnd as int
declare @Out as nvarchar(MAX) = ''
declare @QQ01 as int = 0
while @QQ01 < @TotalNumberOfCharToReturn begin
set @QQ01 += 1
set @MyRnd = (SELECT ((RandNumber * LEN(@AllChar)) +1) FROM ArielyRandView)
select @Out += SUBSTRING(@AllChar,@MyRnd,1)
--print SUBSTRING(@AllChar,@MyRnd,1)
end
--print @Out
RETURN @Out
end
Advantages & Disadvantages
- A: Relatively fast.
- A: Full Random
- A: No length limit
- A: No characters limit
- A: Filtering option for security
6. Building a fix length random string using NEWID and NCHAR->Cut randomly using LEFT
Basic Idea
We build a String manually by joining a fix number of random characters.
Code
SELECT top 1000
LEFT (
NCHAR(CAST(1000 * RAND( ABS(CHECKSUM(CAST(NEWID() AS VARCHAR(100)))) ) AS INT))
+ NCHAR(CAST(1000 * RAND( ABS(CHECKSUM(CAST(NEWID() AS VARCHAR(100)))) ) AS INT))
+ NCHAR(CAST(1000 * RAND( ABS(CHECKSUM(CAST(NEWID() AS VARCHAR(100)))) ) AS INT))
+ NCHAR(CAST(1000 * RAND( ABS(CHECKSUM(CAST(NEWID() AS VARCHAR(100)))) ) AS INT))
+ NCHAR(CAST(1000 * RAND( ABS(CHECKSUM(CAST(NEWID() AS VARCHAR(100)))) ) AS INT))
+ NCHAR(CAST(1000 * RAND( ABS(CHECKSUM(CAST(NEWID() AS VARCHAR(100)))) ) AS INT))
+ NCHAR(CAST(1000 * RAND( ABS(CHECKSUM(CAST(NEWID() AS VARCHAR(100)))) ) AS INT))
+ NCHAR(CAST(1000 * RAND( ABS(CHECKSUM(CAST(NEWID() AS VARCHAR(100)))) ) AS INT))
+ NCHAR(CAST(1000 * RAND( ABS(CHECKSUM(CAST(NEWID() AS VARCHAR(100)))) ) AS INT))
+ NCHAR(CAST(1000 * RAND( ABS(CHECKSUM(CAST(NEWID() AS VARCHAR(100)))) ) AS INT))
+ NCHAR(CAST(1000 * RAND( ABS(CHECKSUM(CAST(NEWID() AS VARCHAR(100)))) ) AS INT))
+ NCHAR(CAST(1000 * RAND( ABS(CHECKSUM(CAST(NEWID() AS VARCHAR(100)))) ) AS INT))
+ NCHAR(CAST(1000 * RAND( ABS(CHECKSUM(CAST(NEWID() AS VARCHAR(100)))) ) AS INT))
+ NCHAR(CAST(1000 * RAND( ABS(CHECKSUM(CAST(NEWID() AS VARCHAR(100)))) ) AS INT))
+ NCHAR(CAST(1000 * RAND( ABS(CHECKSUM(CAST(NEWID() AS VARCHAR(100)))) ) AS INT))
+ NCHAR(CAST(1000 * RAND( ABS(CHECKSUM(CAST(NEWID() AS VARCHAR(100)))) ) AS INT))
+ NCHAR(CAST(1000 * RAND( ABS(CHECKSUM(CAST(NEWID() AS VARCHAR(100)))) ) AS INT))
+ NCHAR(CAST(1000 * RAND( ABS(CHECKSUM(CAST(NEWID() AS VARCHAR(100)))) ) AS INT))
+ NCHAR(CAST(1000 * RAND( ABS(CHECKSUM(CAST(NEWID() AS VARCHAR(100)))) ) AS INT))
+ NCHAR(CAST(1000 * RAND( ABS(CHECKSUM(CAST(NEWID() AS VARCHAR(100)))) ) AS INT))
+ NCHAR(CAST(1000 * RAND( ABS(CHECKSUM(CAST(NEWID() AS VARCHAR(100)))) ) AS INT))
+ NCHAR(CAST(1000 * RAND( ABS(CHECKSUM(CAST(NEWID() AS VARCHAR(100)))) ) AS INT))
+ NCHAR(CAST(1000 * RAND( ABS(CHECKSUM(CAST(NEWID() AS VARCHAR(100)))) ) AS INT))
+ NCHAR(CAST(1000 * RAND( ABS(CHECKSUM(CAST(NEWID() AS VARCHAR(100)))) ) AS INT))
+ NCHAR(CAST(1000 * RAND( ABS(CHECKSUM(CAST(NEWID() AS VARCHAR(100)))) ) AS INT))
+ NCHAR(CAST(1000 * RAND( ABS(CHECKSUM(CAST(NEWID() AS VARCHAR(100)))) ) AS INT))
+ NCHAR(CAST(1000 * RAND( ABS(CHECKSUM(CAST(NEWID() AS VARCHAR(100)))) ) AS INT))
+ NCHAR(CAST(1000 * RAND( ABS(CHECKSUM(CAST(NEWID() AS VARCHAR(100)))) ) AS INT))
+ NCHAR(CAST(1000 * RAND( ABS(CHECKSUM(CAST(NEWID() AS VARCHAR(100)))) ) AS INT))
+ NCHAR(CAST(1000 * RAND( ABS(CHECKSUM(CAST(NEWID() AS VARCHAR(100)))) ) AS INT))
, 1 + CAST(29 * RAND( ABS(CHECKSUM(CAST(NEWID() AS VARCHAR(100)))) ) AS INT)
) AS Str
from sys.columns a CROSS JOIN sys.columns b
Advantages & Disadvantages
- A: Fast execution solution.
- D: Very hard for coding, if we need long length string.
* We can build a dynamic query for this solution to get more flexible solution. - D: Length limit by the number of char in the query
- A: No characters limit
- D: No Filtering option for security
7. Using CLR Function
Basic Idea
The option of using CLR function added in SQL 2005 and yet most DBAs do not use it. DBAs have to internalize the extreme improvement and differences that can be (sometimes) by use CLR! While SQL Server works well with SET of DATA, CLR work much better in manipulating strings (Split, Regular expression...).
Code
http://social.technet.microsoft.com/wiki/contents/articles/21219.sql-server-create-random-string-using-clr.aspx
* Links to other versions can be seen on the resources
Advantages & Disadvantages
- A: VERY FAST.
- A: Extremely flexible.
- No Length limit
- No characters limit
- A: Filtering option for security
- D: needed to enable the use of CLR (hopefully you have done it already!)
Conclusions and Recommendations
- Without a doubt, CLR function is the best solution! If you can use it, then chose it. Tests have shown that this function can produce in less than 2 seconds what other functions have not been able to produce in more than 20 minutes (The execution was terminated after 20 minutes). This solution meat with any requirement.
- It is highly recommended not to use solution without filtering mechanism! several UNICODE characters might be harmful in some situation. You can get more information about the problematic CHAR ZERO For example in this link [Hebrew].
- If you need (A) fast query (B) flexible & unlimited length or (C) Filtering mechanism to chose the characters that can be use, then use solution 5 or change a bit solution 4 to add filtering.
- If you need (A) fast query and (B)short max length string and (C) you have to use all the characters range, then you can use solution 6.
Resources
This article is based on this forum discussion:
http://social.technet.microsoft.com/Forums/he-IL/6759c653-a716-4602-b184-09f38de0e177/-nvarchar-?forum=sqlhe
Create Random String Using CLR [DOT.NET 4.0+, using LINQ]
http://social.technet.microsoft.com/wiki/contents/articles/21219.sql-server-create-random-string-using-clr.aspx
Create Random String Using CLR, Including (1)version using LINQ, (2)Old version for DOT.NET 2.0+
http://ariely.info/Blog/tabid/83/EntryId/134/SQL-Random-String-using-CLR.aspx
.
Contributed to the original discussion:
Ivan Radchenko, Tomer Shtrum, Ronen Ariely, Guy Glantser, Yossi Hakikat
.
======
======