Modern Entity Framework - Row & Column Based Security
Prepared by L.C Waikiki Team (TAYFUN.ESMER@Lcwaikiki.com) and Dr. Alp Asutay
1. Introduction
The last few years have seen the rise of the Entity Framework object-relational mapper. It has become more robust with each version, and Entity Framework 6 introduced interceptors. Interceptors let us intervene in the queries executed against SQL, the result sets coming back from SQL and even the exceptions raised during execution. If you have looked into interceptors, you have probably come across a few implementations on the internet, such as logging and soft delete. What we aim for here is a more complex and advanced scenario: enforcing security policies based on the rows and columns of a table.
Row Based Security enables us to restrict access to rows of data, for example ensuring that employees can access only the data pertinent to their department, or that a group of accountants can access only orders with a total price under $10,000. Column Based Security, on the other hand, enables us to restrict columns. Data is restricted just as with Row Based Security, but the target is all the data under the specified columns.
2. Solution & Implementation
We all know that limiting data means filtering it, which is done with the “where” keyword. So we will apply predicates to the set of entities being queried, at the application tier and independently of developers. There are two interceptor levels for queries before they go out the gate and get executed in SQL. One is the DbCommand interceptor, which works at the SQL level and carries a DbCommand object, the same base class that SqlCommand derives from. Altering the SQL command before it is executed is not the level we want to work at. The other is the DbCommandTree interceptor, which, as its name suggests, makes it possible to work with expression trees and alter them with the help of visitor objects. That’s the path we will take to achieve our goal more elegantly.
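To illustrate the DbCommandTree level, the following is a minimal sketch of the general shape of an Entity Framework 6 command tree interceptor. The class names PolicyTreeInterceptor and PassThroughVisitor are hypothetical and not taken from our code base; the visitor deliberately changes nothing and only marks the place where a policy-aware visitor would rewrite the tree.

```csharp
using System.Data.Entity.Core.Common.CommandTrees;
using System.Data.Entity.Core.Metadata.Edm;
using System.Data.Entity.Infrastructure.Interception;

// Hypothetical visitor: it inherits the default behaviour, which leaves the
// tree unchanged. A policy-aware visitor would override Visit methods here.
public class PassThroughVisitor : DefaultExpressionVisitor
{
}

// Hypothetical class name; IDbCommandTreeInterceptor is the EF6 interface.
public class PolicyTreeInterceptor : IDbCommandTreeInterceptor
{
    public void TreeCreated(DbCommandTreeInterceptionContext interceptionContext)
    {
        // Only touch store-model (SSpace) trees, i.e. the ones turned into SQL.
        if (interceptionContext.OriginalResult.DataSpace != DataSpace.SSpace)
            return;

        // Only SELECT-style query trees are of interest in this sketch.
        var queryTree = interceptionContext.Result as DbQueryCommandTree;
        if (queryTree == null)
            return;

        // Walk the expression tree with the visitor and swap in the result.
        DbExpression rewritten = queryTree.Query.Accept(new PassThroughVisitor());
        interceptionContext.Result = new DbQueryCommandTree(
            queryTree.MetadataWorkspace, queryTree.DataSpace, rewritten);
    }
}
```

An interceptor of this kind is typically registered once at application start-up with DbInterception.Add(new PolicyTreeInterceptor()).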
So, first things first. We are going to create policy expressions holding the main data that defines the target of a security policy: environment, server, database, table (entity), column, field (property of the entity) and even the values or their source. You can see the data model below.
By holding the environment information along with the server and database, we can apply different policies to a user in different environments such as test, prod, development etc. The DbObject and DbObjectField tables represent the entity and its property. Security policies can be associated with claims, claim groups and specific users; that information is held in the AuthorizationRef and AuthorizationDefinitionRef fields of the PolicySecurity table.
The journey of the code written for this solution begins as the user steps into our application, which is the Session_Start event for web applications. There we query all the row and column based security policies associated with the incoming user along with all the data needed, transform them into DataPolicyExpression objects and store them in a distributed cache, which happens to be Redis Cache for us.
Notice that the PropertyValueSource value is static, with a value next to it, which is "2" in this example. However, the source could be External, meaning the data for the expression is pulled from an object that implements an interface this solution exposes (IExternalPolicySecurityService). The interface has a method returning a Dictionary<string, string> that holds the keys of the external values and the actual values (probably a set of IDs from other tables), however and wherever they come from. In that case, instead of a static value, we hold the key name of the external source.
Once the concrete class implementing that interface is found in the web project, its method is invoked and all the data is returned for the security policy to use, most likely with the IN operator instead of equals (=). The external value source lets us define policies with dynamic data that must be evaluated at runtime and involves other tables. Besides these two options, we can store policies with internal value sources, which refer to service endpoints. Instead of a concrete class implementing a specific interface, a web service can be used for the same purpose but with different technical needs. As in the external case, the service returns a Dictionary<string, string>. In our case, the internal value source always served us sets of IDs from other tables to be used with the IN operator.
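The value sources described above can be sketched roughly as follows. Only the names DataPolicyExpression, PropertyValueSource and IExternalPolicySecurityService come from the text; the property names, the GetPolicyValues method name, the PolicyValueResolver helper and the comma-separated encoding of ID sets are all assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Where a policy value comes from, following the three options in the text.
public enum PropertyValueSource { Static, External, Internal }

// Property names are assumptions based on the fields listed in the text.
public sealed class DataPolicyExpression
{
    public string Entity { get; set; }
    public string Property { get; set; }
    public PropertyValueSource ValueSource { get; set; }
    // A literal value for Static; otherwise the key of the external/internal source.
    public string ValueOrKey { get; set; }
}

public interface IExternalPolicySecurityService
{
    // Hypothetical method name; the text only states the return type.
    Dictionary<string, string> GetPolicyValues();
}

// Hypothetical helper that turns a policy into the values used in the filter.
public static class PolicyValueResolver
{
    public static IReadOnlyList<string> Resolve(
        DataPolicyExpression policy, IExternalPolicySecurityService external)
    {
        switch (policy.ValueSource)
        {
            case PropertyValueSource.Static:
                // A single literal, compared with equals (=).
                return new[] { policy.ValueOrKey };

            case PropertyValueSource.External:
                string raw;
                if (!external.GetPolicyValues().TryGetValue(policy.ValueOrKey, out raw))
                    return new string[0]; // unknown key: no allowed values
                // Assumption: a set of IDs is encoded as "1,2,3" for use with IN.
                return raw.Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries)
                          .Select(v => v.Trim())
                          .ToArray();

            default:
                // Internal sources (web service endpoints) are not covered here.
                throw new NotSupportedException("Value source not covered by this sketch.");
        }
    }
}
```

Returning an empty list for an unknown key is a design choice of the sketch: an empty IN list matches no rows, so a misconfigured policy fails closed rather than open.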
We have two separate interceptors, one for row based policies and one for column based policies. Both are registered and wait to catch any query before it is executed in SQL Server. When the TreeCreated event is triggered, we pull the policies from the cache and check whether the executing query involves any matching entity (table). If so, the policy information stored in the DataPolicyExpression object is used to build a DbFilterExpression with dynamic filters, which is attached to the query. Building that expression doesn’t happen on every query hit.
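A minimal sketch of how a row based filter can be attached to a matching entity with an Entity Framework 6 expression visitor is given here. The RowPolicyVisitor class and its dictionary of policies (entity name mapped to a column name and a required value) are hypothetical simplifications; the sketch handles a single static, string-typed value with an equality check, whereas a set of IDs would call for an IN-style predicate.

```csharp
using System.Collections.Generic;
using System.Data.Entity.Core.Common.CommandTrees;
using System.Data.Entity.Core.Common.CommandTrees.ExpressionBuilder;

// Hypothetical visitor; DefaultExpressionVisitor and DbExpressionBuilder are EF6 types.
public class RowPolicyVisitor : DefaultExpressionVisitor
{
    // Assumed lookup: store entity name -> (column name, required value).
    private readonly IDictionary<string, KeyValuePair<string, string>> _policies;

    public RowPolicyVisitor(IDictionary<string, KeyValuePair<string, string>> policies)
    {
        _policies = policies;
    }

    // A DbScanExpression represents reading from an entity set (table).
    public override DbExpression Visit(DbScanExpression expression)
    {
        KeyValuePair<string, string> policy;
        if (!_policies.TryGetValue(expression.Target.ElementType.Name, out policy))
            return base.Visit(expression); // no policy for this entity

        // Bind the scan to a variable so its columns can be referenced.
        DbExpressionBinding binding = DbExpressionBuilder.Bind(expression);
        DbExpression column = DbExpressionBuilder.Property(
            DbExpressionBuilder.Variable(binding.VariableType, binding.VariableName),
            policy.Key);

        // column = 'value' (assumes a string-typed column).
        DbExpression predicate = DbExpressionBuilder.Equal(
            column, DbExpression.FromString(policy.Value));

        // The resulting DbFilterExpression replaces the plain table scan.
        return DbExpressionBuilder.Filter(binding, predicate);
    }
}
```

Inside TreeCreated, such a visitor would be applied to the query tree with Query.Accept(visitor), and the rewritten expression placed into a new DbQueryCommandTree.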
When the expression is created for the first time, it is stored in memory for later hits on the same entity. That work could be done much earlier, for example in the Session_Start event for every user. But whether that pays off depends on the number of policies defined for each user and on how complex the policies, and therefore the expressions to be built, are; and of course we cannot know whether the user will hit that table (entity) in that session at all. So the implementation may differ for technical and performance reasons, but at the end of the day, with the help of the interceptors, we can apply new filters or replace existing ones based on our policies.
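The build-once-and-reuse idea can be sketched with a small in-memory cache. PolicyFilterCache and its GetOrBuild method are hypothetical names; how the key is composed (entity name only, or user plus entity) is left to the caller and is an assumption of this sketch.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical cache: builds a filter the first time a key is requested
// and returns the stored instance on every later request.
public sealed class PolicyFilterCache<TFilter>
{
    private readonly ConcurrentDictionary<string, Lazy<TFilter>> _filters =
        new ConcurrentDictionary<string, Lazy<TFilter>>();

    public TFilter GetOrBuild(string key, Func<TFilter> build)
    {
        // Lazy<T> ensures the build delegate runs at most once per stored entry,
        // even if several threads request the same key at the same time.
        return _filters.GetOrAdd(key, _ => new Lazy<TFilter>(build)).Value;
    }
}
```

With this shape, the cost of building an expression is paid only for entities the user actually queries, which matches the lazy approach described above rather than building everything in Session_Start.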
3. Problems & Solutions
Of course we had some trouble making our code work during this implementation. Firstly, the order in which the interceptors run mattered when both a column based and a row based policy applied to the same column. Because the column based policy replaced the column name with the default value of its type (an empty string in our case), that default value was also used in the where clause, which made the condition always false. It looked something like this:
SELECT Foo1, Foo2, N''
FROM tb_Table
WHERE N'' = 'UK'
Instead of WHERE Foo3 = 'UK', the column name was replaced by the default value, which was put there by a column based policy because the user had no rights to see that specific column. We solved the problem simply by registering the interceptors in the right order.
Another problem we faced was the query cache mechanism of Entity Framework. The problem arises when an entity is queried more than once. When a value is set in a filter parameter, that value is cached and reused for subsequent queries. This happens because our interceptors implement IDbCommandTreeInterceptor, and the TreeCreated method runs only once per query command, which is cached afterwards.
Even a new instance of DbContext won’t make any difference; we’ll be stuck with the cached values for the parameters. This situation can be a problem for almost any kind of interceptor work, unless it uses a hard coded value such as true/false, as soft delete implementations do. While there are a few different workarounds (using functions that produce queries with constants, using properties of non-mapped objects and such), we decided to go with an IEnumerable<TEntity> collection which always remains empty. Entity Framework does not cache queries that involve an IEnumerable<T>.Contains<T>(T value) call against an in-memory collection, since the values of the collection are considered volatile. That Contains call can be made in a wrapped DbSet object or in something like a base repository, so that no matter what entity the query is for, Entity Framework won’t replace it with the cached one.
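One possible shape of the Contains workaround is sketched here. The Order entity, the VolatileQueryExtensions class and the WithVolatileContains method are hypothetical. The sketch also uses an empty collection of integer keys rather than a collection of the entity type, which is an assumption made to keep the example simple.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity used only for this illustration.
public class Order
{
    public int Id { get; set; }
}

public static class VolatileQueryExtensions
{
    // The collection always stays empty, so the extra predicate never
    // excludes a row; it only adds a Contains call to the query.
    private static readonly IEnumerable<int> AlwaysEmpty = new int[0];

    public static IQueryable<Order> WithVolatileContains(this IQueryable<Order> source)
    {
        IEnumerable<int> empty = AlwaysEmpty;
        // Enumerable.Contains against an in-memory collection: "Id NOT IN (empty set)".
        return source.Where(o => !empty.Contains(o.Id));
    }
}
```

A wrapped DbSet or base repository could apply such an extension to every query it hands out, for example context.Orders.WithVolatileContains(), so callers do not need to know about it.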