AD FS 2.0: Using RegEx in the Claims Rule Language
An Introduction to RegEx
RegEx lets us search or manipulate data in many ways to get a desired result. Without RegEx, comparisons and replacements must look for an exact match. Most of the time this is sufficient, but what if you need to search or replace based on a pattern? Say you want to find strings that simply start with a particular word. RegEx uses pattern matching to examine a string with more precision. We can use this to control which claims are passed through, and even to manipulate the data inside the claims.
Using RegEx in searches
Using RegEx to pattern match is accomplished by changing the standard double equals “==” to “=~” and by using special metacharacters in the condition statement. I’ll outline the more commonly used ones, but there are good resources available online that go into more detail. For those of you unfamiliar with RegEx, let’s first look at some common RegEx metacharacters used to build pattern templates and what the result would be when using them.
| Symbol | Operation | Example rule |
| --- | --- | --- |
| ^ | Match the beginning of a string | c:[type == "http://contoso.com/role", Value =~ "^director"] |
| $ | Match the end of a string | c:[type == "http://contoso.com/email", Value =~ "contoso.com$"] |
| \| | OR | c:[type == "http://contoso.com/role", Value =~ "^director\|^manager"] |
| (?i) | Not case sensitive | c:[type == "http://contoso.com/role", Value =~ "(?i)^director"] |
| x.*y | "x" followed, anywhere later in the string, by "y" | c:[type == "http://contoso.com/role", Value =~ "(?i)Seattle.*Manager"] |
| + | Match preceding character one or more times | c:[type == "http://contoso.com/employeeId", Value =~ "^0+"] |
| * | Match preceding character zero or more times | Similar to above, more useful in RegExReplace() scenarios. |
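To get a feel for how these metacharacters behave, the patterns from the table can be tried outside of AD FS. The sketch below uses Python's `re` module as a stand-in for the .NET RegEx engine that AD FS relies on; the two engines agree on these simple patterns, but that equivalence is an assumption rather than a guarantee for every pattern. The sample claim values and the `matches` helper are hypothetical.

```python
import re

def matches(pattern: str, value: str) -> bool:
    """Rough equivalent of `Value =~ pattern`: true if the pattern is found anywhere in value."""
    return re.search(pattern, value) is not None

# (pattern, hypothetical claim value, expected result)
checks = [
    ("^director", "director of sales", True),
    ("^director", "Sales Director", False),          # wrong position and wrong case
    ("contoso.com$", "john@contoso.com", True),
    ("^director|^manager", "manager, Seattle", True),
    ("(?i)^director", "Director of Sales", True),    # (?i) ignores case
    ("(?i)Seattle.*Manager", "seattle branch manager", True),
    ("^0+", "000123", True),
    ("^0+", "123", False),                           # + needs at least one zero
    ("^0*", "123", True),                            # * is satisfied by zero zeros
]

for pattern, value, expected in checks:
    assert matches(pattern, value) == expected
```

The last two rows show why `*` is rarely useful in a condition: a pattern that allows zero occurrences matches almost anything.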
Using RegEx in string manipulation
RegEx pattern matching can also be used in replacement scenarios. It is similar to a “find and replace”, but using pattern matching instead of exact values. To use this in a claim rule, we use the RegExReplace() function in the value section of the issuance statement.
The RegExReplace() function accepts three parameters.
- The first is the string in which we are searching. We will typically want to search the value of the incoming claim (c.Value), but this could be a combination of values (c1.Value + c2.Value).
- The second is the RegEx pattern we are searching for in the first parameter.
- The third is the string value that will replace any matches found.
**Example:**
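As a rough illustration of the three parameters, the sketch below wraps Python's `re.sub` in a hypothetical `regex_replace` helper that takes its arguments in the same order as RegExReplace(). It is not an AD FS rule, the claim value is made up, and Python's replacement syntax differs from .NET's in places (for example, in how captured groups are referenced), so treat it only as a way to experiment with patterns.

```python
import re

def regex_replace(value: str, pattern: str, replacement: str) -> str:
    """Stand-in for RegExReplace(value, pattern, replacement).

    Note that re.sub takes (pattern, replacement, value), so the
    arguments are reordered here to match the claim rule function.
    """
    return re.sub(pattern, replacement, value)

# Hypothetical incoming role claim value
print(regex_replace("Director of Sales", "(?i)director", "Manager"))  # Manager of Sales

# When nothing matches, the input comes back unchanged
print(regex_replace("Engineer", "(?i)director", "Manager"))  # Engineer
```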
Real World Examples
Problem 1:
We want to add claims for all group memberships, including distribution groups.
Solution:
Typically, group membership is added by using the wizard, selecting Token-Groups Unqualified Names, and mapping it to the Group or Role claim. This will only pull security groups, not distribution groups, and will not contain Domain Local groups.
We can pull from memberOf, but that will give us the entire distinguished name, which is not what we want. One way to solve this problem is to use three separate claim rules and use RegExReplace() to remove unwanted data.
Phase 1: Pull memberOf, add to working set “phase 1”
Example: “CN=Group1,OU=Users,DC=contoso,DC=com” is put into a phase 1 claim.
Phase 2: Drop everything after the first comma, add to working set “phase 2”
c:[Type == "http://test.com/phase1"]
=> add(Type = "http://test.com/phase2", Value = RegExReplace(c.Value, ",[^\n]*", ""));
Example: We process the value in the phase 1 claim and put “CN=Group1” into a phase 2 claim.
Phase 3: Drop CN= at the beginning, add to outgoing claim set as the standard role claim
c:[Type == "http://test.com/phase2"]
=> issue(Type = "http://schemas.microsoft.com/ws/2008/06/identity/claims/role", Value = RegExReplace(c.Value, "^CN=", ""));
Example: We process the value in the phase 2 claim and put “Group1” into the role claim.
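The two replacements in phases 2 and 3 can be checked outside of AD FS. The sketch below applies the same two patterns in sequence using Python's `re.sub` as a stand-in for RegExReplace(); the function name `group_name_from_dn` is hypothetical.

```python
import re

def group_name_from_dn(dn: str) -> str:
    # Phase 2: drop the first comma and everything after it
    phase2 = re.sub(r",[^\n]*", "", dn)
    # Phase 3: drop the leading "CN="
    return re.sub(r"^CN=", "", phase2)

print(group_name_from_dn("CN=Group1,OU=Users,DC=contoso,DC=com"))  # Group1
```

One thing to keep in mind is that the phase 2 pattern cuts at the first comma it finds, so a group whose name itself contains a comma would be truncated.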
Problem 2:
We need to compare the values in two different claims and only allow access to the relying party if they match.
Solution:
In this case we can use RegExReplace(). This is not the typical use of this function, but it works in this scenario. The function searches the first value (c1.Value) using the second value (c2.Value) as the RegEx pattern. If the pattern matches the whole of the first value, the result is a new claim with the value of “Yes”, which can then be used to grant access to the relying party. If the values do not match, the first value is passed through unchanged, so the user will not have this claim with the value of “Yes”.
c1:[Type == "http://adatum.com/data1"] && c2:[Type == "http://adatum.com/data2"]
=> issue(Type = "http://adatum.com/UserAuthorized", Value = RegExReplace(c1.Value, c2.Value, "Yes"));
Example: If there is a data1 claim with the value of “contoso” and a data2 claim with a value of “contoso”, it will issue a UserAuthorized claim with the value of “Yes”. However, if data1 is “adatum” and data2 is “fabrikam”, it will issue a UserAuthorized claim with the value of “adatum”.
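The behavior described in this example can be reproduced with a small sketch. It uses Python's `re.sub` as a stand-in for RegExReplace(), and the helper name `user_authorized` is hypothetical. The third call shows a case worth knowing about: when data2 matches only part of data1, only that part is replaced, so the result is neither “Yes” nor the original value.

```python
import re

def user_authorized(data1: str, data2: str) -> str:
    # Mirrors RegExReplace(c1.Value, c2.Value, "Yes"): data2 is used as the pattern
    return re.sub(data2, "Yes", data1)

print(user_authorized("contoso", "contoso"))       # Yes
print(user_authorized("adatum", "fabrikam"))       # adatum
print(user_authorized("contoso-east", "contoso"))  # Yes-east
```

A rule that grants access should therefore test for a value of exactly “Yes”.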
Digging Deeper: RegExReplace(c1.Value, c2.Value, "Yes")
|
Problem 3:
Let’s take a second look at a potential issue with our solution to problem 2. Since we are using the value of one of the claims as the RegEx pattern, we must be careful to check for RegEx metacharacters that would make the comparison mean something different. The backslash is used in some RegEx metacharacters, so any backslashes in the values will throw off the comparison and it will always fail, even if the values match.
Solution:
In order to ensure that our matching claim rule works, we must sanitize the input values by removing any backslashes before doing the comparison. We can do this by taking the data that would go into the initial claims, putting it in a holding attribute, and then using RegEx to strip out the backslash. The example below only shows the sanitization of data1, but it would be similar for data2.
Phase 1: Pull attribute1, add to holding attribute “http://adatum.com/data1holder”
Example: The value in attribute 1 is “Contoso\John” which is placed in the data1holder claim.
Phase 2: Strip the backslash from the holding claim and issue the new data1 claim
Example: We process the value in the data1holder claim and put “ContosoJohn” in a data1 claim.
An alternate solution would be to pad each backslash in the data2 value with a second backslash. That way each backslash would be treated as a literal backslash. We could accomplish this by using RegExReplace(c.Value, "\\", "\\") against a data2 input value.
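Both approaches can be tried in a short sketch. Python's `re` module again stands in for the .NET engine, and the helper names are hypothetical. The sample value is `Contoso\dave` rather than `Contoso\John`, because Python rejects the unknown escape `\J` with an error instead of silently failing to match; `\d` is a valid escape (any digit) in both engines, which makes the failure easy to see. Plain string replacement is used for the sanitizing steps to avoid Python's own escaping rules for replacement strings.

```python
import re

def compare(data1: str, data2: str) -> str:
    # Mirrors RegExReplace(c1.Value, c2.Value, "Yes"): data2 is the pattern
    return re.sub(data2, "Yes", data1)

def strip_backslashes(value: str) -> str:
    # First approach: remove every backslash before comparing
    return value.replace("\\", "")

def pad_backslashes(value: str) -> str:
    # Alternate approach: double each backslash so the pattern treats it literally
    return value.replace("\\", "\\\\")

value = "Contoso\\dave"  # the text Contoso\dave, with a single backslash

# Unsanitized: \d in the pattern means "a digit", so identical values do not match
print(compare(value, value) == "Yes")  # False

# Stripped on both sides: the values match
print(compare(strip_backslashes(value), strip_backslashes(value)))  # Yes

# Padded on the pattern side only: the values match
print(compare(value, pad_backslashes(value)))  # Yes
```

Stripping changes the values being compared, so two different inputs such as `Contoso\dave` and `Contosodave` would then be treated as equal, whereas padding keeps the comparison exact.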
Problem 4:
Employee numbers vary in length, but we need to have exactly 9 characters in the claim value. Employee numbers that are shorter than 9 characters should be padded in the front with leading zeros.
Solution:
In this case we can create a buffer claim, join that with the employee number claim, and then use RegEx to keep the rightmost 9 characters of the combined string.
Phase 1: Create a buffer claim to create the zero-padding
Phase 2: Pull the employeeNumber attribute from Active Directory, place it in a holding claim
=> add(store = "Active Directory", types = ("ENHolder"), query = ";employeeNumber;{0}", param = c.Value);
Phase 3: Combine the two values, then use RegEx to remove all but the 9 rightmost characters.
&& c2:[Type == "ENHolder"] => issue(Type = "http://adatum.com/employeeNumber", Value = RegExReplace(c1.Value + c2.Value, ".*(?=.{9}$)", ""));
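The phase 3 pattern relies on a lookahead: `.*` consumes as much as it can, while `(?=.{9}$)` requires that exactly nine characters remain after it, so everything except the last nine characters is removed. The sketch below tries this with Python's `re.sub` as a stand-in for RegExReplace(). The buffer value of nine zeros is an assumption, and `pad_employee_number` is a hypothetical name.

```python
import re

BUFFER = "000000000"  # assumed buffer claim value: nine zeros

def pad_employee_number(employee_number: str) -> str:
    combined = BUFFER + employee_number
    # Remove everything that precedes the final nine characters
    return re.sub(r".*(?=.{9}$)", "", combined)

print(pad_employee_number("12345"))      # 000012345
print(pad_employee_number("123456789"))  # 123456789
```

An employee number longer than nine characters would be cut down to its last nine characters by the same pattern.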
Problem 5:
Employee numbers contain leading zeros but we need to remove those before sending them to the relying party.
Solution:
In this case we can pull the employee number from Active Directory and place it in a holding claim, then use RegEx to strip out any leading zeros.
Phase 1: Pull the employeeNumber attribute from Active Directory, place it in a holding claim
=> add(store = "Active Directory", types = ("ENHolder"), query = ";employeeNumber;{0}", param = c.Value);
Phase 2: Take the value in ENHolder and remove any leading zeros.
c:[Type == "ENHolder"]
=> issue(Type = "http://adatum.com/employeeNumber", Value = RegExReplace(c.Value, "^0*", ""));
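The leading-zero pattern is simple enough to verify quickly. The sketch below uses Python's `re.sub` as a stand-in for RegExReplace(), with a hypothetical helper name. Because the pattern is anchored with `^`, zeros in the middle or at the end of the number are left alone.

```python
import re

def strip_leading_zeros(employee_number: str) -> str:
    # ^0* matches the run of zeros at the start of the string (possibly empty)
    return re.sub(r"^0*", "", employee_number)

print(strip_leading_zeros("000123"))  # 123
print(strip_leading_zeros("1020"))    # 1020
```

A value made up entirely of zeros would come back as an empty string, which may need separate handling.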