.NET Regular Expressions: how to use RegexOptions.IgnorePatternWhitespace [Ryan Byington]
The IgnorePatternWhitespace option tells the Regex parser to ignore any unescaped white space (spaces, tabs, and line breaks) in your expression unless it is inside a character class (i.e. [ ]). At first this may not seem all that useful, but it really can increase the readability of a regular expression. It also lets you add comments to your expression: everything from an unescaped # to the end of the line is ignored.
For example, a customer complained that the following regular expression, which is supposed to match email addresses, was taking too long:
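As a minimal sketch of that behavior: the date pattern and the class name WhitespaceDemo below are made up for illustration and are not from any customer code. The compact and the spread-out pattern accept the same input once the option is set.

```csharp
using System;
using System.Text.RegularExpressions;

public static class WhitespaceDemo
{
    // Hypothetical example pattern (yyyy-mm-dd), written the usual compact way.
    public const string Compact = @"^\d{4}-\d{2}-\d{2}$";

    // The same pattern spread over several lines, with comments.
    // It only behaves like Compact when IgnorePatternWhitespace is set.
    public const string Spaced = @"
        ^
        \d{4}  # year
        -
        \d{2}  # month
        -
        \d{2}  # day
        $";

    public static bool MatchesCompact(string input)
    {
        return Regex.IsMatch(input, Compact);
    }

    public static bool MatchesSpaced(string input)
    {
        return Regex.IsMatch(input, Spaced, RegexOptions.IgnorePatternWhitespace);
    }

    public static void Main()
    {
        Console.WriteLine(MatchesCompact("2005-03-17")); // True
        Console.WriteLine(MatchesSpaced("2005-03-17"));  // True
        Console.WriteLine(MatchesSpaced("2005-3-17"));   // False
    }
}
```

Without the option, the line breaks and indentation in Spaced would be treated as literal characters to match, so the pattern would no longer accept a plain date string.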
"^([0-9a-zA-Z]([-.\w]*[0-9a-zA-Z])*@([0-9a-zA-Z][-\w]*[0-9a-zA-Z]\.)+[a-zA-Z]{2,9})$"
Maybe I am just not very good at visualizing regular expressions, but I have trouble seeing what this regular expression does when it is in this format. The first thing I do is convert it to a format that I can read, like the following:
@"
^
(
    [0-9a-zA-Z]             #Verify the email address starts with a valid character
    (
        [-.\w]*
        [0-9a-zA-Z]
    )*
    @
    (
        [0-9a-zA-Z]
        [-\w]*
        [0-9a-zA-Z]
        \.
    )+
    [a-zA-Z]{2,9}           #Match to com, org, uk, etc
)
$"
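This is equivalent to the first expression as long as the RegexOptions.IgnorePatternWhitespace option is used. The only trick here is that to match a space you must use either [ ] or \x20. If you are curious what was wrong with this expression, the likely culprit is the nested quantifier in ([-.\w]*[0-9a-zA-Z])*: \w already covers [0-9a-zA-Z], so on input that cannot match, the engine may retry a very large number of ways to split the same characters. Escaping the '.' in [-.\w]* would not change anything, because '.' is already literal inside a character class.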
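A minimal sketch of how the readable form might be used in code. The class name EmailPatternDemo, the helper names, and the sample inputs are assumptions for illustration; the pattern itself is the one shown above, unchanged apart from indentation.

```csharp
using System;
using System.Text.RegularExpressions;

public static class EmailPatternDemo
{
    // The readable pattern from the article; indentation is ignored by the parser.
    public const string EmailPattern = @"
        ^
        (
            [0-9a-zA-Z]             #Verify the email address starts with a valid character
            (
                [-.\w]*
                [0-9a-zA-Z]
            )*
            @
            (
                [0-9a-zA-Z]
                [-\w]*
                [0-9a-zA-Z]
                \.
            )+
            [a-zA-Z]{2,9}           #Match to com, org, uk, etc
        )
        $";

    private static readonly Regex Email =
        new Regex(EmailPattern, RegexOptions.IgnorePatternWhitespace);

    public static bool IsEmail(string input)
    {
        return Email.IsMatch(input);
    }

    // Helper for trying out small patterns with the option switched on.
    public static bool MatchesWithOption(string pattern, string input)
    {
        return Regex.IsMatch(input, pattern, RegexOptions.IgnorePatternWhitespace);
    }

    public static void Main()
    {
        Console.WriteLine(IsEmail("user@example.com")); // True
        Console.WriteLine(IsEmail("user@example"));     // False

        // A bare space in the pattern is dropped; [ ] and \x20 survive.
        Console.WriteLine(MatchesWithOption(@"^hello world$", "hello world"));      // False
        Console.WriteLine(MatchesWithOption(@"^hello [ ] world$", "hello world"));  // True
        Console.WriteLine(MatchesWithOption(@"^hello \x20 world$", "hello world")); // True
    }
}
```

The sample inputs are deliberately short, since this pattern can be slow on long input that does not match.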
It is possible to write an entire C# program on a single line with no indenting, but no one in their right mind would do this because it would be completely unreadable. So why do this with your regular expressions?