Out of Memory Issues To Watch Out For When Using Regular Expressions.

Hi Guys! It has been a while since my last post. I was very busy last month with all the cases flowing in. Recently I worked on a very tricky issue involving OutOfMemoryException, and this post is about that case.

 

How to reproduce OutOfMemoryException in this scenario:

 

Create a managed application in Visual Studio 2008 that reads a large input text file into a string. Then use many instances of System.Text.RegularExpressions.Regex to replace parts of this string, one after another. Run the application on a 32-bit or 64-bit machine.

Example code snippet (in C#):
          // Requires: using System.IO; using System.Text.RegularExpressions;

          // Read the content of the file Test.txt into the string abc.
          // Let us assume the file Test.txt is huge (around 20 MB).
          string path = @"C:\Test.txt";

          string strFileContents;

          // Dispose the reader as soon as the whole file has been read.
          using (StreamReader sr = File.OpenText(path))
          {
              strFileContents = sr.ReadToEnd();
          }

          string abc = strFileContents;

          // Apply Regex objects to replace parts of the huge string.

          Regex rx1 = new Regex("xyz"); if (rx1.IsMatch(abc)) { abc = rx1.Replace(abc, "XYZ"); }

          Regex rx2 = new Regex("xyz1"); if (rx2.IsMatch(abc)) { abc = rx2.Replace(abc, "XYZ1"); }

          Regex rx3 = new Regex("xyz2"); if (rx3.IsMatch(abc)) { abc = rx3.Replace(abc, "XYZ2"); }

          ................................

          ................................

          ................................

Run the app, and it may throw System.OutOfMemoryException. The exception details may look like the following:

System.OutOfMemoryException was unhandled 

  Message="Exception of type 'System.OutOfMemoryException' was thrown." 

  Source="mscorlib" 

  StackTrace:

       at System.String.GetStringForStringBuilder(String value, Int32 startIndex, Int32 length, Int32 capacity)

       at System.Text.StringBuilder.GetNewString(String currentString, Int32 requiredLength)

       at System.Text.StringBuilder.Append(String value, Int32 startIndex, Int32 count)

       at System.Text.RegularExpressions.RegexReplacement.Replace(Regex regex, String input, Int32 count, Int32 startat)

       at System.Text.RegularExpressions.Regex.Replace(String input, String replacement, Int32 count, Int32 startat)

       at System.Text.RegularExpressions.Regex.Replace(String input, String replacement) 

 

Root Cause:

There are two possible root causes of the OutOfMemoryException here:
1) The garbage collector may not be collecting the Regex objects, and anything they still reference, because the variables holding them have not left scope.

2) The string is simply huge. Strings are immutable, so Regex.Replace cannot modify the input in place; it builds a second string for the result. The stack trace above shows the failure inside StringBuilder.Append, at the point where that second copy of the really large string is being built.

How to fix it:

1) We should assume that an object will not become eligible for garbage collection until the reference (root) to it has left scope, regardless of whether it will ever be used again. The runtime cannot be relied upon to always know when an object will not be used again; in a debug build, for example, local variables are kept alive until the end of the method. So we have to rely on the language's variable scoping rules, or set references to null explicitly once the objects will not be used again.

Example code snippet (in C#): 

          // Set each Regex reference to null explicitly once it is no longer used in the code.

          Regex rx1 = new Regex("xyz"); if (rx1.IsMatch(abc)) { abc = rx1.Replace(abc, "XYZ"); }

          rx1 = null;

          Regex rx2 = new Regex("xyz1"); if (rx2.IsMatch(abc)) { abc = rx2.Replace(abc, "XYZ1"); }

          rx2 = null;

          Regex rx3 = new Regex("xyz2"); if (rx3.IsMatch(abc)) { abc = rx3.Replace(abc, "XYZ2"); }

          rx3 = null;   

          ................................

          ................................

          ................................

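2) "Chunk up" the data so that the strings don't get too big. For example, process the file line by line or in fixed-size blocks instead of reading all of it into one string.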
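
Here is a minimal sketch of what chunking could look like, assuming none of your patterns needs to match across a line break. The names ChunkedReplacer and ReplaceByLine and the rule list are made up for illustration; they are not from the original case. The idea is that the largest string ever created is a single line, and the Regex objects live only inside one method call.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only; not part of the original case.
public static class ChunkedReplacer
{
    // Reads the input one line at a time, applies every (pattern, replacement)
    // rule in order, and writes the result out immediately. No string bigger
    // than one line is created, so Regex.Replace never copies a huge string.
    // Assumption: no pattern has to match across a line break.
    public static void ReplaceByLine(
        TextReader reader,
        TextWriter writer,
        IList<KeyValuePair<string, string>> rules)
    {
        // Build each Regex once and reuse it for every line.
        var compiled = new List<KeyValuePair<Regex, string>>();
        foreach (var rule in rules)
        {
            compiled.Add(new KeyValuePair<Regex, string>(new Regex(rule.Key), rule.Value));
        }

        string line;
        while ((line = reader.ReadLine()) != null)
        {
            foreach (var pair in compiled)
            {
                line = pair.Key.Replace(line, pair.Value);
            }

            // WriteLine ends every line with the writer's newline sequence,
            // so line endings in the output may differ from the input.
            writer.WriteLine(line);
        }
    }
}
```

To run this against the file from the repro, you would pass a StreamReader opened on C:\Test.txt and a StreamWriter opened on a separate output file (which path to write to is up to you), both wrapped in using blocks. If a pattern really must span several lines, reading fixed-size blocks is another option, but then matches that straddle a block boundary need extra care.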

Check these too:

1) How to match a pattern by using regular expressions and Visual C#: https://support.microsoft.com/kb/308252

2) Regex class: https://msdn2.microsoft.com/en-us/library/system.text.regularexpressions.regex.aspx

3) You may like this ongoing thread about Regex performance: https://channel9.msdn.com/ShowPost.aspx?PostID=348549

4) The Regex class represents an immutable regular expression: https://lab.msdn.microsoft.com/restapistubs/content/6f7hht7k/en-us;vs.71/primary/mtps.failsafe

5) Regex static methods vs. instance methods: https://forums.sqladvice.com/post.aspx?id=2248
