String.Split Has a High Performance Cost

One application had a memory issue: it simply consumed too many resources. The application performed well, but the performance counters made it obvious that something could be done to improve it. Reviewing the code did not reveal anything significant; it only allocated objects that were necessary. The code parses a text file and generates an object based on the content of that file.

After running CLR Profiler, I noticed that a lot of memory was consumed by int[] arrays, yet nothing in the application code created that kind of object. Digging a little deeper, I discovered that one method calls String.Split(). This is one of the main methods; it is called about 80% of the time.

String.Split() creates an array of strings and, in the process, a helper array of int that records where the separators are. That helper array can be very large, since its size depends on the length of the input string. After the code was modified to avoid calling String.Split(), memory consumption dropped significantly.

If you need performance in a tight loop, avoid String.Split().
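
One way to avoid the call is to walk the string with String.IndexOf() and pull out fields one at a time. The following is only a minimal sketch of that idea, not the actual change made to the application; the FieldScanner class and EnumerateFields method are made-up names for illustration, and the sketch assumes a single-character separator.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, for illustration only.
public static class FieldScanner
{
    // Lazily yields each field of 'line' delimited by 'separator'.
    // Empty fields are kept, as String.Split(char) does by default.
    // No string[] result and no separator-position array is allocated;
    // only the substrings the caller actually consumes are created.
    public static IEnumerable<string> EnumerateFields(string line, char separator)
    {
        if (line == null)
        {
            throw new ArgumentNullException("line");
        }
        return Scan(line, separator);
    }

    private static IEnumerable<string> Scan(string line, char separator)
    {
        int start = 0;
        while (true)
        {
            int end = line.IndexOf(separator, start);
            if (end < 0)
            {
                // No more separators: the rest of the line is the last field.
                yield return line.Substring(start);
                yield break;
            }
            yield return line.Substring(start, end - start);
            start = end + 1;
        }
    }
}
```

A caller can loop with foreach (string field in FieldScanner.EnumerateFields(line, ',')) and break out as soon as it has the fields it needs. Whether this actually beats String.Split() depends on the input and on how many fields are used, so measure with a profiler before and after.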

If you are using Visual Studio 2008, I strongly recommend stepping through the .NET Framework source code in the debugger. See Shawn Burke's post for how to set that up.

Comments

  • Anonymous
    September 10, 2010
    Have you found an alternative? Is there a significant delta between Split(char), Split(char[]) and Split(String[])?

  • Anonymous
    September 10, 2010
    I found a couple of alternatives in the Performance Considerations section here: msdn.microsoft.com/.../tabh47cf.aspx (Since Microsoft has a habit of rearranging their website in link-demolishing ways, it's the MSDN documentation for: String.Split Method (String[], StringSplitOptions).)