.NET Case Study: Stackoverflow Exception when using a complex rowfilter
When you use very complex RowFilters or expressions on DataSets or DataTables, you may end up getting a stack overflow exception.
Eber was running into this and posted a comment here. Since it is something we see from time to time and it was a bit too long to answer in the comments, here is the why and the how...
Problem description:
When browsing an ASP.NET site, intermittently we get "Internet Explorer cannot display the webpage" accompanied by the following event in the eventlog:
Event Type: Error
Event Source: .NET Runtime 2.0 Error Reporting
Event Category: None
Event ID: 5000
Date: 2008-03-31
Time: 09:59:27
User: N/A
Computer: MYMACHINE
Description:
EventType clr20r3, P1 w3wp.exe, P2 6.0.3790.3959, P3 45d6968e, P4 system.data, P5 2.0.0.0, P6 471ebf27, P7 1840, P8 0, P9 system.stackoverflowexception, P10 NIL.
For more information, see Help and Support Center at https://go.microsoft.com/fwlink/events.asp.
Generating dumps:
Following the instructions in the stackoverflow lab, we end up with a log file where the last event/exception before the crash is the c00000fd stack overflow exception:
Mon Mar 31 10:14:11.557 2008 (GMT+2): (2994.2848): Stack overflow - code c00000fd (first chance)
---
--- 1st chance Stackoverflow exception ----
---------------------------------------------------------------
The next step is to get a memory dump on the actual exception. To do that, we set up a config file similar to the unknown.cfg that we used in the lab, but in this case we dump on the c00000fd exception instead, and save the file as SOF.cfg in the debuggers directory:
<adplus>
<settings>
<runmode> CRASH </runmode>
</settings>
<!-- defining and configuring exceptions -->
<exceptions>
<!-- First we redefine all exceptions -->
<config>
<code>AllExceptions</code>
<actions1>Log</actions1>
<actions2>MiniDump;Log;EventLog</actions2>
</config>
<newexception>
<code> c00000fd </code>
<name> Stackoverflow </name>
</newexception>
<!-- Configuring the custom exception -->
<config>
<code> c00000fd </code>
<actions1>FullDump;Log;EventLog</actions1>
<actions2>FullDump;Log;EventLog</actions2>
</config>
</exceptions>
</adplus>
Finally, we run this with adplus -pn w3wp.exe -c SOF.cfg, which generates a memory dump on the first chance stack overflow exception when we reproduce the issue.
Debugging the issue:
If we open up the dump file in windbg and run kb 2000 and !clrstack to see the native and .net stacks we see the following:
0:016> kb
ChildEBP RetAddr Args to Child
024f37d4 652857b0 00000000 00000200 06f37854 System_Data_ni!_load_config_used+0x114f8b
024f4230 652857b0 00000000 00000200 06f37854 System_Data_ni!_load_config_used+0x1140e0
024f4c8c 652857b0 00000000 00000200 06f37854 System_Data_ni!_load_config_used+0x1140e0
024f56e8 652857b0 00000000 00000200 06f37854 System_Data_ni!_load_config_used+0x1140e0
024f6144 652857b0 00000000 00000200 06f37854 System_Data_ni!_load_config_used+0x1140e0
...
0:016> .loadby sos mscorwks
0:016> !clrstack
OS Thread Id: 0x2848 (16)
ESP EIP
024f2dac 6528665b System.Data.BinaryNode.EvalBinaryOp(Int32, System.Data.ExpressionNode, System.Data.ExpressionNode, System.Data.DataRow, System.Data.DataRowVersion, Int32[])
So in the native stack (kb) there is clear evidence of recursion (a method calling itself in a recursive loop), but the symbols are not matching up since these are .NET methods.
The !clrstack output on the other hand only shows one frame System.Data.BinaryNode.EvalBinaryOp, so the recursion isn't showing up in there. Sometimes we can get a weird .net stack like this with !clrstack if we are in the middle of handling an exception which is exactly what is happening here.
I have mentioned the !dumpstack command before. This command dumps the raw stack without walking it to check that it is valid, so you should use it with caution, but in some cases like this one it may prove very useful in order to see more of the stack.
0:016> !dumpstack
OS Thread Id: 0x2848 (16)
Current frame: (MethodDesc 0x654910c8 +0x1b System.Data.BinaryNode.EvalBinaryOp(Int32, System.Data.ExpressionNode, System.Data.ExpressionNode, System.Data.DataRow, System.Data.DataRowVersion, Int32[]))
ChildEBP RetAddr Caller,Callee
024f37d4 652857b0 (MethodDesc 0x65491070 +0x1c System.Data.BinaryNode.Eval(System.Data.DataRow, System.Data.DataRowVersion)), calling (MethodDesc 0x654910c8 +0 System.Data.BinaryNode.EvalBinaryOp(Int32, System.Data.ExpressionNode, System.Data.ExpressionNode, System.Data.DataRow, System.Data.DataRowVersion, Int32[]))
024f37f0 65285a79 (MethodDesc 0x654910b8 +0x11 System.Data.BinaryNode.Eval(System.Data.ExpressionNode, System.Data.DataRow, System.Data.DataRowVersion, Int32[]))
024f37f8 6528b5b0 (MethodDesc 0x654910c8 +0x4f70 System.Data.BinaryNode.EvalBinaryOp(Int32, System.Data.ExpressionNode, System.Data.ExpressionNode, System.Data.DataRow, System.Data.DataRowVersion, Int32[])), calling (MethodDesc 0x654910b8 +0 System.Data.BinaryNode.Eval(System.Data.ExpressionNode, System.Data.DataRow, System.Data.DataRowVersion, Int32[]))
024f4230 652857b0 (MethodDesc 0x65491070 +0x1c System.Data.BinaryNode.Eval(System.Data.DataRow, System.Data.DataRowVersion)), calling (MethodDesc 0x654910c8 +0 System.Data.BinaryNode.EvalBinaryOp(Int32, System.Data.ExpressionNode, System.Data.ExpressionNode, System.Data.DataRow, System.Data.DataRowVersion, Int32[]))
024f424c 65285a79 (MethodDesc 0x654910b8 +0x11 System.Data.BinaryNode.Eval(System.Data.ExpressionNode, System.Data.DataRow, System.Data.DataRowVersion, Int32[]))
024f4254 6528b5b0 (MethodDesc 0x654910c8 +0x4f70 System.Data.BinaryNode.EvalBinaryOp(Int32, System.Data.ExpressionNode, System.Data.ExpressionNode, System.Data.DataRow, System.Data.DataRowVersion, Int32[])), calling (MethodDesc 0x654910b8 +0 System.Data.BinaryNode.Eval(System.Data.ExpressionNode, System.Data.DataRow, System.Data.DataRowVersion, Int32[]))
024f4c8c 652857b0 (MethodDesc 0x65491070 +0x1c System.Data.BinaryNode.Eval(System.Data.DataRow, System.Data.DataRowVersion)), calling (MethodDesc 0x654910c8 +0 System.Data.BinaryNode.EvalBinaryOp(Int32, System.Data.ExpressionNode, System.Data.ExpressionNode, System.Data.DataRow, System.Data.DataRowVersion, Int32[]))
024f4ca8 65285a79 (MethodDesc 0x654910b8 +0x11 System.Data.BinaryNode.Eval(System.Data.ExpressionNode, System.Data.DataRow, System.Data.DataRowVersion, Int32[]))
024f4cb0 6528b5b0 (MethodDesc 0x654910c8 +0x4f70 System.Data.BinaryNode.EvalBinaryOp(Int32, System.Data.ExpressionNode, System.Data.ExpressionNode, System.Data.DataRow, System.Data.DataRowVersion, Int32[])), calling (MethodDesc 0x654910b8 +0 System.Data.BinaryNode.Eval(System.Data.ExpressionNode, System.Data.DataRow, System.Data.DataRowVersion, Int32[]))
024f56e8 652857b0 (MethodDesc 0x65491070 +0x1c System.Data.BinaryNode.Eval(System.Data.DataRow, System.Data.DataRowVersion)), calling (MethodDesc 0x654910c8 +0 System.Data.BinaryNode.EvalBinaryOp(Int32, System.Data.ExpressionNode, System.Data.ExpressionNode, System.Data.DataRow, System.Data.DataRowVersion, Int32[]))
024f5704 65285a79 (MethodDesc 0x654910b8 +0x11 System.Data.BinaryNode.Eval(System.Data.ExpressionNode, System.Data.DataRow, System.Data.DataRowVersion, Int32[]))
024f570c 6528b5b0 (MethodDesc 0x654910c8 +0x4f70 System.Data.BinaryNode.EvalBinaryOp(Int32, System.Data.ExpressionNode, System.Data.ExpressionNode, System.Data.DataRow, System.Data.DataRowVersion, Int32[])), calling (MethodDesc 0x654910b8 +0 System.Data.BinaryNode.Eval(System.Data.ExpressionNode, System.Data.DataRow, System.Data.DataRowVersion, Int32[]))
024f6144 652857b0 (MethodDesc 0x65491070 +0x1c System.Data.BinaryNode.Eval(System.Data.DataRow, System.Data.DataRowVersion)), calling (MethodDesc 0x654910c8 +0 System.Data.BinaryNode.EvalBinaryOp(Int32, System.Data.ExpressionNode, System.Data.ExpressionNode, System.Data.DataRow, System.Data.DataRowVersion, Int32[]))
024f6160 65285a79 (MethodDesc 0x654910b8 +0x11 System.Data.BinaryNode.Eval(System.Data.ExpressionNode, System.Data.DataRow, System.Data.DataRowVersion, Int32[]))
024f6168 6528b5b0 (MethodDesc 0x654910c8 +0x4f70 System.Data.BinaryNode.EvalBinaryOp(Int32, System.Data.ExpressionNode, System.Data.ExpressionNode, System.Data.DataRow, System.Data.DataRowVersion, Int32[])), calling (MethodDesc 0x654910b8 +0 System.Data.BinaryNode.Eval(System.Data.ExpressionNode, System.Data.DataRow, System.Data.DataRowVersion, Int32[]))
...
0252f26c 6528d658 (MethodDesc 0x654841f8 +0x88 System.Data.DataExpression.Bind(System.Data.DataTable)), calling mscorwks!JIT_Writeable_Thunks_Buf+0x11f
0252f284 6520dbd2 (MethodDesc 0x65482ee0 +0x12 System.Data.DataView.SetIndex(System.String, System.Data.DataViewRowState, System.Data.IFilter)), calling (MethodDesc 0x65482ee8 +0 System.Data.DataView.SetIndex2(System.String, System.Data.DataViewRowState, System.Data.IFilter, Boolean))
0252f298 6520b8e7 (MethodDesc 0x65482d08 +0xb3 System.Data.DataView.set_RowFilter(System.String))
0252f2b0 0f0207e3 (MethodDesc 0x2c38dc0 +0x143 DataSetPage.Page_Load(System.Object, System.EventArgs))
0252f2cc 66f12980 (MethodDesc 0x66f1bcd0 +0x10 System.Web.Util.CalliHelper.EventArgFunctionCaller(IntPtr, System.Object, System.Object, System.EventArgs))
0252f2d8 6628efd2 (MethodDesc 0x66474328 +0x22 System.Web.Util.CalliEventHandlerDelegateProxy.Callback(System.Object, System.EventArgs)), calling (MethodDesc 0x66f1bcd0 +0 System.Web.Util.CalliHelper.EventArgFunctionCaller(IntPtr, System.Object, System.Object, System.EventArgs))
...
Here we can see that the recursive loop involves EvalBinaryOp calling Eval, which in turn calls EvalBinaryOp again. Note: in the stack it looks like EvalBinaryOp calls Eval, which calls Eval, which calls EvalBinaryOp. That is not what happens; the call tree actually looks like this
EvalBinaryOp
  -> Eval (left)
     -> EvalBinaryOp
        -> Eval (left) ...
        -> Eval (right) ...
  -> Eval (right)
     -> EvalBinaryOp
        -> Eval (left) ...
        -> Eval (right) ...
but because !dumpstack shows all the functions on the stack it looks like the above...
From the stack though we can see that what happens here is that DataSetPage.Page_Load sets the rowfilter on a DataView/DataTable, and in doing so it will bind the filter expression and start evaluating it.
If the filter looks like this: ID=0 or ID=1 or ID=2, it will do Eval(ID=0) or Eval(ID=1 or ID=2) => Eval(ID=0) or (Eval(ID=1) or Eval(ID=2)) etc. If the filter or expression is complex enough to contain a lot of operands, the recursion goes very deep, and with enough operands the stack space will be exhausted before the whole expression has been evaluated.
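To make the depth argument a bit more concrete, here is a minimal sketch in Python. It is a toy model, not the actual System.Data implementation: it assumes that an OR chain is parsed into a chain-shaped binary tree and that evaluating a node recurses into its children. The names Compare, Or, build_or_chain and max_eval_depth are made up for this illustration.

```python
from dataclasses import dataclass


@dataclass
class Compare:
    """Leaf node, standing in for a single 'ID = value' comparison."""
    value: int

    def eval(self, row_id: int, depth: int, stats: dict) -> bool:
        stats["max_depth"] = max(stats["max_depth"], depth)
        return row_id == self.value


@dataclass
class Or:
    """Binary node, standing in for 'left OR right'."""
    left: object
    right: object

    def eval(self, row_id: int, depth: int, stats: dict) -> bool:
        stats["max_depth"] = max(stats["max_depth"], depth)
        # Each child is evaluated one call level deeper than its parent.
        return (self.left.eval(row_id, depth + 1, stats)
                or self.right.eval(row_id, depth + 1, stats))


def build_or_chain(n_operands: int):
    """Build ((ID=0 OR ID=1) OR ID=2) ... iteratively (assumed tree shape)."""
    node = Compare(0)
    for value in range(1, n_operands):
        node = Or(node, Compare(value))
    return node


def max_eval_depth(n_operands: int) -> int:
    """Evaluate the chain for a row that matches nothing; return the deepest level reached."""
    stats = {"max_depth": 0}
    build_or_chain(n_operands).eval(-1, 1, stats)
    return stats["max_depth"]


if __name__ == "__main__":
    for n in (3, 10, 75):
        print(n, "operands -> deepest evaluation level:", max_eval_depth(n))
```

In this model the deepest evaluation level grows linearly with the number of operands, whichever side the chain nests on, so a long enough filter will eventually need more stack than the thread has.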
If we look at !dso to see the objects on the stack we can find the expression to see what it looks like...
0:016> !dso
OS Thread Id: 0x2848 (16)
ESP/REG Object Name
ebx 03035cc0 System.Data.BinaryNode
esi 03035be8 System.Data.BinaryNode
024f37c8 03035cc0 System.Data.BinaryNode
024f37cc 03035be8 System.Data.BinaryNode
024f37e4 06f37854 System.Data.DataRow
024f37e8 03035bd0 System.Data.BinaryNode
024f37ec 03035af8 System.Data.BinaryNode
024f37f0 06eb07c4 System.DBNull
024f3824 06eb07c4 System.DBNull
024f3828 06eb07c4 System.DBNull
024f382c 03035cd8 System.Data.BinaryNode
024f4224 03035db0 System.Data.BinaryNode
024f4228 03035cd8 System.Data.BinaryNode
024f4240 06f37854 System.Data.DataRow
024f4244 03035cc0 System.Data.BinaryNode
024f4248 03035be8 System.Data.BinaryNode
024f424c 06eb07c4 System.DBNull
...
0252f278 0301ad3c System.Data.DataExpression
0252f280 030179d8 System.String ID=0 OR ID=1 OR ID=2 OR ID=3 OR ID=4 OR ID=5 OR ID=6 OR ID=7 OR ID=8 OR ID=9 OR ID=10 OR ID=11 OR ID=12 OR ID=13 OR ID=14 OR ID=15 OR ID=16 OR ID=17 OR ID=18 OR ID=19 OR ID=20 OR ID=21 OR ID=22 OR ID=23 OR ID=24 OR ID=25 OR ID=26 OR ID=27 OR ID=28 OR ID=29 OR ID=30 OR ID=31 OR ID=32 OR ID=33 OR ID=34 OR ID=35 OR ID=36 OR ID=37 OR ID=38 OR ID=39 OR ID=40 OR ID=41 OR ID=42 OR ID=43 OR ID=44 ...
...
0252f55c 06f14538 ASP.datasetpage_aspx
0252f560 06f14538 ASP.datasetpage_aspx
0252f574 02ebcab0 System.AsyncCallback
0252f578 06f14538 ASP.datasetpage_aspx
0252f5b0 06f122ec System.Web.HttpContext
0252f5c8 02ebcab0 System.AsyncCallback
...
We could have dumped out the DataExpression object by doing !do 0301ad3c, but it is really not necessary here since the filter expression is already printed out in the string listed right after the DataExpression (the ID=0 or ID=1 or...)
Resolution and final words:
I could get up to around 75 operands before getting a stackoverflow but mileage may vary as it depends on the size of the objects on the stack and how deep the callchain is before you get to the expression evaluation.
Typically when I get these issues the expression wasn't meant to be that long, so there was some logic issue causing the app to add on more and more filters. If, however, you really need an extremely long expression, you may want to spawn off a new thread (with a larger stack) to perform the setting of the rowfilter.
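As a rough illustration of the larger-stack idea, here is a Python sketch. It is an analogy rather than .NET code: run_with_big_stack and chain_depth are hypothetical names, and the 64 MB stack size and the recursion limit are arbitrary values picked for the demo. In .NET the corresponding knob would be the Thread constructor overload that takes a maxStackSize argument.

```python
import sys
import threading


def run_with_big_stack(func, *args, stack_bytes=64 * 1024 * 1024,
                       recursion_limit=50_000):
    """Run func(*args) on a dedicated thread that is created with a larger stack."""
    result = {}

    def target():
        try:
            result["value"] = func(*args)
        except BaseException as exc:  # hand any failure back to the caller
            result["error"] = exc

    old_limit = sys.getrecursionlimit()
    # stack_size() only affects threads created after the call and
    # returns the previous setting, so it can be restored afterwards.
    old_stack = threading.stack_size(stack_bytes)
    sys.setrecursionlimit(recursion_limit)
    try:
        worker = threading.Thread(target=target)
        worker.start()
        worker.join()
    finally:
        threading.stack_size(old_stack)
        sys.setrecursionlimit(old_limit)

    if "error" in result:
        raise result["error"]
    return result["value"]


def chain_depth(n: int) -> int:
    """Deliberately recursive stand-in for evaluating an n-operand filter."""
    return 1 if n <= 1 else 1 + chain_depth(n - 1)


if __name__ == "__main__":
    # Deeper than Python's default recursion limit would allow.
    print(run_with_big_stack(chain_depth, 10_000))
```

The design choice is the same as in the suggestion above: the deep work is isolated on its own thread so that only that thread pays for the bigger stack, and the rest of the process keeps the defaults.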
Laters,
Tess
Comments
Anonymous
March 31, 2008
Thanks a lot Tess, I really appreciate it. I'll try it out today. However, I'm curious why this only happens on the servers (2003 SP2, with much more memory and power than the dev laptops) and not on any of the development machines; on the dev laptops it actually runs quite fast. Any clues? Again, muchas gracias.
Anonymous
March 31, 2008
aah... this might answer my question http://blogs.msdn.com/tom/archive/2008/03/31/stack-sizes-in-iis-affects-asp-net.aspx
Anonymous
March 31, 2008
I just had a stack overflow problem a few days ago in my asp.net app. It may have to do with this. Thanks, Josh http://riverasp.net
Anonymous
March 31, 2008
Eber, yeah, that is probably the reason...