Be aware: a new section in my blog
Periodically I bump into unusual behavior or features of a compiler or operating system, or strange design and coding decisions made by someone else, that make me wonder. After giving it some thought, I often come to the conclusion that the behavior or the specifics I observed are justifiable. However, if I had known about them up front, it would have helped me avoid unnecessary mistakes and complexity.
Today I would like to open a new section in my blog: Be Aware. In this section I would like to spend some time on this exact subject: strange behaviors and unexpected features. Sometimes these kinds of things are fascinating, sometimes just interesting, and sometimes just naïve. My goal is to point them out so that when you make your own decisions you will have more information to rely on. My goal is not to uncover bugs or start religious debates; I really don't like them. I strongly believe that the problem drives the solution. What I mean by this is that something outrageous in one place can be fully applicable in another.
So let's begin :-).
The topic of variable placement has been covered to its full extent by now. Some developers prefer to declare all variables up front, others in the scope where the variables actually get used. Personally, I have heard pros and cons for both sides and kind of agree with both. I think there is no black and white. In some cases you do have to declare a variable inside a given scope. Recently I observed interesting debugger behavior while debugging dumps. As all of you know, a debugger usually has difficulty showing local variables for a retail, optimized build. This is kind of expected in many cases. Before you can rely on a value, you have to confirm that the value is really correct. In windbg I usually use the dv command to output local variables:
> dv -V
Using the command with -V lists the actual placement of each variable on the stack. What I noticed is that windbg wasn't displaying variables declared in local scopes at all, even when the instruction pointer was inside the scope. It just doesn't list them! So I usually end up chasing those variables by disassembling the code and following the stack manipulations. (Knowing the original stack assignment in these cases would really have helped.) The process works 100% of the time, but I would have saved hours if the code had declared its variables at the beginning of the function. At least the debugger would have given me a clue where each variable resides on the stack.
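To make the two declaration styles concrete, here is a minimal sketch. The function names are hypothetical and exist only for illustration; both variants compute the same result, and whether a debugger can show k and m in either of them will depend on the compiler and the optimization level used.

```cpp
#include <cassert>
#include <cstdlib>

// Variant A: locals declared in nested scopes, next to their first use.
int sum_nested_scopes()
{
    int i = std::rand() % 1000;
    int total = i;
    {
        int k = std::rand() % 1000;     // visible only inside this block
        total += k;
        {
            int m = std::rand() % 1000; // visible only inside the innermost block
            total += m;
        }
    }
    return total;
}

// Variant B: the same locals, all declared at the top of the function.
int sum_function_scope()
{
    int i = 0;
    int k = 0;
    int m = 0;

    i = std::rand() % 1000;
    k = std::rand() % 1000;
    m = std::rand() % 1000;
    return i + k + m;
}

// Reseed before each call so both variants consume the same rand() sequence.
bool variants_agree(unsigned seed)
{
    std::srand(seed);
    int nested = sum_nested_scopes();
    std::srand(seed);
    int flat = sum_function_scope();
    return nested == flat;
}
```

A caller could confirm that the two styles are equivalent with assert(variants_agree(1)); the only difference is how much scope information the debugger has to untangle.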
Probably the point here is that avoiding unnecessary complexity can enable tools to do a better job and make everyone's life easier. Remember that tools, like any software, are well tested for the common case. So the next time you declare a variable in an inner scope, just stop for a second and think: is it really necessary?
I have more stories around complexity but I will keep them until next time :-).
Comments
- Anonymous
February 24, 2005
Hi Slava,
First of all, great blog, please keep posting!
Now on the matter at hand: I don’t want to get into the ‘declare locals on top vs. declare them when you need them’ debate, but the point you brought up is not valid. Which locals you’ll see in windbg depends on which locals are left on the stack after the optimizations, not on where they are declared.
Here is an example: (limiting the discussion to windbg and ms c++ compiler)
#include <stdlib.h>
int main()
{
    int i = rand();
    int j = rand();
    {
        int k = rand();
        int l = rand();
        {
            int m = rand();
            int n = rand();
            {
                int o = rand();
                int p = rand();
                {
                    int q = i + j + k + l + m + n + o + p;
                    return q;
                }
            }
        }
    }
}
Now if you compile this with /Od windbg will follow the scope and display the locals as you would expect. If you compile the same thing with /O2 things are different.
Here is what I’m seeing with my version of the compiler:
0:000> dv -V
0012fee0 @ebp-0x04 m = 1245120
0012fedc @ebp-0x08 n = 29694656
0012fed8 @ebp-0x0c o = 2147348480
PS: even funnier in this case is that the generated code actually uses ebp for one of the locals (‘l’ I think), so even if m, n, o are on the stack the debugger is unable to show them:
0:000> dv -V
00006780 @ebp-0x04 m = <Memory access error>
0000677c @ebp-0x08 n = <Memory access error>
00006778 @ebp-0x0c o = <Memory access error>
- Anonymous
February 24, 2005
Lemo, you are exactly correct. Yes, it definitely depends on the optimizations the compiler decides to make. For example, if a local variable is used in the first part of the function and never touched later, the compiler can keep it in a register and never allocate memory for it on the stack. Another common optimization that I have seen multiple times: the compiler uses the same stack location for different variables.
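A minimal sketch of the kind of code where this can happen, assuming a hypothetical function named disjoint_lifetimes: the two locals are never live at the same time, so an optimizer is free to keep both in one register or one stack slot. Nothing here forces that choice; it only makes it possible.

```cpp
#include <cassert>

// Hypothetical example: `first` is dead after `result` is computed,
// and `second` only becomes live after that point.
int disjoint_lifetimes(const int* data, int count)
{
    int first = 0;
    for (int idx = 0; idx < count; ++idx)
        first += data[idx];          // sum of the elements
    int result = first * 2;          // last use of `first`

    int second = 0;                  // may reuse the storage that held `first`
    for (int idx = 0; idx < count; ++idx)
        second += data[idx] * data[idx];   // sum of the squares

    return result + second;
}
```

In a dump taken inside the second loop, the value of first may simply no longer exist anywhere, no matter where it was declared.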
From my personal experience, especially with functions containing lots of locals, the compiler usually optimizes out locals in an inner scope rather than in the outer one. I guess this happens because the isolated use of inner variables lets the compiler make better decisions. However, I am not a compiler guru; this is only speculation at this point.
Your example is interesting: the compiler actually decided to use registers edi and ebx for i and j. I think it is mostly due to the simplicity of the function.
0:000> uf main
test!main [e:yukonmainlinesqlntdbmsdbgtoolstestrepro2.cpp @ 5]:
5 01001180 55 push ebp
5 01001181 8bec mov ebp,esp
5 01001183 83ec14 sub esp,0x14
5 01001186 53 push ebx
5 01001187 56 push esi
6 01001188 8b3500100001 mov esi,[test!_imp__rand (01001000)]
6 0100118e 57 push edi
6 0100118f ffd6 call esi
6 01001191 8bf8 mov edi,eax <==== this is i
7 01001193 ffd6 call esi
7 01001195 8bd8 mov ebx,eax <==== this is j
10 01001197 ffd6 call esi
10 01001199 8945ec mov [ebp-0x14],eax
11 0100119c ffd6 call esi
11 0100119e 8945f0 mov [ebp-0x10],eax
14 010011a1 ffd6 call esi
14 010011a3 8945f4 mov [ebp-0xc],eax
15 010011a6 ffd6 call esi
15 010011a8 8945f8 mov [ebp-0x8],eax
18 010011ab ffd6 call esi
18 010011ad 8945fc mov [ebp-0x4],eax
19 010011b0 ffd6 call esi
19 010011b2 0345fc add eax,[ebp-0x4]
23 010011b5 0345f8 add eax,[ebp-0x8]
23 010011b8 0345f4 add eax,[ebp-0xc]
23 010011bb 0345f0 add eax,[ebp-0x10]
23 010011be 0345ec add eax,[ebp-0x14]
23 010011c1 03c3 add eax,ebx
23 010011c3 03c7 add eax,edi
23 010011c5 5f pop edi
23 010011c6 5e pop esi
23 010011c7 5b pop ebx
30 010011c8 8be5 mov esp,ebp
30 010011ca 5d pop ebp
30 010011cb c3 ret
In my case I used /Ox, and I could actually see the variables in the debugger:
0:000> dv
l = 26500
k = 6334
m = 19169
n = 15724
o = 11478
0:000> dv -V
000aff74 @ebp-0x10 l = 26500
000aff70 @ebp-0x14 k = 6334
000aff78 @ebp-0x0c m = 19169
000aff7c @ebp-0x08 n = 15724
000aff80 @ebp-0x04 o = 11478