Every little bit counts.
It's been a while since I posted, and I thought this was kinda interesting.
Here was an odd one. Whenever the customer did "x", it crashed his machine.
Examining his dump, I see that we crash here:
(154.16d0): Access violation - code c0000005 (!!! second chance !!!)
eax=00000000 ebx=0007def4 ecx=00000004 edx=00000010 esi=0000022e edi=0007df0c
eip=72636282 esp=0007de58 ebp=7267de68 iopl=0 nv up ei pl nz ac pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00010212
suchandsuch!finefunction+0x12a:
72636282 f3a5 rep movsd ds:0000022e=???????? es:0007df0c=00000000
I have edited some addresses and function names just because I am paranoid about what I can and can't post here (not that it's all super secret, there are public symbols after all). Anywho... on we go.
Turns out we passed a param all the way up through 5 or 6 functions, but the original value was incorrect.
It was pushing a bad value via EBX here:
6675 6ee5d46 8d9a04020000 lea ebx,[edx+0x204]
Studying the surrounding assembly, edx was not even related to what we ought to have passed.
After banging my head as to why in the world it would pass this, I unassembled the same code on a test machine I have:
(bad) 6675 6ee5d46 8d9a04020000 lea ebx,[edx+0x204] -- 22e
(good) 6675 6ee5d46 8d9e04020000 lea ebx,[esi+0x204]
This is odd. I have the exact same binary on my machine:
Timestamp: Thu Mar 24 18:30:34 2005 (424377CA) - mine
Timestamp: Thu Mar 24 18:30:34 2005 (424377CA) - his
8d9a04020000 (bad)
8d9e04020000 (good)
100011011001101000000100000000100000000000000000 (bad)
100011011001111000000100000000100000000000000000 (good)
We were one bit off.
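The same comparison can be done mechanically. Here is a minimal Python sketch (the helper names `flipped_bits` and `decode_modrm` are mine, not part of any tool used above) that XORs the two encodings and then splits the second byte, the ModRM byte of the `lea`, into its fields. Assuming standard 32-bit x86 encoding, the flipped bit lands in the r/m field, which would explain why the base register changed from esi to edx.

```python
# Standard 32-bit x86 register numbering used in ModRM fields.
REGS = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]

def flipped_bits(a_hex, b_hex):
    """Return (byte_index, bit_index) for every bit that differs."""
    a, b = bytes.fromhex(a_hex), bytes.fromhex(b_hex)
    if len(a) != len(b):
        raise ValueError("encodings must be the same length")
    return [(i, bit)
            for i, (x, y) in enumerate(zip(a, b))
            for bit in range(8)
            if (x ^ y) & (1 << bit)]

def decode_modrm(byte):
    """Split a ModRM byte into (mod, reg name, r/m name).

    Simplification: r/m == 4 with mod != 3 really means "SIB byte
    follows", which this sketch does not handle.
    """
    return byte >> 6, REGS[(byte >> 3) & 7], REGS[byte & 7]

bad = "8d9a04020000"   # lea ebx,[edx+0x204]
good = "8d9e04020000"  # lea ebx,[esi+0x204]

print(flipped_bits(bad, good))               # [(1, 2)]
print(decode_modrm(bytes.fromhex(bad)[1]))   # (2, 'ebx', 'edx')
print(decode_modrm(bytes.fromhex(good)[1]))  # (2, 'ebx', 'esi')
```

So the only difference is bit 2 of byte 1: 0x9a versus 0x9e.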
Names, timestamps, and versions all matched what ought to be in this binary, except that this single bit didn't match.
At first I thought that perhaps this was some bad hardware munging the data (or something wonky like that), so I requested another dump. It was crashing in the exact same spot, for the same reason.
I requested that the customer send me his binary, and I hashed a known good copy and the customer's binary:
GetHash.exe /f:bad.dll /h:cat
FA20C7B5B90689123BE5C67EDD86B0E07BB8941F (bad)
GetHash.exe /f:good.dll /h:cat
385A5FE6FA19EB7EACE3EFB08DF0B3835D1C9B88 (good)
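For anyone without that tool handy, here is a rough Python sketch of the same kind of check. I'm assuming a plain SHA-1 over the whole file, which may well not be what GetHash.exe computes with /h:cat, so don't expect it to reproduce the digests above. The function names and the throwaway demo files are made up for illustration; the demo just reuses the six instruction bytes from earlier in place of the real DLLs.

```python
import hashlib
import os
import tempfile

def sha1_of_file(path, chunk_size=65536):
    """Hash a file in chunks so large binaries need not fit in memory."""
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest().upper()

def differing_offsets(path_a, path_b):
    """List byte offsets where the two files differ.

    Only the common prefix is compared if the lengths differ.
    """
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        a, b = fa.read(), fb.read()
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]

# Demo on two stand-in files (hypothetical names, not the real binaries).
with tempfile.TemporaryDirectory() as tmp:
    good_path = os.path.join(tmp, "good.bin")
    bad_path = os.path.join(tmp, "bad.bin")
    with open(good_path, "wb") as f:
        f.write(bytes.fromhex("8d9e04020000"))
    with open(bad_path, "wb") as f:
        f.write(bytes.fromhex("8d9a04020000"))
    hashes_match = sha1_of_file(good_path) == sha1_of_file(bad_path)
    offsets = differing_offsets(good_path, bad_path)

print(hashes_match)  # False
print(offsets)       # [1]
```

A hash mismatch only says the files differ; the offset list is what points at where to look.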
Obviously something was wrong here. I had the customer reapply the version from the fix, and everything cleared up after that.
Odd. I still haven't decided how this may have happened. Any ideas? Malware?
Comments
- Anonymous
September 17, 2005
Cosmic rays (various heavy particles from space) can actually flip bits. I read that on average, 1 bit per year is randomly flipped in your average consumer quality computer. No idea if that estimate is accurate, though.
- Anonymous
September 17, 2005
I've heard that as well, but I don't think the HW was to blame here, seeing as how we had multiple dumps with the same crash data and the file itself seemed altered.
- Anonymous
September 17, 2005
Something much like this happened to me. One of my machines suddenly could not compile our source code. It turned out that in one of the MFC header files a single letter had spontaneously changed from capital to lowercase, a flip of one bit. It's surprising considering the error-checking that media use, but I guess every so often something slips by.
- Anonymous
October 22, 2005
I tend to agree with you Luke...
- Anonymous
June 21, 2006
Another odd one…
 
More than a bit this time…
 
Here is something to think about – and...