Every little bit counts.

It's been a while since I posted, and I thought this one was kinda interesting.

Here was an odd one. Whenever the customer did "x", it crashed his machine.
Examining his dump, I could see that we crashed here:

(154.16d0): Access violation - code c0000005 (!!! second chance !!!)
eax=00000000 ebx=0007def4 ecx=00000004 edx=00000010 esi=0000022e edi=0007df0c
eip=72636282 esp=0007de58 ebp=7267de68 iopl=0         nv up ei pl nz ac pe nc
cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00010212
suchandsuch!finefunction+0x12a:
72636282  f3a5            rep  movsd ds:0000022e=???????? es:0007df0c=00000000

I have edited some addresses and function names because I am paranoid about what I can and can't post here (not that it's all super secret; there are public symbols, after all). Anywho... on we go.

It turned out we had passed a parameter all the way up through five or six functions, but the original value was incorrect.
The bad value was being pushed via EBX, which was loaded here:

6675 6ee5d46 8d9a04020000     lea     ebx,[edx+0x204]

Studying the surrounding assembly, EDX was not even related to what we ought to have passed.
After banging my head over why in the world it would pass this, I unassembled the same function on a test machine I have:
 
(bad)   6675 6ee5d46 8d9a04020000     lea     ebx,[edx+0x204]    -- 22e
(good)  6675 6ee5d46 8d9e04020000     lea     ebx,[esi+0x204]

This is odd. I have the exact same binary on my machine:

    Timestamp:        Thu Mar 24 18:30:34 2005 (424377CA) - mine
    Timestamp:        Thu Mar 24 18:30:34 2005 (424377CA) - his

8d9a04020000    (bad)
8d9e04020000    (good)

100011011001101000000100000000100000000000000000    (bad)
100011011001111000000100000000100000000000000000    (good)

We were one bit off.
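
To make the one-bit claim concrete, here is a minimal sketch that XORs the two encodings above and decodes the byte that changed. The helper names (`differing_bits`, `decode_modrm`) are mine, not from any debugger, and the register table is a simplification: it ignores the SIB special case and only covers what this particular `lea` needs.

```python
# The two instruction encodings from the disassembly above.
bad = bytes.fromhex("8d9a04020000")   # lea ebx,[edx+0x204]
good = bytes.fromhex("8d9e04020000")  # lea ebx,[esi+0x204]

def differing_bits(a, b):
    """Return (byte_index, bit_index) for every differing bit; bit 0 is the LSB."""
    if len(a) != len(b):
        raise ValueError("inputs must be the same length")
    found = []
    for i, (x, y) in enumerate(zip(a, b)):
        diff = x ^ y  # set bits mark the positions that differ
        for bit in range(8):
            if diff & (1 << bit):
                found.append((i, bit))
    return found

# 32-bit general-purpose registers in x86 encoding order.
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]

def decode_modrm(byte):
    """Split a ModRM byte into (mod, reg name, rm name). Simplified: no SIB handling."""
    mod = byte >> 6
    reg = (byte >> 3) & 7
    rm = byte & 7
    return mod, REG32[reg], REG32[rm]

diffs = differing_bits(bad, good)
print(diffs)                   # [(1, 2)]
print(decode_modrm(bad[1]))    # (2, 'ebx', 'edx')
print(decode_modrm(good[1]))   # (2, 'ebx', 'esi')
```

The only difference is bit 2 of the second byte (0x9a vs 0x9e). That byte is the ModRM byte, and the flipped bit sits in the r/m field, which is exactly what turns the base register from esi (110) into edx (010).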

Names, timestamps, and versions all matched what ought to be in this binary, except that this single bit didn't match.
At first I thought that perhaps some bad hardware was munging the data (or something wonky like that), so I requested another dump. It crashed in the exact same spot, for the same reason.

I asked the customer to send me his binary, then hashed a known-good copy and the customer's binary:

GetHash.exe /f:bad.dll /h:cat
FA20C7B5B90689123BE5C67EDD86B0E07BB8941F (bad)

GetHash.exe /f:good.dll /h:cat
385A5FE6FA19EB7EACE3EFB08DF0B3835D1C9B88 (good)

Obviously something was wrong here. I had the customer reapply the version from a fix, and everything cleared up after that.
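
If you want to go one step past "the hashes differ" and see where two copies of a file disagree, a byte-by-byte comparison does it. This is a rough sketch, not what GetHash.exe does: it uses a plain SHA-1 from Python's `hashlib`, so its digests will not match the catalog hashes shown above, and the sample bytes are just the two instruction encodings standing in for real file contents.

```python
import hashlib

def sha1_hex(data: bytes) -> str:
    """Plain SHA-1 of the raw bytes, as uppercase hex."""
    return hashlib.sha1(data).hexdigest().upper()

def diff_offsets(a: bytes, b: bytes):
    """Return (offset, byte_a, byte_b) for each position where the inputs differ."""
    if len(a) != len(b):
        raise ValueError("inputs differ in length")
    return [(i, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]

# Stand-in data; for real files, read them with open(path, "rb").read().
good = bytes.fromhex("8d9e04020000")
bad = bytes.fromhex("8d9a04020000")

print("hashes match:", sha1_hex(good) == sha1_hex(bad))
for offset, g, b in diff_offsets(good, bad):
    # The XOR column shows which bits flipped in that byte.
    print(f"offset {offset:#x}: good={g:02x} bad={b:02x} xor={g ^ b:02x}")
```

A single differing offset with a power-of-two XOR value is the signature of a one-bit flip, which fits what the disassembly showed.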

Odd. I still haven't decided how this may have happened. Any ideas? Malware?

Comments

  • Anonymous
    September 17, 2005
    Cosmic rays (various heavy particles from space) can actually flip bits. I read that on average, 1 bit per year is randomly flipped in your average consumer quality computer. No idea if that estimate is accurate, though.
  • Anonymous
    September 17, 2005
    I've heard that as well, but I don't think the HW was to blame here, seeing as how we had multiple dumps with the same crash data and the file itself seemed altered.
  • Anonymous
    September 17, 2005
    Something much like this happened to me. One of my machines suddenly could not compile our source code. It turned out that in one of the MFC header files a single letter had spontaneously changed from capital to lowercase, a flip of one bit. It's surprising considering the error-checking that media use, but I guess every so often something slips by.
  • Anonymous
    October 22, 2005
    I tend to agree with you Luke...
  • Anonymous
    June 21, 2006
    Another odd one…
     
    More than a bit this time…
     
    Here is something to think about – and...