What's wrong with this code, part 12 - Retro Bad Code
I was at a security tech talk last week discussing some fascinating stuff, and it reminded me of an interview question that my manager used to give to people who said that they understood x86 Assembly language.
I realized that it would make an interesting "retro" "what's wrong with this code" so, here goes.
When you're writing code for low level platforms (REALLY low level platforms), one of the first things you need to start handling is processor interrupts. On the x86 family of processors, when an interrupt is generated (either hardware or software), the processor pushes a trap frame onto the current stack containing the flags, CS, and IP (I may have the CS and IP backwards, check your processor manual).
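For reference, here is a sketch of that frame for the plain 16-bit real-mode case (an assumption about the mode being discussed; protected mode and privilege changes push more). The FRAME_* names are hypothetical and exist only for illustration:

```asm
; Stack as seen by the handler on entry (16-bit real mode).
; The processor pushes FLAGS first, then CS, then IP, so IP ends up
; at the lowest address:
;
;   SS:[SP+4]   FLAGS   caller's flags (IF and TF are cleared after the push)
;   SS:[SP+2]   CS      return segment
;   SS:[SP+0]   IP      return offset
;
; IRET pops the three words back in the order IP, CS, FLAGS.

FRAME_IP     EQU 0      ; hypothetical offsets from SP at handler entry
FRAME_CS     EQU 2
FRAME_FLAGS  EQU 4
```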
Your code is now executing, but you're running on the application's stack. So the very first thing that has to happen in your interrupt handler is that you've got to switch to your own stack (you need to do this because you don't know how much memory is remaining on the user stack; if you overflow the user's stack, you've got problems).
So you'd have to write code that switches from the user's stack to your kernel mode stack.
My boss used to use an example that was the system call dispatcher for a theoretical operating system, which maintained a 32-bit pointer to the kernel mode stack in a global variable. I'll continue that tradition.
<code to establish DS>
MOV [Saved_SS], SS
MOV [Saved_SP], SP
MOV SP, WORD PTR [Dos_Stack]
MOV SS, WORD PTR [Dos_Stack]+2
<code to dispatch off of the value in AH>
The problem is that there's a MASSIVE bug in this code. And it's not because of the use of global variables for the saved SS and SP - assume that the reentrancy issues are handled in the <code to establish DS> section above.
Comments
- Anonymous
May 13, 2005
A nice easy one then.
Setting SP before SS. If an interrupt happens at this point, then SS:SP points into an effectively random memory location and may well clobber the calling process's stack or data segment.
The sequence should always be SS then SP, which disables interrupts until after the next instruction.
- Anonymous
May 13, 2005
Thinking about this reminds me of when I was writing some BCD divide and multiply code and snaffled the SS register so that I could use [di+bp] and have it point to the data segment without the overhead of segment override prefixes.
The code turned off interrupts while I did this and was hand tuned for efficiency on a traditional 8086, and it is the only code I've ever used xlat in (which meant bx wasn't available).
- Anonymous
May 13, 2005
It overwrites the stack at Dos_Stack.
- Anonymous
May 13, 2005
Shouldn't BP also be set to something?
- Anonymous
May 13, 2005
As far as I remember (it was a long time ago :) ) if you use SP or BP the default addressing will be SS:SP or SS:BP. So, if I’m right, you will store the stack pointer in the application stack :)
And, of course, the correct way would be to load SS, and only then SP.
- Anonymous
May 13, 2005
No, I am wrong. We do not use SP in addressing here... Should be more attentive...
- Anonymous
May 13, 2005
Well, what about 64-bit architectures? If we use WORD PTR to get the value from memory, we will end up with only part of the register's value.
- Anonymous
May 13, 2005
Since this is 16 bit code, I'm not worried about what happens when it's run on a 64bit processor...
- Anonymous
May 13, 2005
I think the problem will show up if another interrupt (one with a higher priority) happens between the last two MOV instructions like so:
MOV SP, WORD PTR [Dos_Stack]
<interrupt here>
MOV SS, WORD PTR [Dos_Stack]+2
In that case the stack would get corrupted.
Cannot wait for the answers ... :)
- Anonymous
May 13, 2005
The comment has been removed
- Anonymous
May 13, 2005
The comment has been removed
- Anonymous
May 14, 2005
I agree with two of the other posters. Setting SP and SS at different times will cause all sorts of problems. Interrupts have a magical habit of finding the worst place to happen.
- Anonymous
May 14, 2005
Centaur, that doesn't work. Edge got it right.
The problem is that, even if you use CLI to disable interrupts, there's still the NMI. That interrupt can still happen even when interrupts are disabled.
Using "LSS SP,..." loads both SS and SP in one instruction, so the NMI will happen either before both are set or after both are set. That is the whole purpose of the LSS instruction.
- Anonymous
May 14, 2005
The comment has been removed
- Anonymous
May 14, 2005
Then again, Larry did mention that reentrancy was somehow handled already. OK, I'll shut up now.
- Anonymous
May 14, 2005
You have to do the MOV SS before the MOV SP. Otherwise, you can get an interrupt between the MOV SP and the MOV SS, and then you're running on a stack in a random location in memory. Doing the MOV SS first prevents this, because the processor suppresses interrupts for one instruction after a MOV SS just for this reason.
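A minimal sketch of the reordered sequence, reusing the Saved_SS, Saved_SP, and Dos_Stack names from the post and assuming DS has already been established as in the original:

```asm
        MOV  [Saved_SS], SS                 ; remember the caller's stack
        MOV  [Saved_SP], SP
        MOV  SS, WORD PTR [Dos_Stack]+2     ; segment first: interrupts are held off
                                            ; for the one instruction that follows
        MOV  SP, WORD PTR [Dos_Stack]       ; offset second, inside that window
```

On a 386 or later, the LSS approach mentioned in the other comments replaces the last two instructions with a single load. This is only a sketch; it assumes the assembler has 386 instructions enabled (for example, a .386 directive in MASM) and that Dos_Stack holds the offset in its low word and the segment in its high word, as the original code implies:

```asm
        LSS  SP, DWORD PTR [Dos_Stack]      ; SP <- [Dos_Stack], SS <- [Dos_Stack]+2
                                            ; in a single instruction
```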
Everybody knows that, right? Or am I showing my age...
- Anonymous
May 14, 2005
If it sounds like I didn't notice all the previous comments, that's because I didn't, until afterward... :-)
- Anonymous
May 14, 2005
I agree, MOV SP before MOV SS is definitely a bug (long time ago I made the same error myself) but what about the interrupt flag?
I seem to remember interrupts are disabled in an interrupt handler (unless an STI has been issued of course).
So the question of the correct MOV SP/SS order might be moot...
- Anonymous
May 14, 2005
@ JCAB:
I don't think that's an issue (the re-entrancy ideas you presented). After SS:(E)SP have been set, any future interrupt would use this newly created stack (assuming the first thing it did wasn't to move to its own stack as well). You'll note that Larry's interrupt code saves the value of SS:(E)SP and then presumably restores those registers before exiting. If another interrupt took place after you'd set SS:(E)SP, that code would likely save your changes and restore them before returning back to your code. (If it didn't, I'd consider that code to be buggy and outside the scope of this discussion.)
Make sense?
As for the issue of setting SS then (E)SP blocking interrupts, the bit I quoted above is from Intel's Pentium 4 reference manual (and I think that passage has remained the same as far back as the 286, if not earlier). I do agree that LSS is probably a much better solution for modern code though.
- Anonymous
May 15, 2005
I agree. From the Intel 80386 programmer manual:
9.2.4 MOV or POP to SS Masks Some Interrupts and Exceptions
Software that needs to change stack segments often uses a pair of
instructions; for example:
MOV SS, AX
MOV ESP, StackTop
If an interrupt or exception is processed after SS has been changed but
before ESP has received the corresponding change, the two parts of the stack
pointer SS:ESP are inconsistent for the duration of the interrupt handler or
exception handler.
To prevent this situation, the 80386, after both a MOV to SS and a POP to
SS instruction, inhibits NMI, INTR, debug exceptions, and single-step traps
at the instruction boundary following the instruction that changes SS. Some
exceptions may still occur; namely, page fault and general protection fault.
Always use the 80386 LSS instruction, and the problem will not occur.
- Anonymous
May 16, 2005
MOV SS, ... only disables interrupts [just for one single instruction after it] on modern processors. Certainly it didn't on the 8088/8086. The change came in either with the 186 or the 286 - not sure which.
- Anonymous
May 16, 2005
Larry wrote:
" Since this is 16 bit code, I'm not worried about what happens when it's run on a 64bit processor..."
This reminded me of a movie.
"Be afraid. Be very afraid..."
If somehow any of my 16-bit x86 code managed to run on any 64-bit CPU, I'd think there are far more serious matters to worry about. For example, the fact THAT it did run. :-)
- Anonymous
May 16, 2005
In the last article, I looked at a prototype code snippet to enter a system call.
But the code had a...