Who is restarting my server?
Hello - This is Omer and I recently came across a case where the customer reported that they could not reboot into safe mode using their custom image. Whenever they booted into safe mode, the machine would get to the logon screen, wait for 5 seconds and then reboot regardless of any user input. Nothing was being logged in the event logs either, so it was very strange.
At first it looked like the machine was going through a power cycle, since the shutdown was so quick (we would not see the usual shutdown messages like “Shutting down Services”, etc.). I thought maybe there was some issue with the hardware, but the customer reported that they had the same issue on every machine, regardless of the hardware vendor.
To figure this out, I attached a kernel debugger to the machine and broke in to make sure the connection was good. I then let the machine go, and it got to the logon screen. Sure enough, after 5 seconds the machine rebooted. I expected to hit some kind of exception that would make the debugger break in, but nothing of the sort happened. The only message that I got was the following:
Shutdown occurred at (Fri Jun 26 17:27:12.714 2009 (GMT-7))...unloading all symbol tables.
Very strange! The OS disconnected the debugger gracefully. I did a quick source code review and found that one of the places where we disconnect the debugger is in the system shutdown path. Maybe the OS was shutting down gracefully, but since it happened so fast, it looked like a power cycle. To test my theory, I put a breakpoint on nt!NtShutdownSystem to see whether it was being called and, if so, who the caller was. I rebooted the machine and let it rip.
nt!NtShutdownSystem()
nt!KiSystemServiceCopyEnd()+0x13
ntdll!ZwShutdownSystem(void)+0xa
services!ScRevertToLastKnownGood()+0x1af
services!ScStartMarkedServices()+0x154
services!ScStartServiceAndDependencies()+0x43d
services!ScAutoStartServices()+0x225
services!SvcctrlMain()+0xa75
services!main()+0x31
services!__mainCRTStartup()+0x13d
kernel32!BaseThreadInitThunk()+0xd
ntdll!RtlUserThreadStart()+0x1d
Voila! Services.exe is shutting down the system. Probably some service is not starting, which is then somehow causing the machine to shut down. From the stack, I was able to figure out which service was not starting. Based on the service record, it was a third-party remote assistance service.
But how could the failure of a non-critical service to start cause the Service Control Manager to reboot the machine? And what is the frame about reverting to last known good (services!ScRevertToLastKnownGood()+0x1af) doing on the stack?
Looking at the service record, I found that the SCM returned error code 0x43c, which translates to ERROR_NOT_SAFEBOOT_SERVICE (This service cannot be started in Safe Mode). Also, the ErrorControl value for this service was set to 0x2, which means that if the service does not start successfully, the system reverts to the last known good configuration and reboots. However, if the system is already using last known good, it should just continue the boot process and log the error.
Error Control Level    Meaning
0x3 (Critical)         Fail the attempted system startup. If the startup is
                       not using the LastKnownGood control set, switch to
                       LastKnownGood. If the startup attempt is using
                       LastKnownGood, run a bug-check routine.
0x2 (Severe)           If the startup is not using the LastKnownGood control
                       set, switch to LastKnownGood. If the startup attempt
                       is using LastKnownGood, continue on in case of error.
0x1 (Normal)           If the driver fails to load or initialize, startup
                       should proceed, but display a warning.
0x0 (Ignore)           If the driver fails to load or initialize, startup
                       proceeds. No warning is displayed.
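The table above can be read as a small decision function. The following Python sketch is only an illustration of that logic, not the Service Control Manager's actual implementation. The names ErrorControl and startup_failure_action, and the returned strings, are made up for this example.

```python
from enum import IntEnum


class ErrorControl(IntEnum):
    """ErrorControl levels as listed in the table above."""
    IGNORE = 0x0
    NORMAL = 0x1
    SEVERE = 0x2
    CRITICAL = 0x3


def startup_failure_action(level, using_last_known_good):
    """Return what the table says happens when a service or driver fails to start.

    The returned strings are illustrative labels, not real SCM output.
    """
    level = ErrorControl(level)
    if level is ErrorControl.IGNORE:
        return "continue"
    if level is ErrorControl.NORMAL:
        return "continue with warning"
    # Severe and Critical both switch to LastKnownGood first.
    if not using_last_known_good:
        return "revert to LastKnownGood and reboot"
    # Already on LastKnownGood: Severe carries on, Critical bug checks.
    if level is ErrorControl.SEVERE:
        return "continue"
    return "bug check"


if __name__ == "__main__":
    # The case from this post: ErrorControl 0x2, not yet using LastKnownGood.
    print(startup_failure_action(0x2, using_last_known_good=False))
```

With ErrorControl set to 0x2 and a normal (non-LastKnownGood) boot, the function lands on the revert-and-reboot branch, which matches the ScRevertToLastKnownGood frame in the stack.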
Because the service could not start in Safe Mode and its ErrorControl value was set to 0x2, the machine would revert to the last known good configuration and silently reboot. I booted the machine normally and changed the ErrorControl value in the registry.
I also had to change the value in the other ControlSets, since they were identical to the current control set. This also explains why the machine kept rebooting every time: the value in the Last Known Good Configuration was set incorrectly as well.
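To check the value across control sets in one go, something like the following Python sketch could be used. It is a rough illustration under stated assumptions: the control set names (ControlSet001, ControlSet002) vary from machine to machine, the service name "ExampleRemoteAssist" is hypothetical, and read_error_control only works on Windows because it relies on the standard winreg module.

```python
def error_control_paths(service,
                        control_sets=("CurrentControlSet",
                                      "ControlSet001",
                                      "ControlSet002")):
    """Build the HKLM-relative key path of a service in each control set.

    The default control set names are an assumption; list the keys under
    HKLM\\SYSTEM to see which ones exist on a given machine.
    """
    return ["\\".join(("SYSTEM", cs, "Services", service)) for cs in control_sets]


def read_error_control(service):
    """Read the ErrorControl value from every control set (Windows only)."""
    import winreg  # imported here so the path helper stays portable

    results = {}
    for path in error_control_paths(service):
        try:
            with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, path) as key:
                value, _value_type = winreg.QueryValueEx(key, "ErrorControl")
                results[path] = value
        except FileNotFoundError:
            # Control set, service key, or value does not exist.
            results[path] = None
    return results


if __name__ == "__main__":
    # "ExampleRemoteAssist" is a placeholder service name.
    for key_path in error_control_paths("ExampleRemoteAssist"):
        print(key_path)
```

Any control set that still reports 0x2 or 0x3 for a service that cannot start in Safe Mode would be a candidate for the same silent reboot.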
I rebooted the machine and was able to boot into safe mode normally. Hence, the mystery of the silent reboots was solved.
Comments
- Anonymous
January 19, 2012
My Windows 2003 server keeps restarting and shows the error below in Event Viewer. Please help me. The process winlogon.exe has initiated the restart of computer IVRSERVER on behalf of user NT AUTHORITY\SYSTEM for following reason: No title for this reason could be found Reason Code: 0x80040001 Shutdown Type: restart
[Manish, as shown in this article you can set a breakpoint on NtShutdownSystem to get a better idea of why the system is being shut down.]