共用方式為


SQL Server Agent failing to start because "The EventLog service has not been started";

(This post has been extended with SQL Server Agent failing to start because "The EventLog service has not been started" (Second Part) )

One customer reported to us that their SQL Server 2000's SQL Server Agent services, recently started to fail when trying to come up running, automatically after a reboot of the operating system hosting the SQL.

When that happened, the following messages were dumped into SQL Server Agent's log (SQLAGENT.OUT):

2007-11-04 20:23:01 - ! [241] Startup error: Unable to initialize error reporting system (reason: The EventLog service has not been started)
2007-11-04 20:23:21 - ? [098] SQLServerAgent terminated (normally)

Customer claimed that even after creating a dependency on EventLog service for SQL Server Agent service, the error continued to show up after every machine reboot.

It happened to be a bug in SQL Server Agent's code, and here's a brief explanation of what it is:

As part of SQL Server Agent's startup tasks, it initializes a Logging component, which will report errors, warnings, or informational messages to agent's log file (SQLAGENT.OUT) and, if it's running on any non Win9x platforms, to the Event Log (by using the EventLog service Windows APIs). So, as part of that initialization process, it has to make sure that the EventLog service is started, because if it isn't, then the agent cannot function properly and so it exits with the error described above.

The function which checks whether the EventLog service is running or not, calls EnumServicesStatus twice with SERVICE_ACTIVE as the value for its dwServiceState parameter (so it will only enumerate services that are in the following states: SERVICE_START_PENDING, SERVICE_STOP_PENDING, SERVICE_RUNNING, SERVICE_CONTINUE_PENDING, SERVICE_PAUSE_PENDING, or SERVICE_PAUSED).

The first call to that API passes NULL to the lpServices parameter and zero for cbBufSize, that to only determine the size of the buffer required to hold the list of services in such states at the moment of the call. Later, it allocates as many bytes as the pcbBytesNeeded parameter reflected that were needed, and finally it calls EnumServicesStatus again. If at the time of the second call to EnumServicesStatus there are more services in any of the states mentioned earlier, than during the first call to EnumServicesStatus, then it returns FALSE and GetLastError returns ERROR_MORE_DATA.

The function implemented in SQL Server Agent to check whether EventLog is running or not, doesn't take into account such "exceptional" condition and simply returns a failure as if the EventLog service wouldn't be running.

A fix would be required that would rewrite the mentioned function, so that it retries the calls to EnumServicesStatus everytime the second call of the loop returns FALSE and GetLastError returns ERROR_MORE_DATA. But since that fix doesn't meet the bar at this point as to be addressed via a hotfix, I better give you a couple of workarounds you could implement in the event you encounter yourself face to face with this problem.

As it was already mentioned earlier, a simple dependency rule on the Service Control Manager doesn't suffice. So, your two options are:

  1. Use LoadOrder by Bryce Cogswell to set the order in which SCM must start your services, and set SQL Server Agent to be the last one in the list. With this method there's still a minimal chance that the issue can be seen.
  2. Use the following script (provided by my colleague Johann "Julien" Weskandt) so that SQL Server tries to start the SQL Agent service after a given delay.

use master
go
create proc spStartAgent as
-- Version 1.0 by JohannWe, EMEA GTSC, Microsoft customer support
set nocount on
declare @inst sysname, @cmd nvarchar(1000), @rc int, @msg nvarchar(1000)
set @msg=null
set @inst=convert(sysname, serverproperty('InstanceName'))
if @inst is null
set @cmd=N'net start SQLServerAgent'
else
set @cmd=N'net start SQLAgent$' + ltrim(rtrim(@inst))
create table #res (id tinyint not null identity, msg nvarchar(1000))
insert #res(msg) exec @rc=master.dbo.xp_cmdshell @cmd
if @rc=0
return 0
else
if @rc=2
begin
select @msg=msg from #res where id=3 and msg like '%NET HELPMSG 2182%'
if @msg is not null -- we get error 2182, if SQLAgent was already started
return 0
else
begin
raiserror ('An error was raised while starting SQLAgent. See System Event log for details!' , 19, @rc) with log
return 1
end
end
go
create proc spStartAgentDelay as
declare @t char(8), @rc int
set @t='00:01:00' -- Edit to change waiting time!
waitfor delay @t
exec @rc=spStartAgent
go
-- Register as startup proc:
declare @rc int
exec @rc=sp_procoption 'spStartAgentDelay', 'startup', 'on'
select ReturnCode=@rc
select IsStartup=objectproperty (object_id('spStartAgentDelay'), 'ExecIsStartup')
go

Stay tuned.

Comments

  • Anonymous
    December 16, 2007
    I have found that SQL Server needs Read permissions on both the 'Event Log' and the 'Remote Procedure Call' services.   In a default Windows 2003 install, SQL Server should have the required permissions to these services.  However, in an environment where all service permissions are locked down by GPO, then SQL will not have access to these services, and BOL does not say that access is required.  If you control service access via GPO, then the relevant SQL service accounts must be given Read access to these services. The impact of not having Read access to Event Log is that SQL Agent will not start. The impact of not having Read access to the RPC service is that no connection can be made to MSDTC.

  • Anonymous
    December 17, 2007
    Hi Ed, Thank you very much for your feedback. I've filed a bug so that the guys in product team review your feedback and update the documentation accordingly. Regarding the Event Log service, without further testing, I would expect it to be sufficient with "Query Service Status" granted. Thanks, Nacho

  • Anonymous
    October 15, 2008
    可能是sqlagent服务在eventlog服务启动之前造成的,在网上找了很久,看到了一个解决方法,可以试一下。 转载地址:http://blogs.msdn.com/ialonso/archiv...

  • Anonymous
    October 23, 2008
    SQL Server Agent が起動に失敗する : 「サービス EventLog は開始されていません」

  • Anonymous
    December 12, 2008
    Thank you very much for this.  Any chance that you could convince the powers-that-be at Microsoft to release a Knowledge Base article about this?

  • Anonymous
    February 13, 2013
    I've implemented the suggested script and am still monitoring for failures. I'm getting a message in my Agent Event Log 1 minute after SQL Server startup (after reboot) as follows: Message [204] Alert 'Demo: Sev. 19 Errors' has occurred This error also shows up in the Windows Application Event Log. Thoughts? Thank you!

  • Anonymous
    February 14, 2013
    Hi Daniel, That message is showing because your SQL Server Agent ErrorLoggingLevel is set so that it reports informational messages, and you have one Alert defined which is triggering. Based on the name of the alert "Demo: Sev. 19 Errors" I'd bet it is a SQL Server event alert, which will trigger whenever it identifies an error whose severity is 19. If you want to avoid that message, either set your Agent's logging level to not include informational messages, or disable/delete that alert if it isn't important to you. Thanks, Nacho

  • Anonymous
    January 08, 2014
    哇,有说中文的,有说英文的,有说日文的。