排查 Operations Manager 中的 SQL Server Service Broker 问题
Microsoft System Center Operations Manager 依赖于 SQL Server Service Broker 来实现所有任务操作。 如果禁用 SQL Server Service Broker,则所有任务操作都会受到影响。 生成的行为可能因启动的任务而异。 例如,可能会遇到以下问题:
- 尽管后台的任务已完成,但发现向导似乎正在无休止地运行。
- 即使后台的任务已完成,重置监视器的运行状况也从未完成。
本文提供了 SQL Server Service Broker 问题的常见故障排除步骤。
注意
本文中的 SQL 查询使用操作数据库的默认名称 OperationsManager
。 如果使用其他数据库名称,请替换为 OperationsManager
操作数据库的名称。
检查 SQL Server Service Broker 是否已启用
运行以下 SQL 查询:
SELECT is_broker_enabled FROM sys.databases WHERE name = 'OperationsManager'
如果字段的返回值为
is_broker_enabled
1(一),则启用 SQL Server Service Broker。 否则,请运行以下 SQL 查询以启用它:ALTER DATABASE OperationsManager SET SINGLE_USER WITH ROLLBACK IMMEDIATE ALTER DATABASE OperationsManager SET ENABLE_BROKER ALTER DATABASE OperationsManager SET MULTI_USER
重启 System Center Data Access Service
启用 SQL Server Service Broker 后,重启 System Center Data Access Service (OMSDK)。
在 SQL Server Management Studio 中,转到 Databases>OperationsManager>Service Broker。
展开队列和服务。
验证是否有一个队列和服务的名称包含以下值:
- 创建队列和服务的管理服务器的 IP 地址。
- 在该管理服务器上运行的 OMSDK 服务(Microsoft.Mom.Sdk.ServiceHost.exe)的进程 ID。
在此示例中,管理服务器的 IP 地址为 192.168.10.10。 OMSDK 服务的 PID 为 3092。
如果有多个管理服务器,则每个管理服务器都将具有单独的 Service Broker 队列和服务。
如果找不到相应的队列和服务,请再次重启 OMSDK 服务。
如果仍然找不到队列和服务,则当前 Service Broker 可能已损坏。 转到下一步以重新创建 SQL Server Service Broker。
重新创建 SQL Server Service Broker
按顺序运行以下 SQL 查询:
declare @i int set @i=0 DECLARE @handle UNIQUEIDENTIFIER declare @service_id int declare @service_name nvarchar (100) declare @far_service nvarchar (70) DECLARE conv_cur CURSOR FAST_FORWARD FOR SELECT CONVERSATION_HANDLE, service_id, far_service FROM SYS.CONVERSATION_ENDPOINTS OPEN conv_cur; FETCH NEXT FROM conv_cur INTO @handle, @service_id, @far_service; while (@@FETCH_STATUS = 0 and (@i<500000)) BEGIN select top 1 @service_name=name from sys.services where service_id=@service_id begin END CONVERSATION @handle WITH CLEANUP end FETCH NEXT FROM conv_cur INTO @handle, @service_id, @far_service; set @i=@i+1 END CLOSE conv_cur DEALLOCATE conv_cur go declare @servicename sysname declare @queuename sysname declare @cmd nvarchar(200) declare @serverqid nvarchar(25) set @serverqid='%Queue_mid%' while ((select count(*) from sys.service_queues WHERE name like @serverqid )>1) begin set @servicename= (select top 1 s.name from sys.service_queues as q join sys.services as s on q.object_id=s.service_queue_id WHERE q.name like @serverqid order by q.create_date asc) set @cmd= 'DROP SERVICE '+@servicename exec sp_sqlexec @cmd set @queuename= (select top 1 Object_name(object_id) from sys.service_queues WHERE name like @serverqid order by create_date asc) set @cmd= 'DROP QUEUE '+@queuename exec sp_sqlexec @cmd end go ALTER DATABASE OperationsManager SET SINGLE_USER WITH ROLLBACK IMMEDIATE ALTER DATABASE OperationsManager SET NEW_BROKER WITH ROLLBACK IMMEDIATE ALTER DATABASE OperationsManager SET MULTI_USER go ALTER DATABASE OperationsManager SET SINGLE_USER WITH ROLLBACK IMMEDIATE ALTER DATABASE OperationsManager SET ENABLE_BROKER ALTER DATABASE OperationsManager SET MULTI_USER go
在管理服务器上重启 OMSDK 服务以重新创建 Service Broker 以及队列和服务。 此过程可能需要两次重启:
- 第一次重启会重新创建 Service Broker
- 第二次重启会重新创建服务队列
验证 SQL Server Service Broker 是否仍在启用。 如果已禁用,请启用它。
验证是否已生成 Service Broker 队列和服务,如重启 System Center Data Access Service 中的步骤 4 中所述。
高级故障排除
如果前面的步骤无法解决问题,请收集包含 Service Broker 事件的 SQL Server Profiler 跟踪。
OMSDK 服务在重新启动时创建服务和队列时的示例跟踪:
重置监视器运行状况时的示例跟踪:
禁用 Service Broker 时的示例跟踪:
此外,针对操作数据库运行以下 SQL 脚本以收集诊断日志。
USE master
go
SET NOCOUNT ON
SET QUOTED_IDENTIFIER ON;
DECLARE @StartTime datetime
select @@version as 'Version'
select GETDATE() as 'RunDateTime', GETUTCDATE() as 'RunUTCDateTime', SYSDATETIMEOFFSET() as 'SysDateTimeOffset'
select @@servername as 'ServerName'
PRINT '-- sys.databases --'
select * from master.sys.databases where is_broker_enabled = 1 and name not in('tempdb', 'model', 'AdventureWorks', 'AdventureWorksDW')
PRINT ''
PRINT '-- sys.dm_broker_activated_tasks --'
select * from sys.dm_broker_activated_tasks
PRINT ''
PRINT '-- sys.dm_broker_connections --'
select * from sys.dm_broker_connections
PRINT ''
PRINT '-- COUNT Broker Connections --'
SELECT count(*) as Cnt, state_desc, login_state_desc from sys.dm_broker_connections GROUP BY state_desc, login_state_desc ORDER BY state_desc
PRINT ''
PRINT '-- sys.dm_broker_forwarded_messages --'
select * from sys.dm_broker_forwarded_messages
PRINT ''
PRINT '-- sys.service_broker_endpoints --'
select * from sys.service_broker_endpoints
PRINT ''
PRINT '-- sys.tcp_endpoints --'
select * from sys.tcp_endpoints
PRINT ''
PRINT '-- sys.certificates --'
select * from sys.certificates
PRINT ''
PRINT '-- sys.database_mirroring --'
select * from sys.database_mirroring where mirroring_guid is not null
PRINT ''
PRINT '-- sys.dm_db_mirroring_connections --'
select * from sys.dm_db_mirroring_connections
PRINT ''
PRINT '-- sys.dm_os_memory_clerks (broker) --'
select * from sys.dm_os_memory_clerks where type like '%BROKER%' order by type desc
-- Loop Through DBs and Gather SSB information specific to each DB
DECLARE tnames_cursor CURSOR
FOR SELECT name
FROM master.sys.databases
WHERE is_broker_enabled = 1
and state = 0
and name not in('tempdb', 'model', 'AdventureWorks', 'AdventureWorksDW')
ORDER BY [name]
OPEN tnames_cursor;
DECLARE @dbname sysname;
DECLARE @SCI int; -- Checking for Broker activity
DECLARE @cmd3 nvarchar(1024); -- New Command
FETCH NEXT FROM tnames_cursor INTO @dbname;
WHILE (@@FETCH_STATUS = 0)
BEGIN
SELECT @SCI = 0; -- service_contract_id
select @dbname = RTRIM(@dbname);
EXEC ('USE [' + @dbname + ']');
SELECT @cmd3 = N'SELECT @SCI_OUT = MAX(service_contract_id) FROM ' + @dbname + '.sys.service_contracts';
EXEC sp_executesql @cmd3, N'@SCI_OUT INT OUTPUT', @SCI_OUT = @SCI OUTPUT;
IF @SCI > 7
BEGIN
PRINT ''
PRINT '====================================================================================='
PRINT 'Begin Database: ' + @dbname
SELECT @StartTime = GETDATE()
PRINT 'Start Time : ' + CONVERT(Varchar(50), @StartTime)
PRINT ''
PRINT '-- sys.service_message_types --'
EXEC ('SELECT * FROM ' + @dbname + '.sys.service_message_types');
-- PRINT ''
-- PRINT '-- sys.service_contract_message_usages --'
-- EXEC ('SELECT * FROM ' + @dbname + '.sys.service_contract_message_usages');
PRINT ''
PRINT '-- sys.service_contracts --'
EXEC ('SELECT * FROM ' + @dbname + '.sys.service_contracts');
-- PRINT ''
-- print '-- sys.service_contract_usages --'
-- EXEC ('SELECT * FROM ' + @dbname + '.sys.service_contract_usages');
PRINT ''
PRINT '-- sys.service_queues --'
EXEC ('SELECT * FROM ' + @dbname + '.sys.service_queues');
-- PRINT ''
-- PRINT '-- sys.service_queue_usages --'
-- EXEC ('SELECT * FROM ' + @dbname + '.sys.service_queue_usages');
PRINT ''
PRINT '-- sys.services --'
EXEC ('SELECT * FROM ' + @dbname + '.sys.services');
PRINT ''
PRINT '-- sys.routes --'
EXEC ('SELECT * FROM ' + @dbname + '.sys.routes');
PRINT ''
PRINT '-- sys.remote_service_bindings --'
EXEC ('SELECT * FROM ' + @dbname + '.sys.remote_service_bindings');
PRINT ''
PRINT '-- sys.certificates --'
EXEC ('SELECT * FROM ' + @dbname + '.sys.certificates');
PRINT ''
PRINT '-- sys.dm_qn_subscriptions --'
EXEC ('SELECT * FROM ' + @dbname + '.sys.dm_qn_subscriptions');
PRINT '-- sys.dm_broker_queue_monitors, current state, last activation, current backlog in transmission queue --'
EXEC ('USE ' + @dbname + ';SELECT t1.name AS [Service_Name], t3.name AS [Schema_Name], t2.name AS [Queue_Name],
CASE WHEN t4.state IS NULL THEN ''Not available''
ELSE t4.state
END AS [Queue_State],
CASE WHEN t4.tasks_waiting IS NULL THEN ''--''
ELSE CONVERT(VARCHAR, t4.tasks_waiting)
END AS tasks_waiting,
CASE WHEN t4.last_activated_time IS NULL THEN ''--''
ELSE CONVERT(varchar, t4.last_activated_time)
END AS last_activated_time ,
CASE WHEN t4.last_empty_rowset_time IS NULL THEN ''--''
ELSE CONVERT(varchar,t4.last_empty_rowset_time)
END AS last_empty_rowset_time,
(
SELECT COUNT(*)
FROM sys.transmission_queue t6 WITH (NOLOCK)
WHERE (t6.from_service_name = t1.name)
) AS [Tran_Message_Count],
DB_NAME() AS DB_NAME
FROM sys.services t1 WITH (NOLOCK) INNER JOIN sys.service_queues t2 WITH (NOLOCK)
ON ( t1.service_queue_id = t2.object_id )
INNER JOIN sys.schemas t3 WITH (NOLOCK) ON ( t2.schema_id = t3.schema_id )
LEFT OUTER JOIN sys.dm_broker_queue_monitors t4 WITH (NOLOCK)
ON ( t2.object_id = t4.queue_id AND t4.database_id = DB_ID() )
INNER JOIN sys.databases t5 WITH (NOLOCK) ON ( t5.database_id = DB_ID() );')
PRINT ''
PRINT ''
PRINT 'sys.transmission_queue (toal count, group count, and top 500)'
-- Using count against MetaData columns rather than COUNT(*) because it is faster, and we don't need exact counts
PRINT '-- TOTAL COUNT sys.transmission_queue --'
EXEC ('SELECT p.rows as TQ_Count FROM ' + @dbname + '.sys.objects as o join ' + @dbname + '.sys.partitions as p on p.object_id = o.object_id where o.name = ''sysxmitqueue''')
-- EXEC ('SELECT count(*) as TQ_Count FROM ' + @dbname + '.sys.transmission_queue with (nolock)'); -- more accurate count
PRINT ''
PRINT '-- GROUP COUNT sys.transmission_queue --'
SELECT COUNT(*) as TQ_GroupCnt, transmission_status FROM sys.transmission_queue GROUP BY transmission_status
PRINT ''
PRINT 'TOP 500'
print '-- sys.transmission_queue --'
EXEC ('USE ' + @dbname + ';SELECT top 500 conversation_handle, to_service_name, to_broker_instance, from_service_name,
service_contract_name, enqueue_time, message_sequence_number, message_type_name, is_conversation_error,
is_end_of_dialog, priority, transmission_status, DB_NAME() as DB_Name FROM ' + @dbname + '.sys.transmission_queue with (nolock) order by enqueue_time, message_sequence_number');
PRINT ''
print 'sys.conversation_endpoints (total count, group count, and top 500)'
-- Using count against MetaData columns rather than COUNT(*) becuase it is faster, and we don't need exact counts
PRINT '-- TOTAL COUNT sys.conversation_endpoints --'
EXEC ('SELECT p.rows as CE_Count FROM ' + @dbname + '.sys.objects as o join ' + @dbname + '.sys.partitions as p on p.object_id = o.object_id where o.name = ''sysdesend''')
-- EXEC ('SELECT count(*) as count FROM ' + @dbname + '.sys.conversation_endpoints with (nolock)');
PRINT ''
PRINT '-- GROUP COUNT sys.conversation_endpoints --'
EXEC ('SELECT COUNT(*) as CE_GroupCnt, state_desc FROM ' + @dbname + '. sys.conversation_endpoints GROUP BY state_desc')
PRINT ''
PRINT 'TOP 500'
PRINT '-- sys.conversation_endpoints --'
EXEC ('USE ' + @dbname + ';SELECT top 500 *, DB_NAME() as DB_Name FROM ' + @dbname + '.sys.conversation_endpoints with (nolock)');
-- Gather Activation Proc Code
/*
SET QUOTED_IDENTIFIER OFF;
DECLARE @cmd nvarchar(1024)
DECLARE @cmd2 nvarchar(1024)
select @cmd = 'DECLARE tproc_cursor CURSOR FOR select activation_procedure from ' + @dbname + '.sys.service_queues where activation_procedure is not null'
EXEC (@cmd)
OPEN tproc_cursor;
DECLARE @proc sysname;
DECLARE @len int
FETCH NEXT FROM tproc_cursor INTO @proc;
WHILE (@@FETCH_STATUS = 0)
BEGIN
select @proc = rtrim(@proc)
select @len = len(@proc) - 8;
select @proc = substring(@proc, 8, @len)
select @proc
EXEC ("select definition from " + @dbname + ".sys.sql_modules where definition like '%" + @proc + "%'")
FETCH NEXT FROM tproc_cursor INTO @proc;
END;
CLOSE tproc_cursor;
DEALLOCATE tproc_cursor;
SET QUOTED_IDENTIFIER ON;
*/
PRINT ''
PRINT 'End of Database: ' + @dbname
PRINT 'END Time : ' + CONVERT(Varchar(50), GetDate())
PRINT 'Data Collection Duration in milliseconds for ' + @dbname
PRINT ''
SELECT DATEDIFF(millisecond, @StartTime, GETDATE()) as Duration_ms
PRINT '====================================================================================='
PRINT '====================================================================================='
PRINT ''
END;
FETCH NEXT FROM tnames_cursor INTO @dbname;
END;
CLOSE tnames_cursor;
DEALLOCATE tnames_cursor;