Event ID 2115 A Bind Data Source in Management Group

Comments

  • Anonymous
    January 01, 2003
    That is used by the warehouse for all maintenance jobs, including aggregation. To reduce the impact of that job, you can focus on:
  1.  Increasing the disk I/O and server resources for the data warehouse database.
  2.  Reducing the amount of data going into the warehouse, and reducing the retention.

    You really need to find out exactly what is causing the blocking when this runs, to determine the best course of action. A SQL DBA with SQL Profiler in hand should be able to identify the major causes. How big is your warehouse? Agent count?
  • Anonymous
    January 01, 2003
    Kevin - I am experiencing the problems you describe for the CollectEventData workflows. If I follow your workaround, will I be preventing specific types of event collection? Can you provide more detail on what is causing this? Thanks! Megan

  • Anonymous
    January 01, 2003
    The IIS MP has some frequent and noisy discoveries.  Many run every hour.  I like to modify the frequency of those to once per day.

  • Anonymous
    January 01, 2003
    That is from the Exchange 2007 conversion MP - unfortunately - the event is created from script - if you turn off that workflow - you will also turn off the script.  :-( This is not a problem in the new native MP coming out with R2.  That event will be a top consumer - but should not flood the database - it just will be at the top of the list.

  • Anonymous
    January 01, 2003
    I had the same symptoms (and pretty much all of them) as discussed above.  My problem seemed to be that I had inserted an account into the 'Data Warehouse SQL Server Authentication Account' 'Run As Account' where I should have had a space.... as noted here: http://www.eggheadcafe.com/conversation.aspx?messageid=30315729&threadid=30282292 This stopped the 2115 errors immediately.

  • Anonymous
    January 01, 2003
    No - we will simply drop batches of events that get stuck and hold up the queue.

  • Anonymous
    January 01, 2003
    So - that hotfix - 969130 - simply allows dropping of old event tables.  Their existence will not really impact event insertion into the DW - so that is why that didn't work.  Also - that could only possibly affect the DW.CollectEventData 2115, and no others. The MEP table query dealt with discovery data.  This can be a problem when management packs run discoveries that constantly update discovery data with properties that change frequently.  If your only 2115 is from DW.CollectDiscoveryData - then a deeper analysis of discovered properties is in order. The best queries I have seen for that are here: http://nocentdocent.wordpress.com/2009/05/23/how-to-get-noisy-discovery-rules/

  • Anonymous
    January 01, 2003
    You should not make any overrides for this. The overrides in this article handled a very specific issue with events, and it is NOT applicable to perf collection. If you continuously have issues with perf insertion into the data warehouse - your warehouse is likely not performing well.  Look for blocking, and for avg disk sec/write values.

  • Anonymous
    January 01, 2003
    @Seth - I have heard about this... scalability issues with Unix monitoring.  Some good rules of thumb:

  1.  Dedicate the management server for cross platform monitoring and do not assign Windows agents to it.
  2.  Place the Management server health service queue on very fast disk with large IOPS capability and very low latency (RAID10 with 4 spindles, 15k drives, Tier 1 SAN, etc..)
  3.  Use physical hardware for the MS when scaling at maximum, with more than the minimum hardware requirements available for memory, CPU, etc...
  4.  Be very careful with what you write as far as custom workflows against the cross platform systems, as these can add additional load and will affect scale.

    My understanding is that we can scale up to 500 cross platform agents per MS when the conditions above are met, using the built-in MPs for base-level Xplat monitoring.
  • Anonymous
    January 01, 2003
    That override was not a fix to address all 2115's.  It was only to address a specific situation with 2115 events of a Workflow Id : Microsoft.SystemCenter.DataWarehouse.CollectEventData.
  1.  Are ALL (at least 99%) your 2115 events coming from the above workflow?  If they are - then apply this override - and bounce the healthservice on your affected management server (in a cluster - take offline and then back online)  and you might consider clearing out the old healthservice cache.
  2.  If they are NOT all from DataWarehouse.CollectEventData, and are from random sources.... the next step is to see if they are all from a DataWarehouse workflow ID, or some are, some not.  In either case, this is typically SQL database performance related.  Bounce the healthservice and see if these come back immediately, or if they take some time before you see them.
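
The 99% check in step 1 can be done by tallying the workflow IDs of the 2115 events. Below is a minimal Python sketch, assuming the event descriptions have already been exported as plain strings that contain the "Workflow Id : ..." line quoted elsewhere in this thread. The function name `workflow_share` and the sample data are hypothetical and only illustrate the idea.

```python
import re
from collections import Counter

# Matches the "Workflow Id : <name>" part of a 2115 event description.
WORKFLOW_RE = re.compile(r"Workflow Id\s*:\s*([\w.]+)")


def workflow_share(descriptions):
    """Return {workflow_id: fraction of events} for the given 2115 descriptions."""
    counts = Counter()
    for text in descriptions:
        match = WORKFLOW_RE.search(text)
        if match:
            counts[match.group(1)] += 1
    total = sum(counts.values())
    return {wf: n / total for wf, n in counts.items()} if total else {}


# Hypothetical sample: 99 events from the event-collection workflow, 1 from another.
EVENT_WF = "Microsoft.SystemCenter.DataWarehouse.CollectEventData"
sample = [f"Workflow Id : {EVENT_WF} Instance : MS1"] * 99
sample.append("Workflow Id : Microsoft.SystemCenter.CollectAlerts Instance : MS1")

shares = workflow_share(sample)
# The override is only worth considering when (nearly) all events share this workflow.
override_applies = shares.get(EVENT_WF, 0.0) >= 0.99
```

If `override_applies` comes out false, the events are from mixed sources and step 2 applies instead.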
  • Anonymous
    January 01, 2003
    So - here is what I look at with 2115's.
  1.  Look for a pattern - do the 2115's happen at a specific time or random?  If a pattern - look for other jobs that might be running at that time, like a backup - or SQL DBA maintenance plans.
  2.  Look at the 2115's... do they come from a single datasource/workflow... or multiple?  The workaround I posted only applies if they are ALL from the collectevent and data warehouse workflow.
  3.  Random 2115's with LOW times... (under 300 seconds) are normal... as long as we recover.  If they have longer times associated with them... that is indicative of a SQL perf issue, or blocking on the DB.  Poor SQL perf is caused by keeping too much data in the DB, bad disk/memory I/O, not enough spindles for the DB, DB and logs not being on distinct volumes/spindle sets, poor SAN performance, too many agents in the management group, other jobs stepping on the DB, too many consoles open, etc....
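
Steps 2 and 3 above can be sketched as a small script that groups 2115 events by workflow and flags long waits. This is only an illustration, assuming the event descriptions are available as plain strings in the format quoted in this thread. The 300-second limit comes from step 3; the names `triage` and `make_event` and the sample data are hypothetical.

```python
import re
from collections import defaultdict

# Pulls the wait time and the workflow id out of one 2115 event description.
EVENT_RE = re.compile(
    r"has not received a response in (\d+) seconds.*?Workflow Id\s*:\s*([\w.]+)",
    re.DOTALL,
)

NORMAL_LIMIT_SECONDS = 300  # waits under this are treated as normal (step 3)


def triage(descriptions):
    """Map workflow id -> (event count, longest wait in seconds, needs_attention)."""
    waits = defaultdict(list)
    for text in descriptions:
        match = EVENT_RE.search(text)
        if match:
            waits[match.group(2)].append(int(match.group(1)))
    return {
        wf: (len(secs), max(secs), max(secs) >= NORMAL_LIMIT_SECONDS)
        for wf, secs in waits.items()
    }


def make_event(seconds, workflow):
    """Build a sample description in the 2115 format (illustration only)."""
    return (
        "A Bind Data Source in Management Group ABC has posted items to the "
        f"workflow, but has not received a response in {seconds} seconds.  "
        "This indicates a performance or functional problem with the workflow. "
        f"Workflow Id : {workflow}"
    )


HEALTH = "Microsoft.SystemCenter.DataWarehouse.CollectEntityHealthStateChange"
ALERTS = "Microsoft.SystemCenter.CollectAlerts"
sample_events = [make_event(61, HEALTH), make_event(122, HEALTH), make_event(69360, ALERTS)]

report = triage(sample_events)
for workflow, (count, longest, flagged) in sorted(report.items()):
    print(f"{workflow}: {count} events, longest wait {longest}s, investigate={flagged}")
```

A single workflow in the report points at step 2; any entry flagged for investigation points at the SQL performance or blocking causes listed in step 3.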
  • Anonymous
    January 01, 2003
    The gateway approval tool failed with following error: “Unhandled Exception: System.IO.FileNotFoundException: Could not load file or assembly 'Microsoft.Mom.DataAccessLayer, Version=6.0.4900.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35' or one of its dependencies. The system cannot find the file specified. File name: 'Microsoft.Mom.DataAccessLayer, Version=6.0.4900.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35'   at GatewayInsertionTool.Program.Main(String[] args)” Any help will be appreciated. Ashutosh

  • Anonymous
    January 01, 2003
    One other question. Can this problem cause agents to receive the event "Alert generated by Send Queue % Used Threshold"?

  • Anonymous
    January 01, 2003
    http://nocentdocent.wordpress.com/2009/05/23/how-to-get-noisy-discovery-rules/

  • Anonymous
    January 01, 2003
    You are correct - I did.  This was a copy/paste from a newsgroup posting.

  • Anonymous
    January 01, 2003
    I believe so - if the management server queue is also blocked.  One customer I worked with had a lot of send queue % alerts.... and these cleared up when we implemented this change.

  • Anonymous
    January 01, 2003
    Thank you! It appears this fix has cleared up my issue as well including the Send Queue Alerts. Thanks!

  • Anonymous
    January 01, 2003
    When did you put in these overrides? If this just started - and the overrides have been in place for some time.... and SQL I/O performance is good... and this is ONLY coming from the warehouse collect event data source - then I would look for:

  1.  Blocking on the SQL server processes - check Activity monitor.... if performance counters look good on SQL - we can still have an insert problem if something is causing blocking.
  2.  Something is flooding events.  Run the most common event query from my SQL query blog - and see if you can determine the source.
  • Anonymous
    August 27, 2008
    I think you missed   Microsoft.SystemCenter.CollectPerformanceData from your workflows.

  • Anonymous
    February 13, 2009
    Thx Kevin for this great blog. I have been experiencing this error for a week now. After about 12 hours of working fine, the 2115 warnings start appearing. After some minutes, there are only 2115 errors in the log and the RMS turns gray. The workflow IDs include all possibilities, and they seem to appear in a cycle relative to each other. The workaround didn't fix it, unfortunately. I suspect a SQL performance problem. Before migrating the whole system to a new, better server, I just wanted to make sure that this might be the problem.

  • Anonymous
    May 14, 2009
    I just started getting these every 5 - 10 minutes. All coming from Microsoft.SystemCenter.DataWarehouse.CollectEventData. They come back right away after bouncing the service. No MPs have been added/deleted in about 3 months. The performance of the SQL server looks great. The db and logs are on different volumes. I've set the 3 overrides as described above. I've rebooted my SQL server and then the RMS. Any other ideas?

  • Anonymous
    May 18, 2009
    Turns out we DO have blocking. This is causing the issue, but I'm not sure what can be done about it: exec StandardDatasetMaintenance

  • Anonymous
    June 10, 2009
    I've got a case open with Microsoft on this. I can tell you that one of the things they had me do (which didn't work for me, but may work for you) is to install this hotfix: http://support.microsoft.com/kb/969130 They also had me run this query, which I think tells me where most of my DW writes are coming from?

        select top 20 met.ManagedEntityTypeSystemName, count(*)
        from ManagedEntityProperty mep
           join ManagedEntity me on (mep.ManagedEntityRowId = me.ManagedEntityRowId)
           join ManagedEntityType met on (me.ManagedEntityTypeRowId = met.ManagedEntityTypeRowId)
        Where mep.FromDateTime > '2009-01-01'
        group by met.ManagedEntityTypeSystemName
        having (count(*)) > 5
        order by 2 desc

  • Anonymous
    June 12, 2009
    Ok, that's good stuff. I ran the 'Discovered Objects in the last 4 hours' query and found that Microsoft.Windows.InternetInformationServices.2003.FTPSite has 36 changes. I can tell you for sure that we have not added any FTP sites in quite a while...

  • Anonymous
    July 07, 2009
    Great blog. Just wondering where I can find the 'Discovered Objects in the last 4 hours' query. Thanks.

  • Anonymous
    July 14, 2009
    A Bind Data Source in Management Group ABC has posted items to the workflow, but has not received a response in 122 seconds.  This indicates a performance or functional problem with the workflow.
    Workflow Id : Microsoft.SystemCenter.DataWarehouse.CollectEntityHealthStateChange
    Instance    : MS4.ABC.local
    Instance Id : {43BE45BE-573D-AD34-B4333-3673F673BE32}

    This comes 4 times within a couple of minutes (first at 61 seconds, then 122, 183 and 245), but also about 20 times in an hour. We have a clustered DB with 64 GB RAM, a clustered RMS with 16 GB, and 6 MS with 8 GB each. There are (for now) 32 agents connected to one MS, and even if I move agents to another MS, the events then appear on the "new" MS. It shouldn't be a performance issue. Any idea?

  • Anonymous
    April 29, 2010
    Worked like a charm for our OpsMgrR2 on VMWARE virtualized environment that is supporting 250 agents!  Thank you.

  • Anonymous
    August 09, 2011
    I just battled this issue on the phone with MSFT for hours... turns out our 181 UNIX servers reporting to a single SCOM MS were causing this.  The UNIX boxes have been reporting to this SCOM MS for 2-3 months so the MSFT Performance team is going to contact me tomorrow to run some tests.... apparently UNIX is a disk I/O hog.  1 UNIX server = 10 Windows servers on disk I/O.

  • Anonymous
    March 27, 2014
    Hi Kevin,

    I need your help with this issue. We are literally getting the alert "Management server reached the quota limit" and are also unable to discover any TFS build servers. Could you help me with this, or point me to a blog post if you have written one on it?

  • Anonymous
    August 13, 2015
    I am getting 2115 events only from Workflow Id : Microsoft.SystemCenter.DataWarehouse.CollectEntityHealthStateChange, and they are generated about every hour. Data warehouse SQL performance is also good. Please suggest how to troubleshoot these.

  • Anonymous
    August 13, 2015
    @Daya - if that is the only one you are receiving - that's odd, and usually there will be other events that help us understand what's wrong. I'd recommend opening a support case IF these are values that are incrementing. If the times stay low (less than 5 minutes) you might just be overloaded in the warehouse during aggregations, and you just need to do some tuning or get better disk I/O.

  • Anonymous
    December 16, 2016
    Hi Kevin,

    I'm seeing the following events on the MS every minute. My SCOM server is not healthy now. How to proceed on this?

    A Bind Data Source in Management Group EL-OPSMGR has posted items to the workflow, but has not received a response in 69360 seconds. This indicates a performance or functional problem with the workflow. Workflow Id : Microsoft.SystemCenter.CollectEventData

    A Bind Data Source in Management Group EL-OPSMGR has posted items to the workflow, but has not received a response in 69360 seconds. This indicates a performance or functional problem with the workflow. Workflow Id : Microsoft.SystemCenter.CollectPublishedEntityState

    A Bind Data Source in Management Group EL-OPSMGR has posted items to the workflow, but has not received a response in 69360 seconds. This indicates a performance or functional problem with the workflow. Workflow Id : Microsoft.SystemCenter.CollectAlerts

    A Bind Data Source in Management Group EL-OPSMGR has posted items to the workflow, but has not received a response in 69360 seconds. This indicates a performance or functional problem with the workflow. Workflow Id : Microsoft.SystemCenter.CollectPerformanceData

    A Bind Data Source in Management Group EL-OPSMGR has posted items to the workflow, but has not received a response in 69359 seconds. This indicates a performance or functional problem with the workflow. Workflow Id : Microsoft.SystemCenter.CollectSignatureData

    A Bind Data Source in Management Group EL-OPSMGR has posted items to the workflow, but has not received a response in 68879 seconds. This indicates a performance or functional problem with the workflow. Workflow Id : Microsoft.SystemCenter.CollectDiscoveryData

    Regards
    sEB

    • Anonymous
      December 16, 2016
      @ Sebastian - that isn't a typical bind - that means NOTHING is writing to your DB. You need to investigate other events - it is likely access denied due to an account failure.
  • Anonymous
    March 14, 2017
    Hi Kevin, great post. I have followed this and https://support.microsoft.com/en-us/help/2681388/how-to-troubleshoot-event-id-2115-related-performance-problems-in-operations-manager and still seem to be having issues. I wondered if you would mind taking a look at my post here and commenting? https://social.technet.microsoft.com/Forums/systemcenter/en-US/b3b760ca-f0e1-4b3e-bf70-d542044f0a69/event-id-2115?forum=operationsmanagergeneral#b3b760ca-f0e1-4b3e-bf70-d542044f0a69

  • Anonymous
    June 16, 2017
    This is a fantastic post! Very informative!

    Although I am still in a little bit of a problem myself with this one, even after reading your great post here. I'm wondering if maybe you could help me out.

    See, in my case I don't have the Microsoft.SystemCenter.DataWarehouse.CollectEventData issue that you have, so your fix doesn't work for me.

    So, I took your advice and I started analyzing the workflows, pending times and frequency of these events on all of my management servers.

    My top 5 offenders are the following, with their respective event 2115 counts.

    Count - WorkflowId
    5932 - Microsoft.SystemCenter.CollectPerformanceData
    5931 - Microsoft.SystemCenter.CollectDiscoveryData
    5930 - Microsoft.SystemCenter.CollectSignatureData
    3247 - Microsoft.SystemCenter.CollectPublishedEntityState
    2497 - Microsoft.SystemCenter.CollectAlerts

    My grand total of events is 26375 across all my management servers and only 351 of these events are at or under the 120-second mark. The rest of the events hike all the way up to 15,540~1,041,840 seconds.

    Any ideas on what I should do with my situation here?

    • Anonymous
      June 17, 2017
      Those all point to the OpsDB. Either something is terribly wrong with your OpsDB performance, or you have significant blocking, or you have a permission issue and your SDK account is unable to write to the DB.