ACS Event Retention Mechanism
I get a lot of questions about how ACS event retention works. So here you go, I'm blogging it so I can just answer with a link :-)
There are two DWORD registry values which affect backlog transmission. Both are on the collector machine under HKLM\System\CurrentControlSet\Services\AdtServer\Parameters.
EventRetentionPeriod, if present, is expressed in hours (I forget the default). It takes precedence over MaximumEventAge, which is expressed in days (default = 1). Both values control the backlog of events that agents send to the collector when they connect, but as mentioned, EventRetentionPeriod wins any conflict. MaximumEventAge used to control database retention in early beta builds, but it no longer does now that the database uses a partitioning mechanism. You might encounter MaximumEventAge if you are migrating from ACS beta to Operations Manager 2007 ACS.
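To make the precedence rule concrete, here is a minimal Python sketch of how the effective backlog window could be worked out from the two values. The function name and the use of None for "value not present" are my own illustration, not anything ACS exposes. Since I don't have the EventRetentionPeriod default handy, the sketch assumes that when it is absent, MaximumEventAge (or its default of 1 day) applies.

```python
from typing import Optional

HOURS_PER_DAY = 24
DEFAULT_MAXIMUM_EVENT_AGE_DAYS = 1  # default stated for MaximumEventAge


def backlog_window_hours(event_retention_period: Optional[int] = None,
                         maximum_event_age: Optional[int] = None) -> int:
    """Return the backlog window in hours (hypothetical helper).

    event_retention_period: EventRetentionPeriod in hours, or None if absent.
    maximum_event_age: MaximumEventAge in days, or None if absent.
    """
    # EventRetentionPeriod wins any conflict whenever it is present.
    if event_retention_period is not None:
        return event_retention_period
    # Assumption: otherwise fall back to MaximumEventAge (default 1 day).
    days = (DEFAULT_MAXIMUM_EVENT_AGE_DAYS
            if maximum_event_age is None else maximum_event_age)
    return days * HOURS_PER_DAY
```

For example, backlog_window_hours(72, 1) gives 72 because EventRetentionPeriod is present, while backlog_window_hours(None, 3) also gives 72, this time from three days of MaximumEventAge.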
Database grooming is now governed entirely by the partition grooming algorithm, which is simple: the next grooming job deletes every partition that is eligible for deletion.
Eligible for deletion means:
- dtPartition.Status == 2 AND
- dtPartition.LastCreationTime < (now() - (partitionDuration * numPartitions))
Think of (partitionDuration * numPartitions) as the retention period before data is groomed from the database.
- partitionDuration = dtConfig[5]
- numPartitions = dtConfig[6]
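The eligibility rule above can be sketched in Python as follows. This is only an illustration of the two conditions, not the actual grooming script. The function and parameter names are made up, and partition_duration is passed as a timedelta so the sketch makes no claim about the unit stored in dtConfig.

```python
from datetime import datetime, timedelta

GROOMABLE_STATUS = 2  # dtPartition.Status value required for deletion


def is_eligible_for_deletion(status: int,
                             last_creation_time: datetime,
                             partition_duration: timedelta,
                             num_partitions: int,
                             now: datetime) -> bool:
    """Apply both eligibility conditions to one partition (hypothetical helper)."""
    # Retention period before data is groomed from the database.
    retention = partition_duration * num_partitions
    cutoff = now - retention
    return status == GROOMABLE_STATUS and last_creation_time < cutoff


# Example values are assumptions chosen for illustration only.
now = datetime(2008, 1, 31)
one_day = timedelta(days=1)
old_partition = is_eligible_for_deletion(2, datetime(2008, 1, 10), one_day, 14, now)
recent_partition = is_eligible_for_deletion(2, datetime(2008, 1, 20), one_day, 14, now)
```

With a 14-day retention window the cutoff is January 17, so old_partition is True and recent_partition is False. A partition whose status is not 2 is never eligible, however old it is.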
Note that dtPartition[<partitionId>].LastCreationTime defaults to 12:00am 1/1/2000 (collector local time). After successful execution of the close partition script, this field’s value is set to max(dtEvent_<partitionId>.CreationTime) for the partition in question. The implication is that if you set Status to 2 without also updating LastCreationTime, the partition is immediately eligible for grooming (assuming your clock is accurate), because the default date is far older than any reasonable retention cutoff.
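Here is a small Python sketch of why the default value matters. It compares a partition that was closed properly with one whose LastCreationTime was left at the default. The helper names, the sample dates, and the 14-day retention are assumptions for illustration. The source does not say what happens when a partition has no events, so the sketch simply requires at least one.

```python
from datetime import datetime, timedelta

# Default LastCreationTime: 12:00am 1/1/2000 (collector local time).
DEFAULT_LAST_CREATION_TIME = datetime(2000, 1, 1, 0, 0)


def last_creation_time_after_close(event_creation_times):
    """Value the field takes after a successful close: the newest CreationTime.

    Requires at least one event; max() raises ValueError on an empty list.
    """
    return max(event_creation_times)


def is_past_cutoff(last_creation_time, now, retention):
    """True when LastCreationTime is older than now minus the retention period."""
    return last_creation_time < now - retention


now = datetime(2008, 6, 15, 12, 0)
retention = timedelta(days=14)  # stands in for partitionDuration * numPartitions
events = [datetime(2008, 6, 14, 8, 0), datetime(2008, 6, 14, 23, 59)]

closed_properly = is_past_cutoff(last_creation_time_after_close(events), now, retention)
left_at_default = is_past_cutoff(DEFAULT_LAST_CREATION_TIME, now, retention)
```

Here closed_properly is False because the newest event is only a day old, while left_at_default is True. Once Status is 2, that second partition would be picked up by the next grooming job.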
The partition switch offset value in dtConfig (the time of day at which partitions are switched) has no effect on grooming, except that grooming will not occur during a partition switch.
Grooming runs at startup and immediately after each checkpoint. The default checkpoint interval is 198 seconds, but it can be changed with the DWORD registry value CheckPointInterval on the collector, in the same location as the other registry values. A successful checkpoint logs an event in the database: event ID 0 with a source of “_acs” (you might have seen these on an “idle” ACS and wondered how they got there…).