eDiscovery In-Place Hold Processing – Behind the Scenes
Intro
I’m excited to announce that I recently transferred internally to a Senior Search Premier Field Engineer role at Microsoft. I’ll continue to post on general SharePoint topics, but expect to see some hardcore Search posts mixed in. As a heads up, my blog posts will now also appear in multiple places on MSDN. I’m writing about eDiscovery In-Place Hold processing in this post because Search is essentially the backbone of eDiscovery: without Search, eDiscovery wouldn’t happen. I started looking deeper into eDiscovery in order to answer the following question:
What is the magic that occurs behind the scenes during eDiscovery transactions like the following?
- Placing holds on content
- Validating a site is truly on hold
Exposing the behind-the-scenes activity for these two transactions will give the reader good insight into how eDiscovery leverages a Search Service Application. Before diving in, I’d like to briefly go over the basics. This post is geared toward SharePoint 2013 (SP2013) on-premises deployments.
eDiscovery Basics
The entire purpose of eDiscovery is to give a select group of people (the eDiscovery team) the power to discover and export content from various repositories; in our case, Exchange mailboxes and SharePoint site data. A good eDiscovery implementation masks these activities from end users. That is, regular users should be able to continue working on SharePoint data (documents, spreadsheets, etc.) without interruption and without being aware that their data is on hold and can be exported at any time behind the scenes.
The eDiscovery setup is fairly simple, and plenty of documentation exists on how to do it. I won’t cover every detail here; if you have unanswered setup questions, I recommend leveraging the documentation that is already available. In its simplest form, the setup looks like this:
1. An eDiscovery site collection is provisioned
2. Within the eDiscovery site, a Case is created (this is a subsite)
3. Within the Case (subsite), an eDiscovery Set is created
a. Sources are defined within the eDiscovery Set
b. In-Place Holds are enabled within the eDiscovery Set
4. Within the Case, Queries are run against content already placed on hold in order to export that data to the file system
Okay, now that you have the bare-bones basics, let’s take a deep dive into the following transactions:
- Placing holds on content
- Validating a site is truly on hold
Invoking In-Place Holds
Within an eDiscovery Case, you can create a Discovery Set, which is where you define the sources (SharePoint sites) you want to place on hold and enable In-Place Holds. It looks like:
The first step is adding a managed source. After clicking “Add & Manage Sources”, I’ll define the URL to my lunch site collection:
Clicking on the folder with the check box calls into the Search Service Application to ensure the site has been indexed. Assuming it has, the check comes back with a green check and the title of the site.
Clicking OK adds the source to the eDiscovery Set. This is the first clue that the Search Service Application is vital: if the site isn’t indexed and available to search queries, you can’t add it as a source and therefore can’t place holds on its content. The next step is defining a filter to place specific content on hold. I’m skipping that step because I want everything on hold. Finally, I’ll set In-Place Hold to “Enable In-Place Hold”.
Clicking Save returns me to the eDiscovery Case with my newly created Discovery Set.
Notice that the In-Place Hold Status is “Processing”. This is where the magic happens, and where I’ll dig a little deeper to understand how processing works and how the content moves from “Processing” to “On Hold” status. I’ll piece this together in two steps.
Step 1: Create Discovery Actions
When we enable In-Place Holds within an eDiscovery Set and click Save, we call into the Search Service Application and add discovery actions to the MSSDiscoveryAction table in the Search Admin database. So if the In-Place Hold Status is in the Processing state, a select query against this table will likely return rows:
select * from MSSDiscoveryAction with(NOLOCK)
Note: Modifying/Editing any SP Database is not supported and will put your Farm in an unsupported state
Question: How can I validate this is happening besides checking the SQL database?
Answer: ULS Logs!
02/20/2015 11:04:02.61 w3wp.exe (0x0F38) 0x12F8 Document Management Server Discovery aaykw Verbose Beginning update of source group f0a1eb9c-dc8a-a041-f082-9b075765c29d
02/20/2015 11:04:02.61 w3wp.exe (0x0F38) 0x12F8 Document Management Server Discovery aaykx Verbose Preservation changed f0a1eb9c-dc8a-a041-f082-9b075765c29d
02/20/2015 11:04:02.88 w3wp.exe (0x0F38) 0x12F8 Document Management Server Discovery ab2zt Verbose Sending 1 actions to the discovery web service f0a1eb9c-dc8a-a041-f082-9b075765c29d
02/20/2015 11:04:02.88 w3wp.exe (0x0F38) 0x12F8 SharePoint Foundation Topology e5mc Medium WcfSendRequest: RemoteAddress: 'https://app1:32843/00abdc8662a148f281b00e0d072af186/SearchService.svc' Channel: 'Microsoft.Office.Server.Search.Administration.ISearchServiceApplication' Action: 'https://tempuri.org/IDiscoveryServiceApplication/InvokeDiscoveryActions' MessageId: 'urn:uuid:74a01428-a7ea-4fb8-8d6b-d5a5a62bb310' f0a1eb9c-dc8a-a041-f082-9b075765c29d
02/20/2015 11:04:02.91 w3wp.exe (0x1D40) 0x1D74 SharePoint Server Database tzkv Verbose SqlCommand: 'proc_MSS_AddDiscoveryActions' CommandType: StoredProcedure CommandTimeout: 0 Parameter: '@DiscoveryActions' Type: Structured Size: 0 Direction: Input Parameter: '@TimeNow' Type: DateTime Size: 0 Direction: Input Value: '02/20/2015 17:04:02' Parameter: '@RetVal' Type: Int Size: 0 Direction: ReturnValue Value: '' f0a1eb9c-dc8a-a041-f082-9b075765c29d
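When the ULS log is large, the easiest way to follow one request is to filter on the correlation ID at the end of each entry (f0a1eb9c-dc8a-a041-f082-9b075765c29d in the entries above). Here is a minimal Python sketch of that idea. It assumes the entries have been flattened to single space-separated lines, as they appear in this post; the regular expression and the names parse_uls_line and filter_by_correlation are my own illustration and not part of any SharePoint tooling. Raw ULS files are tab-delimited, so a parser for those would split on tabs instead.

```python
import re

# Trace levels used to find the boundary between the event ID and the message.
LEVELS = ("VerboseEx|Verbose|Medium|High|Monitorable|Unexpected|"
          "Critical|Information|Warning|Error")

# Pattern for one flattened ULS entry (an assumption based on the lines in this post).
# Area and category are captured together because both can contain spaces.
ULS_LINE = re.compile(
    r"^(?P<timestamp>\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2}\.\d{2}) "
    r"(?P<process>\S+) \((?P<pid>0x[0-9A-Fa-f]+)\) "
    r"(?P<tid>0x[0-9A-Fa-f]+) "
    r"(?P<area_category>.+?) "
    r"(?P<event_id>\w+) "
    r"(?P<level>" + LEVELS + r") "
    r"(?P<message>.*?) ?"
    r"(?P<correlation>[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12})$"
)


def parse_uls_line(line):
    """Return a dict of fields for one flattened ULS entry, or None if it does not match."""
    match = ULS_LINE.match(line.strip())
    return match.groupdict() if match else None


def filter_by_correlation(lines, correlation_id):
    """Yield parsed entries whose correlation ID matches (case-insensitive)."""
    wanted = correlation_id.lower()
    for line in lines:
        entry = parse_uls_line(line)
        if entry and entry["correlation"].lower() == wanted:
            yield entry


if __name__ == "__main__":
    sample = [
        "02/20/2015 11:04:02.61 w3wp.exe (0x0F38) 0x12F8 Document Management Server "
        "Discovery aaykw Verbose Beginning update of source group "
        "f0a1eb9c-dc8a-a041-f082-9b075765c29d",
        "02/20/2015 11:04:31.69 OWSTIMER.EXE (0x0778) 0x0DA0 Document Management Server "
        "Discovery aa45f High Preservation Timer job has started. "
        "f7a1eb9c-5cb7-a041-c352-5aa16e601c7d",
    ]
    for e in filter_by_correlation(sample, "f0a1eb9c-dc8a-a041-f082-9b075765c29d"):
        print(e["timestamp"], e["event_id"], e["level"], e["message"])
```

Entries without a trailing correlation ID are skipped by this sketch, which is fine for tracing a single request.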
Step 2: Wait or Run the eDiscovery In-Place Hold Processing Timer Job
The eDiscovery In-Place Hold Processing timer job is responsible for putting content on hold. While the title of the timer job is eDiscovery In-Place Hold Processing, its actual name is Preservation Processing Job. They are one and the same. By default the timer job runs hourly, which explains why it can take up to an hour for an eDiscovery Set to change from Processing status to On Hold status. This timer job has other functionality, but I’m primarily focused on its transactions when placing holds on SharePoint sites. First, it retrieves any pending discovery actions by calling into the Search Service Application, which reads them from the Search Admin DB’s MSSDiscoveryAction table. In our case, we have an action to place a hold on a site. Now that it has the site, it activates the Preservation Feature within the site. I’ll talk about the Preservation Feature in the next section. Once the Preservation Feature is activated on the site, the site is considered on hold. Next, the timer job cleans up the processed discovery actions by calling back into the Search Service Application and purging them from the Search Admin DB’s MSSDiscoveryAction table.
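To make the fetch, activate, and clean-up sequence easier to picture, here is a toy in-memory model in Python. It is only a mental model of the flow described above and not SharePoint code: ToySearchAdminDb, DiscoveryAction, and run_preservation_job are hypothetical names, the list stands in for the MSSDiscoveryAction table without reflecting its real schema, and feature activation is reduced to adding a string to a set.

```python
from dataclasses import dataclass, field


@dataclass
class DiscoveryAction:
    """Hypothetical shape of a pending action; the real table schema is not shown here."""
    action_type: str  # e.g. "CreateHold"
    site_url: str


@dataclass
class ToySearchAdminDb:
    """Stand-in for the MSSDiscoveryAction table in the Search Admin database."""
    actions: list = field(default_factory=list)

    def add_actions(self, new_actions):
        # Step 1: saving the eDiscovery Set queues discovery actions.
        self.actions.extend(new_actions)

    def get_actions(self):
        # Step 2a: the timer job fetches whatever is pending.
        return list(self.actions)

    def clear_actions(self, processed):
        # Step 2c: processed actions are purged.
        self.actions = [a for a in self.actions if a not in processed]


def run_preservation_job(db, site_features):
    """One pass of the toy timer job; returns the number of actions handled."""
    pending = db.get_actions()
    for action in pending:
        if action.action_type == "CreateHold":
            # Step 2b: "activate" the Preservation feature on the site.
            site_features.setdefault(action.site_url, set()).add("Preservation")
    db.clear_actions(pending)
    return len(pending)


if __name__ == "__main__":
    db = ToySearchAdminDb()
    db.add_actions([DiscoveryAction("CreateHold", "https://brew.contoso.com")])
    features = {}
    print(run_preservation_job(db, features))
    print(features)
    print(db.get_actions())
```

The point of the model is the ordering: until the job runs, the action just sits in the queue, which is why the eDiscovery Set stays in Processing until the next hourly run or a manual run of the job.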
Question: What do these transactions look like in ULS?
Answer:
Timer Job Starts
02/20/2015 11:04:31.69 OWSTIMER.EXE (0x0778) 0x0DA0 Document Management Server Discovery aa45f High Preservation Timer job has started. f7a1eb9c-5cb7-a041-c352-5aa16e601c7d
Fetching Discovery Actions
02/20/2015 11:04:31.73 OWSTIMER.EXE (0x0778) 0x0DA0 SharePoint Server Search Query dka1 High SearchServiceApplicationProxy::GetDiscoveryActions--Proxy Name:Search Service Application EndPoint: https://app2:32843/00abdc8662a148f281b00e0d072af186/SearchService.svc f7a1eb9c-5cb7-a041-c352-5aa16e601c7d
02/20/2015 11:04:31.74 OWSTIMER.EXE (0x0778) 0x0DA0 SharePoint Foundation Topology e5mc Medium WcfSendRequest: RemoteAddress: 'https://app2:32843/00abdc8662a148f281b00e0d072af186/SearchService.svc' Channel: 'Microsoft.Office.Server.Search.Administration.ISearchServiceApplication' Action: 'https://tempuri.org/IDiscoveryServiceApplication/GetDiscoveryActions' MessageId: 'urn:uuid:d2108254-889b-45e4-bb51-3af827b61286' f7a1eb9c-5cb7-a041-c352-5aa16e601c7d
02/20/2015 11:04:31.77 w3wp.exe (0x14E0) 0x1514 SharePoint Server Database tzkv Verbose SqlCommand: 'proc_MSS_GetDiscoveryActions' CommandType: StoredProcedure CommandTimeout: 0 Parameter: '@TimeNow' Type: DateTime Size: 0 Direction: Input Value: '02/20/2015 17:04:31' Parameter: '@RetVal' Type: Int Size: 0 Direction: ReturnValue Value: '' f7a1eb9c-5cb7-a041-c352-5aa16e601c7d
Retrieved Discovery Actions and Creating Hold by Activating Preservation Feature on Site
02/20/2015 11:04:31.84 OWSTIMER.EXE (0x0778) 0x0DA0 Document Management Server Discovery aa45l Verbose Processing CreateHold preservation action for site: https://brew.contoso.com, case ID: 579c3f2f-ce88-4b24-8fd6-9326a8752b9a, hold ID: df763eb6-fb2f-4d4f-ab32-fc4cf5c6371b f7a1eb9c-5cb7-a041-c352-5aa16e601c7d
02/20/2015 11:04:32.04 OWSTIMER.EXE (0x0778) 0x0DA0 SharePoint Foundation General 88jb Medium Feature Activation: Activating Feature 'Preservation' (ID: 'bfc789aa-87ba-4d79-afc7-0c7e45dae01a') at URL https://brew.contoso.com. f7a1eb9c-5cb7-a041-c352-5aa16e601c7d
02/20/2015 11:04:32.40 OWSTIMER.EXE (0x0778) 0x0DA0 SharePoint Foundation General 75f8 Medium Feature Activation: Feature 'Preservation' (ID: 'bfc789aa-87ba-4d79-afc7-0c7e45dae01a') was activated at URL https://brew.contoso.com. f7a1eb9c-5cb7-a041-c352-5aa16e601c7d
02/20/2015 11:04:32.40 OWSTIMER.EXE (0x0778) 0x0DA0 Document Management Server Discovery acaro Verbose Feature activation succeeded for bfc789aa-87ba-4d79-afc7-0c7e45dae01a f7a1eb9c-5cb7-a041-c352-5aa16e601c7d
Post Discovery Action Processing and Timer Job Complete
02/20/2015 11:04:35.12 OWSTIMER.EXE (0x0778) 0x0DA0 SharePoint Server Search Query dka1 High SearchServiceApplicationProxy::ClearDiscoveryActions--Proxy Name:Search Service Application EndPoint: https://app1:32843/00abdc8662a148f281b00e0d072af186/SearchService.svc f7a1eb9c-5cb7-a041-c352-5aa16e601c7d
02/20/2015 11:04:35.12 OWSTIMER.EXE (0x0778) 0x0DA0 SharePoint Foundation Monitoring nasq Verbose Entering monitored scope (ClearDiscoveryActions). Parent Timer Job PreservationProcessingJob f7a1eb9c-5cb7-a041-c352-5aa16e601c7d
02/20/2015 11:04:35.18 w3wp.exe (0x14E0) 0x1514 SharePoint Server Database tzkv Verbose SqlCommand: 'proc_MSS_ClearOldDiscoveryActions' CommandType: StoredProcedure CommandTimeout: 0 Parameter: '@CutoffDate' Type: DateTime Size: 0 Direction: Input Value: '01/21/2015 17:04:35' Parameter: '@RetVal' Type: Int Size: 0 Direction: ReturnValue Value: '' f7a1eb9c-5cb7-a041-c352-5aa16e601c7d
02/20/2015 11:04:35.18 OWSTIMER.EXE (0x0778) 0x0DA0 Document Management Server Discovery aa45k High Preservation Service has finished. Time Spent: 00:00:03.4851716 f7a1eb9c-5cb7-a041-c352-5aa16e601c7d
02/20/2015 11:04:35.19 w3wp.exe (0x1D40) 0x1D74 SharePoint Foundation Monitoring b4ly Medium Leaving Monitored Scope (ExecuteWcfServerOperation). Execution Time=22.7491 f7a1eb9c-5cb7-a041-c352-5aa16e601c7d
02/20/2015 11:04:35.20 OWSTIMER.EXE (0x0778) 0x0DA0 SharePoint Foundation Monitoring b4ly Medium Leaving Monitored Scope (Timer Job PreservationProcessingJob). Execution Time=3591.9877 f7a1eb9c-5cb7-a041-c352-5aa16e601c7d
02/20/2015 11:04:33.91 OWSTIMER.EXE (0x0778) 0x0DA0 Web Content Management Publishing mil3 Verbose Adding event receiver with name=PreservationWebEventReceiver and type=202. f7a1eb9c-5cb7-a041-c352-5aa16e601c7d
Note: The event receivers kick in whenever items are added, changed, or deleted, and copy the latest change into the Preservation Hold Library, which is leveraged when exporting content from the site via the eDiscovery Case.
Question: Wait, when does this Preservation Hold Library get created?
Answer: The Preservation Feature does not provision the Preservation Hold Library when the site is placed on hold. Instead, the library is provisioned when the first item is added, changed, or deleted. The event receiver has logic that checks whether the library is present and, if it isn’t, creates the Preservation Hold Library.
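Before moving on to validation, the timer job trace above can be sanity-checked with a little timestamp arithmetic. The Python sketch below uses only values copied from the log entries. The interpretation is my own reading of the trace, not documented behavior: I assume the stored procedure parameters are in UTC while the ULS timestamps are server local time, and that the same offset applies to the clean-up call logged at 11:04:35.

```python
from datetime import datetime

ULS_FORMAT = "%m/%d/%Y %H:%M:%S.%f"  # timestamp format of the ULS entries
SQL_FORMAT = "%m/%d/%Y %H:%M:%S"     # format of the stored procedure parameters


def parse_uls_time(text):
    """Parse a ULS timestamp such as '02/20/2015 11:04:31.69'."""
    return datetime.strptime(text, ULS_FORMAT)


# 1. Elapsed time between "Preservation Timer job has started" and
#    "Preservation Service has finished".
started = parse_uls_time("02/20/2015 11:04:31.69")
finished = parse_uls_time("02/20/2015 11:04:35.18")
elapsed = finished - started

# 2. Offset between the ULS entry for proc_MSS_GetDiscoveryActions and its
#    @TimeNow parameter (assumed to be local time versus UTC).
get_actions_logged = parse_uls_time("02/20/2015 11:04:31.77").replace(microsecond=0)
time_now_param = datetime.strptime("02/20/2015 17:04:31", SQL_FORMAT)
offset = time_now_param - get_actions_logged

# 3. How far back @CutoffDate reaches for proc_MSS_ClearOldDiscoveryActions,
#    assuming the same offset applies to the entry logged at 11:04:35.18.
clear_logged = parse_uls_time("02/20/2015 11:04:35.18").replace(microsecond=0)
cutoff_param = datetime.strptime("01/21/2015 17:04:35", SQL_FORMAT)
cutoff_age = (clear_logged + offset) - cutoff_param

if __name__ == "__main__":
    print(elapsed.total_seconds())  # 3.49
    print(offset)                   # 6:00:00
    print(cutoff_age)               # 30 days, 0:00:00
```

The 3.49 seconds lines up with the "Time Spent: 00:00:03.4851716" reported by the job, given that ULS timestamps are only shown to hundredths of a second. The cutoff date passed to the clean-up procedure sits exactly 30 days before the call.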
Site Hold Validation
Validating that a site is on hold is a simple yet important step to ensure the transactions I described above were processed without errors. The easiest way to validate is to open the associated eDiscovery Case site and review the In-Place Hold Status section:
It’s also possible to validate that a site is on hold with some PowerShell. Here are some examples:
Preservation Feature Active
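# Run from the SharePoint 2013 Management Shell; replace the placeholder URL
$site = Get-SPSite "url of site collection"
# Get-SPFeature -Site returns only the features enabled on that site collection
$feature = Get-SPFeature -Site $site | Where-Object { $_.DisplayName -eq "Preservation" }
$feature.Status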
Looks like:
Preservation Event Receivers Present
$site = Get-SPSite "url of site collection"
# List the names of the event receivers registered on the site collection
$site.EventReceivers | Select-Object Name
Looks like:
Preservation Hold Settings List is Present
$web = Get-SPWeb "url of site"
# Returns nothing if the list does not exist on this web
$list = $web.Lists | Where-Object { $_.Title -eq "Preservation Hold Settings" }
$list.Title
Looks like:
Alternative Approach
It’s also possible to call GetAllHolds() on the Microsoft.Office.RecordsManagement.Preservation.HoldSettings class by instantiating the object with $site as the constructor argument.
$site = Get-SPSite "url of site collection"
$record = New-Object Microsoft.Office.RecordsManagement.Preservation.HoldSettings($site)
$record.GetAllHolds()
Looks like:
Conclusion
That’s it for In-Place Hold processing behind the scenes. To recap, In-Place Hold processing leverages Search in the following ways:
- Adding sources within eDiscovery Sets calls Search to ensure the SharePoint site or sites are indexed
- The Search Admin database stores the pending discovery actions
- The eDiscovery In-Place Hold Processing timer job calls Search to retrieve and clear discovery actions
Finally, to track the activity in ULS like I did, set the following to Verbose:
Document Management Server\Discovery
SharePoint Server\Database
Web Content Management\Publishing
SharePoint Foundation\Fields
SharePoint Foundation\General
Thanks,
Russ Maxwell, MSFT