Viewing crawl logs in SharePoint Online (Office 365)
The low-level details of crawling and indexing are not obviously available in a SharePoint Online site, but can be exposed via an eDiscovery Center site. The procedure detailed below describes how this information can be made accessible to a SharePoint administrator.
In SharePoint Online in an Office 365 Enterprise environment there is no direct administrative control over the configuration of crawling and indexing. Unlike an on-premises SharePoint 2013 farm, there is no visible Search Service Application (SSA), there are no Content Sources to configure, and there are no crawl rules to test and deploy. Despite these differences, the content of an SPO tenant does get crawled and indexed, often becoming searchable within an hour of being added to a site.
While there is no explicit management page available with a button marked "Crawl Logs", the equivalent information is available via an eDiscovery Center site. The following steps show how to set up a simple case that specifies a global source (i.e. the root site collection) and a generic query of just the tenant's domain name (pfecloud), which will capture the crawl/indexing details for the entire SharePoint Online site. A larger organization with several large subsites might want to set up multiple cases in order to gather crawling details at a more fine-grained level, especially if multiple administrators are involved, each with a different focus and timeframe of interest.
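The reasoning behind using the tenant name as a catch-all query term can be illustrated with a small Python sketch. It only shows the string relationship between a tenant name and the hostname of a page URL; it is not how the search index actually matches terms. The helper name `url_belongs_to_tenant` and the example URLs are made up for illustration, and the sketch assumes the usual `<tenant>.sharepoint.com` hostname form.

```python
from urllib.parse import urlparse


def url_belongs_to_tenant(url, tenant):
    """Return True if the first label of the URL's hostname is the tenant name.

    Hypothetical helper: assumes hostnames of the form <tenant>.sharepoint.com.
    """
    host = urlparse(url).hostname or ""  # hostname is lower-cased by urlparse
    first_label = host.split(".")[0]
    return first_label == tenant.lower()


# Made-up example URLs for a tenant named "pfecloud".
print(url_belongs_to_tenant("https://pfecloud.sharepoint.com/sites/hr/Doc.docx", "pfecloud"))
print(url_belongs_to_tenant("https://example.com/pfecloud/Doc.docx", "pfecloud"))
```

Because every page in the tenant carries the tenant name in its hostname, a query on that single term is a reasonable way to cast the widest possible net.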
1. Create an eDiscovery Center site collection, as a new private site collection with the appropriate Enterprise template.
2. Create a new Case to get started, with an easy-to-recognize name.
3. Within this new case, we'll configure first a Source, specifying the root site collection path, and then a Query. In our case, we'll use a general term to match all entries, using the tenant name that becomes part of the hostname in each page's URL.
After we run the search, the SharePoint tab is populated with the matches, which should be every site, list, and document in this SharePoint Online site. This shows all the URLs that will be reported in the crawl logs.
4. The next step exports and downloads the report for this case. The exporter generates a directory structure that includes CSV files (readable with Excel) that separately record successful and failed crawl and indexing attempts. There is an Export option directly from the bottom of the Query page after the search has completed, or it can be created later as a new item in the Exports list of the eDiscovery Center.
The wizard prompts for the Query (created above) and then offers several options, enabling us to include multiple versions of SharePoint documents and items that might otherwise be left out of the report.
Once the export is created, we are given the option to download Results or Report. The Results download consists of all the matching contents of the search, which can be a significant amount of data. The crawl log information is part of the Report download.
The selection starts a Download Manager process, which prompts for storage location and requires a login to the Office 365 site for authentication.
5. The summary provides an overview of the Search and Export procedure. The two remaining files make up the Crawl Log data that is neatly divided into failures and successes.
If at first glance the data in the Errors spreadsheet looks somewhat bizarre, it may just be a function of the default column width.
This misalignment is easily corrected via the column auto-fit, by A) clicking on the select all tag in the upper left corner of the worksheet, and B) double-clicking any one of the column dividers.
The SharePoint Results file shows those documents successfully crawled and indexed from the SPO site, and some metadata associated with them. While the details do not align precisely with the information available in an on-premises Crawl Log (e.g. there is no per-URL "Last Time Crawled" or Crawl Rate for a Content Source), there is sufficient detail to understand what is and isn't searchable for a SharePoint Online site and, in the case of errors, a "Last Crawl Attempt" timestamp.
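Since the exported report files are plain CSV, they can also be summarized with a short script instead of Excel. The following is a minimal Python sketch, not part of the export tooling. The column names `Url` and `Error` and the sample rows are assumptions made up for illustration; check the header row of the actual exported file and substitute the real column name.

```python
import csv
import io
from collections import Counter


def summarize_column(csv_text, column):
    """Count how often each value appears in one column of CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row[column] for row in reader)


# Made-up sample standing in for the exported errors file.
# The column names "Url" and "Error" are hypothetical.
sample = """Url,Error
https://pfecloud.sharepoint.com/a.docx,Access denied
https://pfecloud.sharepoint.com/b.docx,Timeout
https://pfecloud.sharepoint.com/c.docx,Access denied
"""

counts = summarize_column(sample, "Error")
for message, n in counts.most_common():
    print(n, message)
```

To run this against a real export, read the file with `open(path, newline="", encoding="utf-8-sig")` and pass its contents to `summarize_column`. Grouping failures by message this way gives a quick view of which kinds of errors dominate.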
References:
"View search diagnostics in SharePoint Server 2013"
- https://technet.microsoft.com/en-us/library/jj219611.aspx
"eDiscovery in Office 365"
- https://technet.microsoft.com/en-us/library/dn790281.aspx
"Set up an eDiscovery Center in SharePoint Online"
"Export eDiscovery content and create reports"