SharePoint 2016: retrieving crawl results using PowerShell

Introduction

Crawl results for a site collection revealed a significant number of warnings and errors needing the site collection administrator's attention.  To facilitate review and analysis, the site collection's crawl results were exported to a spreadsheet via PowerShell and then provided to the site collection administrator in that format. This tip shows how to do this. Note the following presumptions (a quick way to check them is sketched after the list):

  • The farm has a single search service application deployed
  • The site collection is hosted in a dedicated web application
  • Farm search content sources are individualized by web application
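If you want to verify these presumptions first, the following is a minimal sketch using the standard search cmdlets; it simply lists the search service applications in the farm and the content sources (with their start addresses) defined for the one found:

# List all search service applications in the farm; a single result confirms the first presumption
Get-SPEnterpriseSearchServiceApplication | Select-Object Name, Id

# List each content source and its start addresses to confirm they are individualized by web application
Get-SPEnterpriseSearchCrawlContentSource -SearchApplication (Get-SPEnterpriseSearchServiceApplication) | Select-Object Name, Id, StartAddresses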

Get content source crawl log references

These first statements get references to the target content source and to that content source's crawl log.  Both references are needed to retrieve the crawl log information used in the remaining steps.

[void][System.Reflection.Assembly]::LoadWithPartialName("Microsoft.Office.Server.Search")   # load the assembly containing the CrawlLog class

$ssa = Get-SPEnterpriseSearchServiceApplication -Identity "[search service application name]"

$ContentSource = Get-SPEnterpriseSearchCrawlContentSource -SearchApplication $ssa | ? { $_.Name -eq "[content source name]"}

$log = New-Object Microsoft.Office.Server.Search.Administration.CrawlLog $ssa

Get content source crawl status

This statement gets a listing of content sources and the status of their crawls:

Get-SPEnterpriseSearchCrawlContentSource -SearchApplication $ssa

Use this statement to get the same information for just the single content source identified earlier:

$ContentSource
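If only the status fields are of interest, the output can be narrowed. This is a minimal sketch and assumes the content source object exposes the CrawlState, CrawlStarted, and CrawlCompleted properties:

# Show just the crawl status fields for the target content source
$ContentSource | Select-Object Name, CrawlState, CrawlStarted, CrawlCompleted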

Get content source crawl statistics

This statement gets a list of the total number of crawl successes, warnings, errors, and so on for a specific content source.  It returns the same information seen when navigating to the search service application's Crawl Log - Content Source page, but for just that content source.

$log.GetCrawlStatisticsByHost("[content source name]")
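Since the goal here is a spreadsheet, these statistics can be exported the same way as the detailed results below. This is a sketch only, assuming the returned rows serialize cleanly through Export-CSV (the output path is illustrative):

# Export the per-content-source crawl statistics to their own CSV file
$log.GetCrawlStatisticsByHost("[content source name]") | Export-CSV -Path "D:\Temp\CrawlStatistics.csv" -NoTypeInformation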

Export content source crawl log results

This statement exports all of the crawl results for a single content source into a CSV file:

$log.GetCrawledUrls($false, 1000000, $null, $false, $ContentSource.Id, -1, -1, [System.DateTime]::MinValue, [System.DateTime]::MaxValue) | Export-CSV -Path "D:\Temp\CrawlResults.csv" -NoTypeInformation
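The third and fourth arguments ($null and $false above) accept a URL filter string and a flag controlling prefix matching. The following sketch assumes that passing a URL prefix together with $true restricts the export to matching URLs; verify the exact matching behavior against the references:

# Export only results whose URL matches the supplied prefix (the prefix is a placeholder)
$log.GetCrawledUrls($false, 1000000, "http://[web application url]", $true, $ContentSource.Id, -1, -1, [System.DateTime]::MinValue, [System.DateTime]::MaxValue) | Export-CSV -Path "D:\Temp\CrawlResults.csv" -NoTypeInformation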

To export a listing of just the successful crawl results, change the error level argument (the sixth parameter, -1 above) to 0:

$log.GetCrawledUrls($false, 1000000, $null, $false, $ContentSource.Id, 0, -1, [System.DateTime]::MinValue, [System.DateTime]::MaxValue) | Export-CSV -Path "D:\Temp\CrawlResults.csv" -NoTypeInformation
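Because the original motivation was reviewing warnings and errors, the same error level argument can also be used to narrow the export to just those results. The values used below (1 for warnings, 2 for errors) are assumptions based on common usage of this method; confirm them against the references:

# Export only crawl warnings for the content source (error level assumed to be 1)
$log.GetCrawledUrls($false, 1000000, $null, $false, $ContentSource.Id, 1, -1, [System.DateTime]::MinValue, [System.DateTime]::MaxValue) | Export-CSV -Path "D:\Temp\CrawlWarnings.csv" -NoTypeInformation

# Export only crawl errors for the content source (error level assumed to be 2)
$log.GetCrawledUrls($false, 1000000, $null, $false, $ContentSource.Id, 2, -1, [System.DateTime]::MinValue, [System.DateTime]::MaxValue) | Export-CSV -Path "D:\Temp\CrawlErrors.csv" -NoTypeInformation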

To export a listing of all crawl results for all content sources provisioned for the search service application, pass -1 as the content source ID:

$log.GetCrawledUrls($false, 1000000, $null, $false, -1, -1, -1, [System.DateTime]::MinValue, [System.DateTime]::MaxValue) | Export-CSV -Path "D:\Temp\CrawlResults.csv" -NoTypeInformation
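If these exports are run regularly, wrapping the call in a small helper keeps the long argument list in one place. The function below is a hypothetical convenience wrapper (the name Export-CrawlResults and its parameters are not part of the product) built from the same calls shown above:

function Export-CrawlResults {
    param(
        [Parameter(Mandatory=$true)] $CrawlLog,     # the CrawlLog object created earlier
        [int]    $ContentSourceId = -1,             # -1 = do not filter by content source
        [int]    $ErrorLevel      = -1,             # -1 = do not filter by error level
        [string] $Path            = "D:\Temp\CrawlResults.csv"
    )
    $CrawlLog.GetCrawledUrls($false, 1000000, $null, $false, $ContentSourceId, $ErrorLevel, -1, [System.DateTime]::MinValue, [System.DateTime]::MaxValue) | Export-CSV -Path $Path -NoTypeInformation
}

# Example: export everything for the single content source identified earlier
Export-CrawlResults -CrawlLog $log -ContentSourceId $ContentSource.Id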

Additional details on useful filtering parameters can be found in the references.

References

Notes

  • tbd