Exporting SharePoint 2010 Search Crawl logs

In SharePoint 2007 we provided an object model for accessing the SharePoint crawl logs: the LogViewer class [https://msdn.microsoft.com/en-us/library/microsoft.office.server.search.administration.logviewer(v=office.12).aspx]. A sample application that uses it is available here - https://msdn.microsoft.com/en-us/library/cc751807(office.12).aspx

The LogViewer class is still present in SharePoint 2010 and is documented at https://msdn.microsoft.com/en-us/library/microsoft.office.server.search.administration.logviewer(v=office.14).aspx.

Note: This functionality is marked as obsolete. This means that in a future product release we might change or remove this functionality completely.

PowerShell scripting makes using this functionality much easier. Here are two samples that use PowerShell to export the SharePoint crawl logs.
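Both samples assume the Microsoft.SharePoint.PowerShell snap-in is available, which is already the case in the SharePoint 2010 Management Shell. If you run them from a plain PowerShell console instead, a small check like the following (a minimal setup sketch) loads the snap-in first:

#Load the SharePoint snap-in if it is not already loaded (not needed in the SharePoint 2010 Management Shell)
if ((Get-PSSnapin -Name Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue) -eq $null)
{
    Add-PSSnapin Microsoft.SharePoint.PowerShell
}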

============================================================================
Powershell script to pull all the crawl logs and display based on errorId
============================================================================
#Replace "Search Service Application" in the script with the exact name of the SSA that you browse to for viewing the crawl log.
#With FAST you have multiple Search SSA’s and hence specify the name of the SSA that you use to view the crawl log data.
$ssa = Get-SPEnterpriseSearchServiceApplication | Where-Object {$_.Name -eq "Enter Name of Search Service Application which has the Crawl Log Data"}
#This should list only one SSA object.
$ssa
#Create a LogViewer object associated with that SSA
$logViewer = New-Object Microsoft.Office.Server.Search.Administration.LogViewer $ssa
#Get a List of all errors/warnings in the Crawl Log
$ErrorList = $logViewer.GetAllStatusMessages() | Select ErrorId
#Loop through each type of error and pull that data
Foreach ($errorId in $ErrorList)
{
    $crawlLogFilters = New-Object Microsoft.Office.Server.Search.Administration.CrawlLogFilters
    #Filter based on the Error Id
    $crawlLogFilters.AddFilter("MessageId", $errorId.errorId)
    "Pulling data for Message ID : " + $errorId.errorId
    $nextStart = 0
    $urls = $logViewer.GetCurrentCrawlLogData($crawlLogFilters, ([ref] $nextStart))
    #Data from the crawl log is returned in the DataTable $urls, one page (50 records by default) at a time.
    #$nextStart is set to the start index of the next page, or -1 when there is no more data.
    $urls.Rows.Count
    #Keep paging until all records for this error id have been retrieved
    while ($nextStart -ne -1)
    {
        $crawlLogFilters.AddFilter("StartAt", $nextStart)
        $nextStart = 0
        $urls = $logViewer.GetCurrentCrawlLogData($crawlLogFilters, ([ref] $nextStart))
        $urls.Rows.Count
    }
}
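The script above only prints row counts. To actually export the data, the same paging pattern can feed Export-Csv; the sketch below collects every page for one message id (377 is just an example id, the one used in the second script below) into an array and writes it to a CSV file whose name is an assumption.

#Sketch: export all crawl log rows for one message id to a CSV file (file name is an assumption)
$allRows = @()
$crawlLogFilters = New-Object Microsoft.Office.Server.Search.Administration.CrawlLogFilters
$crawlLogFilters.AddFilter("MessageId", 377)
$nextStart = 0
$urls = $logViewer.GetCurrentCrawlLogData($crawlLogFilters, ([ref] $nextStart))
$allRows += $urls.Rows
while ($nextStart -ne -1)
{
    $crawlLogFilters.AddFilter("StartAt", $nextStart)
    $nextStart = 0
    $urls = $logViewer.GetCurrentCrawlLogData($crawlLogFilters, ([ref] $nextStart))
    $allRows += $urls.Rows
}
#PowerShell exposes each DataRow's columns as properties, so the rows (plus a few DataRow
#bookkeeping properties) can be piped straight to Export-Csv.
$allRows | Export-Csv -Path .\CrawlLogExport.csv -NoTypeInformation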

============================================================================
Powershell script to filter based on the Url
============================================================================
$ssa = Get-SPEnterpriseSearchServiceApplication | Where-Object {$_.Name -eq "Search Service Application"}
$ssa
$logViewer = New-Object Microsoft.Office.Server.Search.Administration.LogViewer $ssa
$logViewer.GetAllStatusMessages()
$crawlLogFilters = New-Object Microsoft.Office.Server.Search.Administration.CrawlLogFilters
#Use the Url property with the Contains operator to filter by URL
$urlProp = [Microsoft.Office.Server.Search.Administration.CrawlLogFilterProperty]::Url
$stringOp = [Microsoft.Office.Server.Search.Administration.StringFilterOperator]::Contains
$crawlLogFilters.AddFilter("MessageId", 377)
$crawlLogFilters.AddFilter($urlProp, $stringOp, "serverurl")
$i = 0
$urls = $logViewer.GetCurrentCrawlLogData($crawlLogFilters, ([ref] $i))
#Data from the crawl log will be available in the DataTable $urls. 
$urls.Rows.Count
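As with the first script, the filtered rows can be exported instead of just counted; a one-line sketch (the file name is an assumption):

#Export the URL-filtered crawl log rows to a CSV file (file name is an assumption)
$urls.Rows | Export-Csv -Path .\CrawlLogByUrl.csv -NoTypeInformation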

============================================================================

Comments

  • Anonymous
    October 26, 2011
    This is all good, but what is the replacement for LogViewer in SP 2010? I need similar functionality but don't want to use an obsolete class. I see that CrawlLogFilters is still there, but how do we utilize it without LogViewer? Any idea?

  • Anonymous
    November 09, 2011
    The LogViewer class is obsolete but still works in SP2010. You may continue to use it for now, though this means that in a future product release we might change or remove this functionality completely.  

  • Anonymous
    October 31, 2013
    Hi, the Microsoft.Office.Server.Search.Administration.LogViewer class is marked as obsolete in the new version. Does anyone know what the new version of the LogViewer class is? Regards, Sudhir

  • Anonymous
    May 17, 2016
    How can I filter the crawl log based on dates?

  • Anonymous
    May 17, 2016
    How can I export the crawl log based on dates?

    • Anonymous
      September 26, 2016
      Hello Gerg, the script below may help you.

      $errorsFileName = ".\AllCrawlLogs.csv"
      $ssa = Get-SPEnterpriseSearchServiceApplication -Identity "<Name of the Search Service Application>"
      $logs = New-Object Microsoft.Office.Server.Search.Administration.CrawlLog $ssa
      $logs.GetCrawledUrls($false,10000,"",$false,1,2,-1, [System.DateTime]::MinValue, [System.DateTime]::MaxValue) | export-csv -notype $errorsFileName

      The parameters
      The GetCrawledUrls method has the following signature:
      GetCrawledUrls(bool getCountOnly, long maxRows, string urlQueryString, bool isLike, int contentSourceID, int errorLevel, int errorID, DateTime startDateTime, DateTime endDateTime)
      1. Return value - DataTable
      2. getCountOnly - If true, returns only the count of URLs matching the given parameters.
      3. maxRows - Specifies the number of rows to be retrieved.
      4. urlQueryString - The prefix value to be used for matching the URLs.
      5. isLike - If true, all URLs that start with 'urlQueryString' will be returned.
      6. contentSourceID - The ID of the content source for which crawl logs should be retrieved. If -1 is specified, URLs will not be filtered by content source.
      7. errorLevel - Only URLs with the specified error level will be returned. Possible values: -1 : do not filter by error level; 0 : return only successfully crawled URLs; 1 : return URLs that generated a warning when crawled; 2 : return URLs that generated an error when crawled; 3 : return URLs that have been deleted; 4 : return URLs that generated a top level error.
      8. errorID - Only URLs with this error ID will be returned. If -1 is supplied, URLs will not be filtered by error ID.
      9. startDateTime - Start date/time. Logs after this date are retrieved.
      10. endDateTime - End date/time. Logs till this date are retrieved.
      Reference: http://spdeveloper.co.in/sharepoint2013/export-search-crawl-log-to-excel.aspx
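Building on the script in the last comment, the same GetCrawledUrls call can be limited to a date range by passing real start and end dates instead of DateTime MinValue/MaxValue. A minimal sketch (the SSA name placeholder, the seven-day window, and the output file name are assumptions; -1 disables the content source, error level, and error id filters as described above):

#Sketch: export only the crawl log entries from the last seven days
$ssa = Get-SPEnterpriseSearchServiceApplication -Identity "<Name of the Search Service Application>"
$logs = New-Object Microsoft.Office.Server.Search.Administration.CrawlLog $ssa
$startDate = (Get-Date).AddDays(-7)
$endDate = Get-Date
#-1 for contentSourceID, errorLevel and errorID means those filters are not applied
$logs.GetCrawledUrls($false, 10000, "", $false, -1, -1, -1, $startDate, $endDate) | Export-Csv -Path .\CrawlLogLastWeek.csv -NoTypeInformation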