SharePoint 2013: Case Study on Optimizing the Search Crawl Interval


Introduction

This posting captures the steps performed to optimize the search crawl interval for a small production SharePoint 2013 farm. Content included approximately 100,000 items in two content databases, totaling approximately 60 GB. The customer farm served approximately 320 users, of which about 30% engaged the farm on any given workday. Working hours were from 7 AM to 7 PM. Query usage was minimal. However, a number of new search-driven web parts had been implemented, and other content query web parts were being migrated to search-driven implementations. The move to search-driven web parts was precipitated by an earlier upgrade to SharePoint 2013.

Originally, a crawl interval of 2 hours had been adequate. However, after the migration to 2013, the customer-implemented search-driven web parts required a more current index. Before implementing a shorter crawl interval, the relative impacts of decreasing the crawl interval, and even of using continuous crawl, were explored to identify the best crawl interval.

Much more data was collected than is presented here; this posting presents just the most salient charts and data. The study found that a crawl interval of 30 minutes was optimal for the customer content. Further decreases in the crawl interval did not yield any significant gains in crawl freshness. Neither did the use of continuous crawling, which was an unexpected result.

Method

Two tools were used to obtain the data for this analysis: SharePoint Search Crawl Health Reports and Windows Server Performance Monitor.

Data

The first step in this study was to obtain relevant baseline configuration data and performance data associated with the existing crawl intervals. Baseline configuration data was for a single local content source.

Baseline Configuration

Search Component Topology

Server Name Admin Crawler Content Processing Analytics Processing Query Processing Index Partition
APP1    
WFE1        
WFE2        

Crawling Intervals

Type Schedule
Full 10 PM Sundays
Incremental 7 AM - 7 PM, every two hours, Monday - Friday

Baseline Performance

One Day Crawl History

Category Chart Comment
CPU and Memory

This chart shows the impact of crawling on CPU and memory usage over the typical workday, from 7 AM to 7 PM. The spikes in the brown line are the incremental crawls, which ran every two hours. Memory usage averaged 50%. The memory allocation shown here is for all services, not solely for crawl. The dip in the green line is the daily SharePoint Timer reset.
Processes
MSSDmn is the daemon process that loads the protocol handler needed to connect to and fetch the content, and then passes the content to the MSSearch process for further processing. MSSearch is the search engine process. NodeRunner is the process that handles crawled items.
Percentage CPU Usage

Note the one-to-one correspondence between NodeRunner memory usage in the above graph and CPU usage in this one.

Analysis: these one-day charts represent the typical usage observed on Monday - Friday workdays.

Crawl History Over 8 Days

Category Chart Comment
CPU and Memory

These 8-day charts present fairly consistent CPU and memory impacts from the crawl component.
Processes
The cause of the generally increasing NodeRunner resource usage was not determined.
Percentage CPU Usage

CPU usage frequently peaks at 60% for full crawls. Incremental crawls peak at 30%.

Analysis: these results demonstrated to me that application server load was well within acceptable limits, and that these results were consistent.

Performance Testing

60 Minute Incremental Crawls

In this test, the incremental crawl interval was changed to one hour, and the effect of this change was monitored over a workday and then over a one-week period.

Category Chart Comment
Processes

Resource usage by Search processes over a typical workday period (7-7) remains consistent with baseline.
Percentage CPU Usage

Workday period percentage CPU usage peaks seem elevated a bit but are still generally consistent with baseline.
Percentage CPU Usage

Looking at this over a full week, percentage CPU usage peaking is definitely elevated a bit, but not significantly.

Reviewing the Crawl Freshness results found them to be consistent with a one-hour crawl interval:

Summary
Content Source: Local SharePoint sites
Aggregate Freshness: <1h
# Documents: 155

Distribution by Freshness
<10m  <30m  <1h   <4h   <12h  <1D   <2D   <3D   >3D
17%   60%   95%   98%   98%   98%   98%   100%  100%

(m = minute, h = hour, D = day)

The way to interpret this table is something like: "17% of all changes made to documents in SharePoint during the last 10 minutes are fully indexed; 60% of all changes to documents made during the last half hour are fully indexed..." and so on.
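The reading above can be sketched in Python. The row is the one reported for the 60-minute test; the function names are illustrative, and the 90% threshold used to derive an aggregate bucket is purely an assumption, since the report does not state how it computes Aggregate Freshness.

```python
# Cumulative freshness buckets, in the order the report lists them.
BUCKETS = ["<10m", "<30m", "<1h", "<4h", "<12h", "<1D", "<2D", "<3D", ">3D"]

# Row reported above for the 60-minute test (cumulative percentages).
ROW_60_MIN = [17, 60, 95, 98, 98, 98, 98, 100, 100]


def share_within(row, bucket):
    """Return the cumulative percentage of changes indexed within `bucket`."""
    return row[BUCKETS.index(bucket)]


def aggregate_bucket(row, threshold=90):
    """Return the smallest bucket whose cumulative share reaches `threshold`.

    ASSUMPTION: the 90% threshold is hypothetical. The report does not
    document how it derives "Aggregate Freshness".
    """
    for bucket, pct in zip(BUCKETS, row):
        if pct >= threshold:
            return bucket
    return BUCKETS[-1]


print(share_within(ROW_60_MIN, "<30m"))  # 60
print(aggregate_bucket(ROW_60_MIN))      # <1h
```

Under this assumed threshold the sketch returns the same aggregate value as the report shows for this row, but that agreement does not confirm how SharePoint actually calculates it.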

Analysis: No significant crawl-related adverse trends and impacts observed for this crawl interval.

30-minute Incremental Crawls

In this performance test, the crawl interval was changed to 30 minutes and then monitored over a one-day period.

Category Chart Comment
CPU and Memory

This particular chart is provided to help illustrate the regular crawl intervals and to show the slight elevation of total CPU usage associated with the first incremental crawl of the day. In general, total CPU and memory usage over a typical workday remains fully consistent with baseline.
Processes

Resource usage by Search processes over a typical workday shows no substantive difference, even though the crawl interval was significantly reduced from 2 hours to half an hour.
Percentage CPU

The same holds for percentage CPU and memory usage over a typical workday. The regular spacing of CPU usage spikes is consistent with the initiation of incremental crawls.

Review of the Crawl Freshness results found them consistent with a 30-minute crawl interval:

Summary
Content Source: Local SharePoint sites
Aggregate Freshness: <30m
# Documents: 200

Distribution by Freshness
<10m  <30m  <1h   <4h   <12h  <1D   <2D   <3D   >3D
53%   96%   100%  100%  100%  100%  100%  100%  100%

Analysis: no significant crawl-related adverse trends and impacts observed for reducing the incremental crawl interval to 30 minutes; content freshness improved.

15-Minute Incremental Crawls

In this performance test, the crawl interval was changed to 15 minutes and then monitored over a one-day period.

Category Chart Comment
CPU and Memory

No significant change observed in total CPU and memory  for 15 minute incremental crawl intervals over a typical workday.  Incremental crawls are regularly spaced.
Processes

And no significant changes observed in Search process resource usage over the same workday period.
Percentage CPU

However, some irregularity in percentage CPU usage is becoming apparent. Without a detailed understanding of the underlying Search processes, this chart appears to show a decline in the ability of content processing to keep pace with crawling, with the two beginning to overlap.
Percentage CPU

This chart explores this more deeply by examining shorter time intervals in the chart.  In this chart, the time interval has been narrowed to one incremental crawl interval, or 15 minutes.  Note how percentage CPU usage associated with content processing is increasingly squeezed in between successive incremental crawls.  Previous charts showed content processing completing well before the onset of the next incremental crawl.  However, as the crawl interval is decreased, there is less and less time to initiate and complete processing of the crawled content and efficiency seems to decline.
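The squeeze described above can be illustrated with a small Python sketch that computes how much of each crawl interval is left idle once crawling and content processing are done. The crawl and processing durations below are hypothetical values chosen only for illustration; the study did not report these durations.

```python
def idle_fraction(interval_min, crawl_min, processing_min):
    """Fraction of each crawl interval left idle after crawl + processing.

    A negative result means work from one crawl would spill into the next.
    """
    busy = crawl_min + processing_min
    return (interval_min - busy) / interval_min


# HYPOTHETICAL durations, for illustration only (not measured values).
CRAWL_MIN = 4
PROCESSING_MIN = 8

for interval in (120, 60, 30, 15, 10):
    frac = idle_fraction(interval, CRAWL_MIN, PROCESSING_MIN)
    status = "overlap" if frac < 0 else "ok"
    print(f"{interval:>3} min interval: idle {frac:6.1%} ({status})")
```

With fixed work per crawl, the idle share shrinks as the interval shrinks and eventually goes negative, which is the kind of overlap the chart seems to hint at.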

Review of the Crawl Freshness results found them inconsistent with a 15-minute crawl interval.  While there was minor improvement in the freshness distribution, the aggregate freshness did not improve, and this finding seems consistent with what was revealed in the percentage CPU chart above.  This was an unexpected outcome.

Summary
Content Source: Local SharePoint sites
Aggregate Freshness: <30m
# Documents: 88

Distribution by Freshness
<10m  <30m  <1h   <4h   <12h  <1D   <2D   <3D   >3D
78%   100%  100%  100%  100%  100%  100%  100%  100%

These results invited wider analysis, so Windows Performance Monitor was used to capture CPU usage on the primary WFE, and the results were analyzed.

Category Chart Comment
PerfMon: CPU Usage

As a matter of interest, the % Processor Time counter was used in PerfMon to capture CPU usage data from an OS perspective as opposed to an application (SharePoint) perspective. The % Processor Time (total) counter is shown in purple.
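For readers who export a PerfMon counter log to CSV, a minimal Python sketch for summarizing the % Processor Time column might look like the following. The sample log, its timestamps, and its values are invented for illustration, and the function name is hypothetical; only the general layout (a timestamp column followed by one column per counter path) reflects the PerfMon CSV format.

```python
import csv
import io

# HYPOTHETICAL sample in the CSV layout used by Performance Monitor logs:
# a timestamp column, then one column per counter. All values are invented.
SAMPLE_LOG = r'''"(PDH-CSV 4.0) (Eastern Standard Time)(300)","\\WFE1\Processor(_Total)\% Processor Time"
"01/15/2014 07:00:00.000","12.5"
"01/15/2014 07:00:15.000","41.0"
"01/15/2014 07:00:30.000"," "
"01/15/2014 07:00:45.000","18.5"
'''


def summarize_counter(log_text, counter_suffix):
    """Return (mean, peak) for the first column whose header ends with counter_suffix."""
    rows = list(csv.reader(io.StringIO(log_text)))
    header, samples = rows[0], rows[1:]
    col = next(i for i, name in enumerate(header) if name.endswith(counter_suffix))
    values = []
    for row in samples:
        try:
            values.append(float(row[col]))
        except (ValueError, IndexError):
            continue  # skip blank or malformed samples
    return sum(values) / len(values), max(values)


mean, peak = summarize_counter(SAMPLE_LOG, r"\% Processor Time")
print(f"mean {mean:.1f}%, peak {peak:.1f}%")
```

Blank samples are skipped rather than treated as zero so that a missing reading does not pull the average down.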

Analysis: no significant crawl-related adverse trends and impacts observed; crawl processes seem now close to overlapping; no improvement in crawl freshness.

10-minute Incremental Crawls

In this test, the crawl interval was reduced to 10 minutes, and then monitored over the course of a workday.  Interestingly, rather than improving indexed content freshness, there seemed to be a slight degradation: while aggregate freshness remained the same, freshness distribution degraded somewhat.

Summary
Content Source: Local SharePoint sites
Aggregate Freshness: <30m
# Documents: 244

Distribution by Freshness
<10m  <30m  <1h   <4h   <12h  <1D   <2D   <3D   >3D
78%   93%   93%   93%   94%   100%  100%  100%  100%

Analysis: no real improvement was found from reducing the incremental crawl interval to 10 minutes.

Continuous Crawl

In this test, the crawl interval was set to "continuous", and then monitored over a workday.

Category Chart Comment
Processes

Drilling deeper into the Processes chart unexpectedly showed crawls starting approximately every 15 minutes, so the term "continuous" was less than accurate. The data suggested that a "continuous crawl" is not continuous in the strict sense: "continuous crawls" appeared to be standard incremental crawls in which the start of the next crawl was not timer-dependent but perhaps dependent on completion of all of the processes associated with the previous incremental crawl.
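One way to estimate an effective crawl interval from a chart like this is to note the times of successive activity spikes and take the median gap. The Python sketch below assumes spike times have been read off the chart by hand; the times shown are hypothetical and are not values from the study.

```python
from datetime import datetime
from statistics import median

# HYPOTHETICAL spike times read off a chart; not values from the study.
SPIKES = ["09:01", "09:16", "09:30", "09:46", "10:01", "10:15"]


def median_gap_minutes(times):
    """Median gap, in minutes, between consecutive HH:MM timestamps."""
    stamps = [datetime.strptime(t, "%H:%M") for t in times]
    gaps = [(b - a).total_seconds() / 60 for a, b in zip(stamps, stamps[1:])]
    return median(gaps)


print(median_gap_minutes(SPIKES))  # 15.0
```

The median is used instead of the mean so that a single missed or mistimed spike does not distort the estimate.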

No substantive improvement in aggregate freshness was found in comparison to the 10 minute crawl interval. 

Summary
Content Source: Local SharePoint sites
Aggregate Freshness: <30m
# Documents: 44

Distribution by Freshness
<10m  <30m  <1h   <4h   <12h  <1D   <2D   <3D   >3D
86%   100%  100%  100%  100%  100%  100%  100%  100%

What was found from this test was that the effective outcome of setting the incremental crawl interval to "continuous" was the same as setting the incremental crawl interval to 15 minutes.

Analysis

Plotting Aggregate Freshness over these various crawl interval types produced the following chart:

Chart Comment

tbd
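As a rough text-only stand-in for such a plot, the Python sketch below tabulates the aggregate freshness and the <10m and <30m shares reported in the Crawl Freshness tables above. The helper name and the bar rendering are illustrative choices, not part of the original study.

```python
# Values copied from the Crawl Freshness tables in this article.
# Each entry: (crawl setting, aggregate freshness, % indexed <10m, % indexed <30m)
RESULTS = [
    ("60 min",     "<1h",  17, 60),
    ("30 min",     "<30m", 53, 96),
    ("15 min",     "<30m", 78, 100),
    ("10 min",     "<30m", 78, 93),
    ("continuous", "<30m", 86, 100),
]

# Upper bound of each aggregate-freshness bucket, in minutes.
BUCKET_MINUTES = {"<10m": 10, "<30m": 30, "<1h": 60}


def first_setting_at_best(results):
    """Return the first setting (in test order) that reaches the best aggregate bucket."""
    best = min(BUCKET_MINUTES[agg] for _, agg, _, _ in results)
    for setting, agg, _, _ in results:
        if BUCKET_MINUTES[agg] == best:
            return setting


for setting, agg, lt10, lt30 in RESULTS:
    bar = "#" * (BUCKET_MINUTES[agg] // 10)  # shorter bar = fresher index
    print(f"{setting:<10} {agg:<5} {bar:<6} <10m={lt10}%  <30m={lt30}%")

print(first_setting_at_best(RESULTS))  # 30 min
```

Aggregate freshness stops improving at the 30-minute setting, while the finer-grained <10m share continues to vary between the shorter settings.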

Summary

Decreasing the crawl interval to less than 30 minutes led to no discernible improvement in crawl freshness. Implementing continuous crawl also did not improve crawl freshness - an unexpected outcome. Given the results of testing, setting the crawl interval to 30 minutes accomplishes the best crawl freshness performance reasonably achievable for this particular SharePoint 2013 system.
