SharePoint 2013: Capacity Planning, Sizing and High Availability for Search (SPC172)

Highlights:

One Search Core:  On-Premises, Office 365 and Exchange 2013
One Installer, One Farm:  multi-tenant as well
Search is a core service:  Building block for ECM, WCM/Internet Business, Productivity Search and Social
Flexible deployment with robust fault tolerance
Major overhaul of the UI
Much easier to configure

Three dimensions of search scaling.  SharePoint 2013 allows independent scaling of:

  •    Content Volume
  •    Query Load
  •    Crawl Load

Query Processing Component (QPC):

CPU Load:  Driving Factors
   QPS
   Query transformations
   Note:  Guideline:  4 QPS per CPU core
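   Worked example (the 40 QPS target is an assumption, not from the session):  sustaining 40 QPS at 4 QPS per core needs roughly 40 / 4 = 10 CPU cores of query processing, before any headroom for heavy query transformations.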

Network Load:  Driving Factors
   Number of index partitions
   Size of queries and results
   Note:  Example:  20 index partitions @ 20 QPS => 200 Mbit/s inbound / 100 Mbit/s outbound

Index Component:

CPU Load:  Driving Factors
   QPS and Item count
   Note:  Guidelines per index component @ 2 GHz CPU:
       1M items:   5 QPS per CPU core
       5M items:   2 QPS per CPU core
       10M items:  1 QPS per CPU core
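   Worked example (assumed figures):  an index partition holding 10M items and serving 5 QPS needs about 5 / 1 = 5 CPU cores per index component, i.e. per replica of that partition.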

Index Disk IOPS recommendations:

Crawl Load: 
  Typical:  10-60 IOPS @ 32-512 KB writes

Query Load @ 10M items:
  During high crawl rate:  ~30x reads per query
  Without crawl: ~3x reads per query (caching)

Index Merge:
  Concurrent 150 MB/s read + 150 MB/s write

Crawl Component:

CPU Load:  Driving Factors
  Documents per second
  Link discovery
  Crawl management

Network Load:  Driving Factors
  Downloading items from content sources
  Passing items on to the content processing component (CPC)

Disk Load:  Driving Factors
  All documents are temporarily stored in the data folder

Content Processing Component (CPC):

CPU Load:  Driving Factors
  Documents per second
  Document size and complexity
  Feature extraction
  Estimate:  5-10 documents per second (DPS) per CPU core
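  Worked example (the 50 DPS target is an assumption):  a crawl rate of 50 documents per second needs roughly 50/10 = 5 to 50/5 = 10 CPU cores of content processing, depending on document size and complexity.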

Network Load:  Driving Factors
  Documents per second
  Document size

Analytics Processing Component (APC):

CPU Load:  Driving Factors
  Number of items
  Site activity

Network Load:  Driving Factors
  Same as for CPU load
  Plus:  Network traffic increases when distributing APC across multiple machines

Disk Load:  Driving Factors
  Local disk used for temporary storage
  Bulk load; the primary concern is load isolation

Search Administration Component:

Low CPU and network load
Load increases with more components in the search topology

Components:  Key takeaways:

Split bulk processing from query traffic:  

   Bulk:  Crawl, analytics, content processing
   Query traffic:  index and query processing

Two options for scaling:

   Scale up with more/faster hardware resources
   Scale out with more components across multiple machines (see the topology sketch at the end of this section)

Avoid sharing critical resources: 

   Index is disk intensive and crucial in all load scenarios.
   Consider shared load on network, disk and CPU:
      Within a VM
      Between VMs on the same physical host
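
The topology cmdlets are the scale-out mechanism.  Below is a minimal sketch of splitting bulk processing from query traffic across two servers.  The host names (BULKSRV, QUERYSRV), the index root directory and the single index partition are assumptions for illustration only, and the clone is shown just adding components (trimming the old components from the clone before activation is omitted):

  # Run in the SharePoint 2013 Management Shell on a farm server.
  $ssa   = Get-SPEnterpriseSearchServiceApplication
  $bulk  = Get-SPEnterpriseSearchServiceInstance -Identity "BULKSRV"
  $query = Get-SPEnterpriseSearchServiceInstance -Identity "QUERYSRV"
  Start-SPEnterpriseSearchServiceInstance -Identity $bulk
  Start-SPEnterpriseSearchServiceInstance -Identity $query

  # Clone the active topology, add components to the clone, then activate it.
  $active = Get-SPEnterpriseSearchTopology -SearchApplication $ssa -Active
  $clone  = New-SPEnterpriseSearchTopology -SearchApplication $ssa -Clone -SearchTopology $active

  # Bulk processing (admin, crawl, content processing, analytics) on one server.
  New-SPEnterpriseSearchAdminComponent               -SearchTopology $clone -SearchServiceInstance $bulk
  New-SPEnterpriseSearchCrawlComponent               -SearchTopology $clone -SearchServiceInstance $bulk
  New-SPEnterpriseSearchContentProcessingComponent   -SearchTopology $clone -SearchServiceInstance $bulk
  New-SPEnterpriseSearchAnalyticsProcessingComponent -SearchTopology $clone -SearchServiceInstance $bulk

  # Query traffic (index + query processing) on the other server; E:\SearchIndex is an assumed path.
  New-SPEnterpriseSearchIndexComponent           -SearchTopology $clone -SearchServiceInstance $query -IndexPartition 0 -RootDirectory "E:\SearchIndex"
  New-SPEnterpriseSearchQueryProcessingComponent -SearchTopology $clone -SearchServiceInstance $query

  Set-SPEnterpriseSearchTopology -Identity $clone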

Small Search Topology Notes:
Windows Server 2012 can host all search components in one VM
The same applies for physical deployment on Windows Server 2008 R2
Windows Server 2008 Hyper-V supports a maximum of 4 CPU cores per VM

High Availability for Search:

Content-Side High Availability:    Full redundancy in the content feeding chain
Query-Side High Availability:      Full redundancy of all query components
Disaster Recovery Options:         Hot, Warm or Cold.  Backup/restore is now a best practice.

Fault-Tolerance:

Indexing Fault-Tolerance:                  Journal Sync
Query Processing Fault-Tolerance:    Load balancing across query processing components: round robin, "lowest load" or "sticky load"
Admin Fault-Tolerance:                     Lease-based (a standby admin component takes over when the lease expires)
Database Fault-Tolerance:                 Database and index files must be in sync.  Supported:  synchronous mirroring. Not Supported:  asynchronous modes and log-shipping.
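
A quick way to see whether these mechanisms have healthy components to fail over to is Get-SPEnterpriseSearchStatus.  A minimal health-check sketch ("IndexComponent1" is an assumed component name):

  # Overall state (Active / Degraded / Failed) of every component in the active topology.
  $ssa = Get-SPEnterpriseSearchServiceApplication
  Get-SPEnterpriseSearchStatus -SearchApplication $ssa -Text

  # Drill into the health report of a single component.
  Get-SPEnterpriseSearchStatus -SearchApplication $ssa -HealthReport -Component "IndexComponent1" -Text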

Backup and Restore:

Index designed for robust backup/restore
Everything but the index is in the database
"Point in time" backup:  No query down time
Backup/restore can make disaster recovery easier
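
A hedged sketch of the farm backup cmdlets scoped to the SSA; the backup share and the item name "Search Service Application" are assumptions (Backup-SPFarm -ShowTree lists the exact item path in your farm):

  # Full backup of just the Search Service Application into a shared folder.
  Backup-SPFarm -Directory \\backupserver\spbackup -BackupMethod Full -Item "Search Service Application"

  # Restore the same item from the most recent backup in that folder.
  Restore-SPFarm -Directory \\backupserver\spbackup -RestoreMethod Overwrite -Item "Search Service Application"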

Restore notes:

Restore the whole farm from a backup
   Restore only the SSA:  the entire topology must be restored; it can also replace the existing topology

Only a single node failure?
   Add a new node to the farm
   Add the missing SSA components to that node via the topology cmdlets
   Remove the components of the dead node from the topology

NOTE:  Don't forget to recreate your Search Service Application Proxy!!!  Search will not work otherwise.
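
A minimal sketch of the single-node recovery steps above.  The server names (NEWSRV, DEADSRV), the assumption that the dead node ran a crawl and a content processing component, and the proxy name are all illustrative:

  $ssa    = Get-SPEnterpriseSearchServiceApplication
  $active = Get-SPEnterpriseSearchTopology -SearchApplication $ssa -Active
  $clone  = New-SPEnterpriseSearchTopology -SearchApplication $ssa -Clone -SearchTopology $active

  # 1. Start the search service instance on the replacement node.
  $new = Get-SPEnterpriseSearchServiceInstance -Identity "NEWSRV"
  Start-SPEnterpriseSearchServiceInstance -Identity $new

  # 2. Re-create the missing components on the new node.
  New-SPEnterpriseSearchCrawlComponent             -SearchTopology $clone -SearchServiceInstance $new
  New-SPEnterpriseSearchContentProcessingComponent -SearchTopology $clone -SearchServiceInstance $new

  # 3. Remove the dead node's components from the clone, then activate it.
  Get-SPEnterpriseSearchComponent -SearchTopology $clone |
      Where-Object { $_.ServerName -eq "DEADSRV" } |
      ForEach-Object { Remove-SPEnterpriseSearchComponent -Identity $_ -SearchTopology $clone -Confirm:$false }
  Set-SPEnterpriseSearchTopology -Identity $clone

  # Re-create the proxy only if it is missing after a restore (see the note above).
  New-SPEnterpriseSearchServiceApplicationProxy -Name "Search Service Application Proxy" -SearchApplication $ssa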

Disaster Recovery made easy:

Primary and DR should be as similar as possible:  Farm layout, hostnames, database version, directory locations
Test your recovery procedures:  don't wait for the failure!