SharePoint 2013: Capacity Planning, Sizing and High Availability for Search in SPC172
Highlights:
One Search Core: On-Premises, Office 365 and Exchange 2013
One Installer, One Farm: multi-tenant as well
Search is a core service: Building block for ECM, WCM/Internet Business, Productivity Search and Social
Flexible deployment with robust fault tolerance
Major overhaul of the UI
Much easier to configure
Three dimensions in search scaling. SharePoint 2013 allows independent scaling for:
- Content Volume
- Query Load
- Crawl Load
Query Processing Component (QPC):
CPU Load: Driving Factors
QPS
Query transformations
Guideline: 4 QPS per CPU core
Network Load: Driving Factors
Number of index partitions
Size of queries and results
Example: 20 index partitions @ 20 QPS => 200 Mbit/s inbound, 100 Mbit/s outbound
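As a rough illustration of how the QPC figures above could be turned into a sizing estimate, here is a minimal Python sketch. The function names are hypothetical, and the network estimate assumes traffic scales linearly with partitions times QPS from the single example given in the session, which the notes do not confirm.

```python
import math

QPS_PER_CORE = 4  # guideline from the notes: 4 QPS per CPU core

# Single reference point from the notes:
# 20 index partitions @ 20 QPS => 200 Mbit/s inbound, 100 Mbit/s outbound
REF_PARTITIONS, REF_QPS = 20, 20
REF_IN_MBIT, REF_OUT_MBIT = 200, 100


def qpc_cores(peak_qps):
    """CPU cores needed for query processing at the given peak QPS."""
    return math.ceil(peak_qps / QPS_PER_CORE)


def qpc_network_mbit(partitions, qps):
    """Estimated (inbound, outbound) Mbit/s.

    ASSUMPTION: load scales linearly with partitions * QPS.
    """
    scale = (partitions * qps) / (REF_PARTITIONS * REF_QPS)
    return REF_IN_MBIT * scale, REF_OUT_MBIT * scale


print(qpc_cores(20))             # 5
print(qpc_network_mbit(10, 20))  # (100.0, 50.0)
```

Query size and result size also drive network load, so treat the network number as a starting point for measurement only.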
Index Component:
CPU Load: Driving Factors
QPS and Item count
Guidelines per index component @ 2 GHz CPU:
1M items: 5 QPS per CPU core
5M items: 2 QPS per CPU core
10M items: 1 QPS per CPU core
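The index guideline above can be read as a small lookup table. The sketch below is one cautious way to apply it, assuming you round an item count up to the next listed tier instead of interpolating between tiers; the function names are made up for illustration.

```python
import math

# (max items per index component, QPS per CPU core @ 2 GHz), from the notes
GUIDELINE = [(1_000_000, 5), (5_000_000, 2), (10_000_000, 1)]


def index_qps_per_core(items):
    """Return the guideline QPS per core for the first tier that fits."""
    for max_items, qps in GUIDELINE:
        if items <= max_items:
            return qps
    # The notes give no figure beyond 10M items per index component.
    raise ValueError("no guideline above 10M items per index component")


def index_cores(items, peak_qps):
    """CPU cores needed by one index component at the given peak QPS."""
    return math.ceil(peak_qps / index_qps_per_core(items))


print(index_cores(4_000_000, 10))  # 5
```

Rounding up to the next tier is deliberately pessimistic: a 4M-item index component is sized as if it held 5M items.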
Index Disk IOPS recommendations:
Crawl Load:
Typical: 10-60 IOPS @32-512KB writes
Query Load @ 10M items:
During high crawl rate: ~30x reads per query
Without crawl: ~3x reads per query (caching)
Index Merge:
Concurrent 150MB/s read + 150MB/s write
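To see what the disk numbers above mean for a given query load, here is a small sketch. It assumes that "~30x" and "~3x" mean roughly 30 and 3 disk reads per query at 10M items, and that query reads and crawl writes simply add up; both are my reading of the slide, not something the session stated.

```python
# Approximate disk reads per query at 10M items, from the notes
READS_PER_QUERY = {"crawling": 30, "idle": 3}

# Typical crawl write load from the notes: 10-60 IOPS
CRAWL_WRITE_IOPS_MAX = 60


def query_read_iops(qps, crawling):
    """Read IOPS generated by queries alone."""
    key = "crawling" if crawling else "idle"
    return qps * READS_PER_QUERY[key]


def peak_iops_during_crawl(qps):
    """ASSUMPTION: query reads and crawl writes are additive."""
    return query_read_iops(qps, crawling=True) + CRAWL_WRITE_IOPS_MAX


print(query_read_iops(10, crawling=False))  # 30
print(query_read_iops(10, crawling=True))   # 300
print(peak_iops_during_crawl(10))           # 360
```

Index merges are not covered here: they are described in the notes as throughput (150 MB/s read plus 150 MB/s write), not IOPS.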
Crawl Component:
CPU Load: Driving Factors
Documents per second
Link discovery
Crawl management
Network Load: Driving Factors
Downloading items from content sources
Passing items on to CPC
Disk Load: Driving Factors
All documents are temporarily stored in data folder
Content Processing Component (CPC):
CPU Load: Driving Factors
Documents per second
Document size and complexity
Feature extraction
Estimate: 5-10 DPS per CPU core
Network Load: Driving Factors
Documents per second
Document size
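The 5-10 DPS per core estimate for content processing gives a range, not a single number. A minimal sketch of how it might be used, with hypothetical function names and the assumption that a full crawl is limited only by the content processing rate:

```python
import math

# Estimate from the notes: 5-10 documents per second (DPS) per CPU core
DPS_PER_CORE_LOW, DPS_PER_CORE_HIGH = 5, 10


def cpc_cores(target_dps):
    """Return (best case, worst case) core counts for a target DPS."""
    best = math.ceil(target_dps / DPS_PER_CORE_HIGH)
    worst = math.ceil(target_dps / DPS_PER_CORE_LOW)
    return best, worst


def full_crawl_hours(items, dps):
    """ASSUMPTION: crawl time is bounded only by the processing rate."""
    return items / dps / 3600


print(cpc_cores(50))                                # (5, 10)
print(round(full_crawl_hours(10_000_000, 50), 1))   # 55.6
```

Document size, complexity and feature extraction decide where in the 5-10 range a given corpus lands, so the worst-case figure is the safer planning number.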
Analytics Processing Component (APC):
CPU Load: Driving Factors
Number of items
Site activity
Network Load: Driving Factors
Same as for CPU load
Plus: Network traffic increases when distributing APC across multiple machines
Disk Load: Driving Factors
Local disk used for temporary storage
Bulk load, primary concern is load isolation
Search Administration Component:
Low CPU and network load
Load increases with more components in the search topology
Components: Key takeaways:
Split bulk processing from query traffic:
Bulk: Crawl, analytics, content processing
Query traffic: index and query processing
Two options for scaling:
Scale up with more/faster hardware resources
Scale out with more components across multiple machines
Avoid sharing critical resources:
Index is disk intensive and crucial in all load scenarios.
Consider shared load on network, disk and CPU:
Within a VM
Between VMs on the same physical host
Small Search Topology Notes:
Windows Server 2012 can host all search components in one VM
The same applies for physical deployment on Windows Server 2008 R2
Windows Server 2008 Hyper-V supports a maximum of 4 CPU cores per VM
High Availability for Search:
Content Side High Availability: Full redundancy in the content feeding chain
Query Side High Availability: Full redundancy of all query components
Disaster Recovery Options: Hot, Warm or Cold. Backup/Restore is now best practice.
Fault-Tolerance:
Indexing Fault-Tolerance: Journal Sync
Query Processing Fault-Tolerance: round robin load balancing: "lowest load" load balancing, "sticky load" load balancing
Admin Fault-Tolerance: Lease(expired?)
Database Fault-Tolerance: Database and index files must be in sync. Supported: synchronous mirroring. Not Supported: asynchronous modes and log-shipping.
Backup and Restore:
Index designed for robust backup/restore
Everything but the index is in the database
"Point in time" backup: No query down time
Backup/restore can make disaster recovery easier
Restore notes:
Restore the whole farm from a backup
Restore only the SSA: The entire topology must be restored. This can also replace an existing topology
Only a single node failure?:
Add a new node to the farm
Add the missing SSA components to that node via the topology cmdlets
Remove the components of the dead node from the topology
NOTE: Don't forget to recreate your Search Service Application Proxy! Search will not work otherwise.
Disaster Recovery made easy:
Primary and DR should be as similar as possible: Farm layout, hostnames, database version, directory locations
Test your recovery procedures: don't wait for the failure!