SharePoint 2013: Capacity Planning, Sizing and High Availability for Search (SPC172)

Highlights:

One Search Core:  On-Premises, Office 365 and Exchange 2013
One Installer, One Farm:  multi-tenant as well
Search is a core service:  Building block for ECM, WCM/Internet Business, Productivity Search and Social
Flexible deployment with robust fault tolerance
Major overhaul of the UI
Much easier to configure

Three dimensions of search scaling.  SharePoint 2013 allows independent scaling of:

  •    Content Volume
  •    Query Load
  •    Crawl Load

Query Processing Component (QPC):

CPU Load:  Driving Factors
   QPS
   Query transformations
   Note:  Guideline:  4 QPS per CPU core
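   Worked example (the 40 QPS target is an assumption, not from the session):  sustaining 40 QPS at 4 QPS per core needs roughly 40 / 4 = 10 CPU cores of query processing, before any headroom for heavy query transformations.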

Network Load:  Driving Factors
   Number of index partitions
   Size of queries and results
   Note:  Example:  20 index partitions @ 20 QPS => 200 Mbit/s inbound / 100 Mbit/s outbound

Index Component:

CPU Load:  Driving Factors
   QPS and Item count
   Note:  Guidelines per index component @ 2 GHz CPU:
       1M items:   5 QPS per CPU core
       5M items:   2 QPS per CPU core
       10M items:  1 QPS per CPU core
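   Worked example (assumed figures):  an index partition holding 10M items and serving 5 QPS needs about 5 / 1 = 5 CPU cores per index component, i.e. per replica of that partition.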

Index Disk IOPS recommendations:

Crawl Load: 
  Typical:  10-60 IOPS @ 32-512 KB writes

Query Load @ 10M items:
  During high crawl rate:  ~30x reads per query
  Without crawl: ~3x reads per query (caching)

Index Merge:
  Concurrent 150 MB/s read + 150 MB/s write

Crawl Component:

CPU Load:  Driving Factors
  Documents per second
  Link discovery
  Crawl management

Network Load:  Driving Factors
  Downloading items from content sources
  Passing items on to the content processing component (CPC)

Disk Load:  Driving Factors
  All documents are temporarily stored in the data folder

Content Processing Component (CPC):

CPU Load:  Driving Factors
  Documents per second
  Document size and complexity
  Feature extraction
  Estimate:  5-10 documents per second (DPS) per CPU core
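  Worked example (the 50 DPS target is an assumption):  a crawl rate of 50 documents per second needs roughly 50/10 = 5 to 50/5 = 10 CPU cores of content processing, depending on document size and complexity.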

Network Load:  Driving Factors
  Documents per second
  Document size

Analytics Processing Component (APC):

CPU Load:  Driving Factors
  Number of items
  Site activity

Network Load:  Driving Factors
  Same as for CPU load
  Plus:  Network traffic increases when distributing APC across multiple machines

Disk Load:  Driving Factors
  Local disk used for temporary storage
  Bulk load; the primary concern is load isolation

Search Administration Component:

Low CPU and network load
Load increases with more components in the search topology

Components:  Key takeaways:

Split bulk processing from query traffic:  

   Bulk:  Crawl, analytics, content processing
   Query traffic:  index and query processing

Two options for scaling:

   Scale up with more/faster hardware resources
   Scale out with more components across multiple machines (see the topology sketch at the end of this section)

Avoid sharing critical resources: 

   Index is disk intensive and crucial in all load scenarios.
   Consider shared load on network, disk and CPU:
      Within a VM
      Between VMs on the same physical host
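
The topology cmdlets are the scale-out mechanism.  Below is a minimal sketch of splitting bulk processing from query traffic across two servers.  The host names (BULKSRV, QUERYSRV), the index root directory and the single index partition are assumptions for illustration only, and the clone is shown just adding components (trimming the old components from the clone before activation is omitted):

  # Run in the SharePoint 2013 Management Shell on a farm server.
  $ssa   = Get-SPEnterpriseSearchServiceApplication
  $bulk  = Get-SPEnterpriseSearchServiceInstance -Identity "BULKSRV"
  $query = Get-SPEnterpriseSearchServiceInstance -Identity "QUERYSRV"
  Start-SPEnterpriseSearchServiceInstance -Identity $bulk
  Start-SPEnterpriseSearchServiceInstance -Identity $query

  # Clone the active topology, add components to the clone, then activate it.
  $active = Get-SPEnterpriseSearchTopology -SearchApplication $ssa -Active
  $clone  = New-SPEnterpriseSearchTopology -SearchApplication $ssa -Clone -SearchTopology $active

  # Bulk processing (admin, crawl, content processing, analytics) on one server.
  New-SPEnterpriseSearchAdminComponent               -SearchTopology $clone -SearchServiceInstance $bulk
  New-SPEnterpriseSearchCrawlComponent               -SearchTopology $clone -SearchServiceInstance $bulk
  New-SPEnterpriseSearchContentProcessingComponent   -SearchTopology $clone -SearchServiceInstance $bulk
  New-SPEnterpriseSearchAnalyticsProcessingComponent -SearchTopology $clone -SearchServiceInstance $bulk

  # Query traffic (index + query processing) on the other server; E:\SearchIndex is an assumed path.
  New-SPEnterpriseSearchIndexComponent           -SearchTopology $clone -SearchServiceInstance $query -IndexPartition 0 -RootDirectory "E:\SearchIndex"
  New-SPEnterpriseSearchQueryProcessingComponent -SearchTopology $clone -SearchServiceInstance $query

  Set-SPEnterpriseSearchTopology -Identity $clone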

Small Search Topology Notes:
Windows Server 2012 can host all search components in one VM
The same applies for physical deployment on Windows Server 2008 R2
Windows Server 2008 Hyper-V supports a maximum of 4 CPU cores per VM

High Availability for Search:

Content-Side High Availability:    Full redundancy in the content feeding chain
Query-Side High Availability:      Full redundancy of all query components
Disaster Recovery Options:         Hot, Warm or Cold.  Backup/restore is now a best practice.

Fault-Tolerance:

Indexing Fault-Tolerance:                  Journal Sync
Query Processing Fault-Tolerance:    Load balancing across query processing components: round robin, "lowest load" or "sticky load"
Admin Fault-Tolerance:                     Lease-based (a standby admin component takes over when the lease expires)
Database Fault-Tolerance:                 Database and index files must be in sync.  Supported:  synchronous mirroring. Not Supported:  asynchronous modes and log-shipping.
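
A quick way to see whether these mechanisms have healthy components to fail over to is Get-SPEnterpriseSearchStatus.  A minimal health-check sketch ("IndexComponent1" is an assumed component name):

  # Overall state (Active / Degraded / Failed) of every component in the active topology.
  $ssa = Get-SPEnterpriseSearchServiceApplication
  Get-SPEnterpriseSearchStatus -SearchApplication $ssa -Text

  # Drill into the health report of a single component.
  Get-SPEnterpriseSearchStatus -SearchApplication $ssa -HealthReport -Component "IndexComponent1" -Text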

Backup and Restore:

Index designed for robust backup/restore
Everything but the index is in the database
"Point in time" backup:  No query down time
Backup/restore can make disaster recovery easier
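
A hedged sketch of the farm backup cmdlets scoped to the SSA; the backup share and the item name "Search Service Application" are assumptions (Backup-SPFarm -ShowTree lists the exact item path in your farm):

  # Full backup of just the Search Service Application into a shared folder.
  Backup-SPFarm -Directory \\backupserver\spbackup -BackupMethod Full -Item "Search Service Application"

  # Restore the same item from the most recent backup in that folder.
  Restore-SPFarm -Directory \\backupserver\spbackup -RestoreMethod Overwrite -Item "Search Service Application"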

Restore notes:

Restore the whole farm from a backup
   Restore only the SSA:  the entire topology must be restored; it can also replace the existing topology

Only a single node failure?
   Add a new node to the farm
   Add the missing SSA components to that node via the topology cmdlets
   Remove the components of the dead node from the topology

NOTE:  Don't forget to recreate your Search Service Application Proxy!!!  Search will not work otherwise.
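
A minimal sketch of the single-node recovery steps above.  The server names (NEWSRV, DEADSRV), the assumption that the dead node ran a crawl and a content processing component, and the proxy name are all illustrative:

  $ssa    = Get-SPEnterpriseSearchServiceApplication
  $active = Get-SPEnterpriseSearchTopology -SearchApplication $ssa -Active
  $clone  = New-SPEnterpriseSearchTopology -SearchApplication $ssa -Clone -SearchTopology $active

  # 1. Start the search service instance on the replacement node.
  $new = Get-SPEnterpriseSearchServiceInstance -Identity "NEWSRV"
  Start-SPEnterpriseSearchServiceInstance -Identity $new

  # 2. Re-create the missing components on the new node.
  New-SPEnterpriseSearchCrawlComponent             -SearchTopology $clone -SearchServiceInstance $new
  New-SPEnterpriseSearchContentProcessingComponent -SearchTopology $clone -SearchServiceInstance $new

  # 3. Remove the dead node's components from the clone, then activate it.
  Get-SPEnterpriseSearchComponent -SearchTopology $clone |
      Where-Object { $_.ServerName -eq "DEADSRV" } |
      ForEach-Object { Remove-SPEnterpriseSearchComponent -Identity $_ -SearchTopology $clone -Confirm:$false }
  Set-SPEnterpriseSearchTopology -Identity $clone

  # Re-create the proxy only if it is missing after a restore (see the note above).
  New-SPEnterpriseSearchServiceApplicationProxy -Name "Search Service Application Proxy" -SearchApplication $ssa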

Disaster Recovery made easy:

Primary and DR should be as similar as possible:  Farm layout, hostnames, database version, directory locations
Test your recovery procedures:  don't wait for the failure!