Test results: Extra-large scenario (FAST Search Server 2010 for SharePoint)
Applies to: FAST Search Server 2010
With the extra-large Microsoft FAST Search Server 2010 for SharePoint test scenario, we targeted a content volume of up to 500 million items. At this scale, redundancy is crucial, because with this many computers, hardware failures are likely to occur. The content volume also means that crawls cannot be confined to off-peak hours; they have to run during business hours as well.
We set up the parent Microsoft SharePoint Server 2010 farm with two front-end web servers, four application servers, and two database servers, and arranged them as follows:
We used the SharePoint Server 2010 crawler, indexing connector framework, and the FAST Content Search Service Application (Content SSA) to crawl content. We distributed four crawl components for the Content SSA across the four application servers. This accommodates I/O limitations in the test setup (1 gigabit per second network), where four nodes provide a theoretical maximum crawl rate of 4 gigabits per second (see the rough check after this list).
One of the application servers also hosted Central Administration for the farm.
One database server hosted the crawl databases.
One database server hosted the FAST Search Server 2010 for SharePoint administration databases and the other SharePoint Server 2010 databases.
We did not use any separate data storage, because the application servers and front-end web servers only needed space for the operating system, application binaries, and log files.
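As a rough check on that network ceiling (our own back-of-the-envelope arithmetic, using the 233 kilobyte average item size from the Dataset section and ignoring protocol overhead):

$$ 4 \times 1\ \text{Gbps} = 4\ \text{Gbps} \approx 500\ \text{MB/s}, \qquad \frac{500\ \text{MB/s}}{233\ \text{KB/item}} \approx 2{,}100\ \text{items/s} $$

The crawl rates observed in testing (around 250 items per second; see Test results) were far below this ceiling, consistent with the content sources, rather than the network, being the bottleneck.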
In this article:
Test deployment
Test characteristics
Test results
Test deployment
Within the extra-large scenario, we tested one FAST Search Server 2010 for SharePoint deployment, with Service Pack 1 (SP1):
Name | Description |
---|---|
XL1 | Two rows, twelve columns setup, with one additional administration node (25 servers in total). The extended capacity mode can support up to 40 million items per server. |
Test characteristics
This section provides detailed information about the hardware, software, topology, and configuration of the test environment.
Hardware/Software
We tested the specified deployment by using the following hardware and software. Note that the administration node does not need as much disk space as the other FAST Search Server 2010 for SharePoint servers.
FAST Search Server 2010 for SharePoint servers
Windows Server 2008 R2 x64 Enterprise Edition
2x Intel L5640 CPUs with Hyper-threading and Turbo Boost switched on
48 GB memory
1 gigabit per second network card
Storage subsystem:
OS: 2x 146GB 10k RPM 2.5" SAS disks in RAID1
Application: 12x 1 terabyte 7200 RPM 6 gigabits per second 3.5" SAS disks in RAID10. Total formatted capacity of 5.5 terabytes (see the capacity check after this list).
Disk controller: Dell PERC H700 Integrated, firmware 12.10.0-0025
Disks: Seagate Constellation ES ST31000424SS, firmware KS68
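The 5.5 terabyte formatted capacity is consistent with the RAID10 layout (a rough sanity check of our own; RAID10 mirrors disk pairs, which halves raw capacity, and formatting adds further overhead):

$$ \frac{12 \times 1\ \text{TB}}{2} = 6\ \text{TB raw} \approx 5.5\ \text{TB formatted} $$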
SharePoint Server 2010 servers
Windows Server 2008 R2 x64 Enterprise Edition
2x Intel L5420 CPUs
16 GB memory
1 gigabit per second network card
Storage subsystem for OS/Programs: 2x 146GB 10k RPM SAS disks in RAID1
SQL servers
General SQL server: Same specification as for the SharePoint Server 2010 servers, but with an additional RAID array for SQL data: 6x 146GB 10k RPM SAS disks in RAID5.
Crawl database SQL server:
Windows Server 2008 R2 x64 Enterprise Edition
2x Intel X5670 CPUs with Hyper-threading and Turbo Boost switched on
48 GB memory
1 gigabit per second network card
Storage subsystem:
OS: 2x 146GB 10k RPM 2.5" SAS disks in RAID1
Application: 12x 600GB 15k RPM 3.5" SAS disks in RAID50. Total formatted capacity 5.6 terabytes.
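The 5.6 terabyte figure is plausible if the RAID50 set is built from two six-disk RAID5 spans (an assumption on our part; the span layout is not specified). Each span loses one disk's capacity to parity:

$$ 2 \times (6 - 1) \times 600\ \text{GB} = 6\ \text{TB raw} \approx 5.6\ \text{TB formatted} $$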
Topology
This section describes the topology of the test deployment.
XL1
XL1 is a setup with two rows and twelve columns, plus an additional administration node. The second row adds query throughput capacity and query redundancy, and gives better separation of query and feeding load. The deployment includes three query processing components: one on the administration node (fsadmin.contoso.com) and two in the second row (fsr1c00.contoso.com and fsr1c01.contoso.com). The Query SSA is configured to use only the query processing components on the search row during normal operation. The query component on the administration node may be reconfigured for use as a fallback. It can then serve queries in conjunction with the first search row if the second search row is taken down for maintenance.
The following figure shows the XL1 deployment.
We used the following deployment.xml file to set up XL1.
<?xml version="1.0" encoding="utf-8" ?>
<deployment version="14"
    modifiedBy="contoso\user" modifiedTime="2011-01-01T12:00:00+00:00" comment="XL1"
    xmlns="https://www.microsoft.com/enterprisesearch"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="https://www.microsoft.com/enterprisesearch deployment.xsd">
<instanceid>XL1</instanceid>
<connector-databaseconnectionstring><![CDATA[jdbc:sqlserver://sqlbox.contoso.com\sql:1433;DatabaseName=XL1.jdbc]]></connector-databaseconnectionstring>
<!-- Admin -->
<host name="fsadmin.contoso.com">
<admin />
<query />
<content-distributor />
<indexing-dispatcher />
<document-processor processes="8" />
</host>
<!-- Row 0 -->
<host name="fsr0c00.contoso.com">
<content-distributor />
<searchengine row="0" column="0" />
<document-processor processes="12" />
</host>
<host name="fsr0c01.contoso.com">
<content-distributor />
<searchengine row="0" column="1" />
<document-processor processes="12" />
</host>
<host name="fsr0c02.contoso.com">
<content-distributor />
<searchengine row="0" column="2" />
<document-processor processes="12" />
</host>
<host name="fsr0c03.contoso.com">
<searchengine row="0" column="3" />
<document-processor processes="12" />
<webanalyzer server="true" link-processing="true" lookup-db="true" max-targets="2" redundant-lookup="true" />
</host>
<host name="fsr0c04.contoso.com">
<searchengine row="0" column="4" />
<document-processor processes="12" />
<webanalyzer server="false" link-processing="true" lookup-db="true" max-targets="2"/>
</host>
<host name="fsr0c05.contoso.com">
<searchengine row="0" column="5" />
<document-processor processes="12" />
<webanalyzer server="false" link-processing="true" lookup-db="true" max-targets="2"/>
</host>
<host name="fsr0c06.contoso.com">
<searchengine row="0" column="6" />
<document-processor processes="12" />
<webanalyzer server="false" link-processing="true" lookup-db="true" max-targets="2"/>
</host>
<host name="fsr0c07.contoso.com">
<searchengine row="0" column="7" />
<document-processor processes="12" />
<webanalyzer server="false" link-processing="true" lookup-db="true" max-targets="2"/>
</host>
<host name="fsr0c08.contoso.com">
<searchengine row="0" column="8" />
<document-processor processes="12" />
<webanalyzer server="false" link-processing="true" lookup-db="true" max-targets="2"/>
</host>
<host name="fsr0c09.contoso.com">
<indexing-dispatcher />
<searchengine row="0" column="9" />
<document-processor processes="12" />
<webanalyzer server="false" link-processing="true" lookup-db="true" max-targets="2"/>
</host>
<host name="fsr0c10.contoso.com">
<indexing-dispatcher />
<searchengine row="0" column="10" />
<document-processor processes="12" />
<webanalyzer server="false" link-processing="true" lookup-db="true" max-targets="2"/>
</host>
<host name="fsr0c11.contoso.com">
<indexing-dispatcher />
<searchengine row="0" column="11" />
<document-processor processes="12" />
<webanalyzer server="false" link-processing="true" lookup-db="true" max-targets="2"/>
</host>
<!-- Row 1 -->
<host name="fsr1c00.contoso.com">
<query />
<searchengine row="1" column="0" />
<document-processor processes="8" />
</host>
<host name="fsr1c01contoso.com">
<query />
<searchengine row="1" column="1" />
<document-processor processes="8" />
</host>
<host name="fsr1c02.contoso.com">
<searchengine row="1" column="2" />
<document-processor processes="8" />
</host>
<host name="fsr1c03.contoso.com">
<searchengine row="1" column="3" />
<document-processor processes="8" />
</host>
<host name="fsr1c04.contoso.com">
<searchengine row="1" column="4" />
<document-processor processes="8" />
</host>
<host name="fsr1c05.contoso.com">
<searchengine row="1" column="5" />
<document-processor processes="8" />
</host>
<host name="fsr1c06.contoso.com">
<searchengine row="1" column="6" />
<document-processor processes="8" />
</host>
<host name="fsr1c07.contoso.com">
<searchengine row="1" column="7" />
<document-processor processes="8" />
</host>
<host name="fsr1c08.contoso.com">
<searchengine row="1" column="8" />
<document-processor processes="8" />
</host>
<host name="fsr1c09.contoso.com">
<searchengine row="1" column="9" />
<document-processor processes="8" />
</host>
<host name="fsr1c10.contoso.com">
<searchengine row="1" column="10" />
<document-processor processes="8" />
</host>
<host name="fsr1c011.contoso.com">
<searchengine row="1" column="11" />
<document-processor processes="8" />
</host>
<searchcluster>
<row id="0" index="primary" search="true" />
<row id="1" index="secondary" search="true" />
</searchcluster>
</deployment>
Dataset
This section describes the test farm dataset: The database content and sizes, search indexes, and external data sources. The Content SSA was configured to use 12 crawl databases.
The following table shows the overall metrics.
Object | Value |
---|---|
Search index size | 503 million items |
Size of crawl database | 2.8 terabytes |
Size of crawl database log file | 1.6 terabytes |
Size of property database | 0.1 terabytes |
Size of property database log file | 2 gigabytes (GB) |
Size of SSA administration database | 35 gigabytes (GB) |
The next table shows the content source types we used to build the index. The numbers reflect the total number of items per source and include replicated copies. The difference between the total number of items (518 million) and the index size (503 million) can have two causes:
Items may have been disabled from indexing in the content source, or
The document format type could not be indexed.
For SharePoint sources, the size of the respective content database in SQL represents the raw data size.
Content source | Items | Raw data size | Average size per item |
---|---|---|---|
File share 1 (12 copies) | 7.2 million | 924 gigabytes (GB) | 128 kilobytes (KB) |
File share 2 (12 copies) | 176 million | 40 terabytes | 229 kilobytes (KB) |
SharePoint 1 (12 copies) | 54 million | 24 terabytes | 443 kilobytes (KB) |
SharePoint 2 (12 copies) | 54 million | 24 terabytes | 443 kilobytes (KB) |
SharePoint 3 (12 copies) | 54 million | 24 terabytes | 443 kilobytes (KB) |
HTML 1 (12 copies) | 13 million | 105 gigabytes (GB) | 8.1 kilobytes (KB) |
HTML 2 (12 copies) | 38 million | 1.7 terabytes | 43 kilobytes (KB) |
HTML 3 (12 copies) | 122 million | 5.9 terabytes | 49 kilobytes (KB) |
Total | 518 million | 121 terabytes | 233 kilobytes (KB) |
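The Total row is internally consistent (our own arithmetic, using decimal units):

$$ \frac{121\ \text{TB}}{518 \times 10^6\ \text{items}} \approx 233\ \text{KB per item} $$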
To reach sufficient content volume in the testing of the extra-large scenario, we added replicas of the data sources. Each copy of each document appeared as a unique item in the index, but the copies were treated as duplicates by the duplicate trimming feature. From a query matching perspective, the load was similar to having all unique documents indexed, but any results from these sources triggered duplicate detection and collapsing in the search results.
Note
The extra-large test scenario did not include people search data.
Test results
This section describes how the deployment performed under load: Crawling and indexing performance, query performance, and disk usage.
Crawling and indexing performance
The extra-large scenario deployment was limited by the bandwidth of the content sources. Crawl rates averaged around 250 items per second, which corresponds to roughly 25 days of constant crawling for the full corpus. In testing, crawls were split up into blocks of around 43 million items, allowing for intermediate query testing at less than full capacity. Crawl pauses were also needed for maintenance, for example Windows Update and replacement of failed hardware.
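The quoted crawl duration follows from the crawl rate (a rough check of our own, using the 518 million total items from the Dataset section):

$$ \frac{518 \times 10^6\ \text{items}}{250\ \text{items/s}} \approx 2.1 \times 10^6\ \text{s} \approx 24\ \text{days} $$

which is in line with the roughly 25 days of constant crawling quoted above.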
Query performance
The graph below shows the query performance under various conditions. At full capacity of 500 million items, and with no ongoing indexing, the farm was able to sustain about 15 queries per second (QPS) with less than 1 second average latency before being limited by the CPU resources on the search row. During incremental crawls and full crawls, this number dropped to 12.5 and 10 QPS respectively. Additional testing was done before reaching full capacity. The red lines in the graph were measured when the system was half full. Without any feed, 27 QPS was possible at that point. Under feed, the difference between the 250 million and 500 million item states is smaller. This is expected, because incoming feeds require a certain amount of network, CPU, and disk capacity regardless of how many items are already in the index.
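Expressed as relative throughput loss (computed from the QPS figures above):

$$ \frac{15 - 12.5}{15} \approx 17\%\ \text{(incremental crawl)}, \qquad \frac{15 - 10}{15} \approx 33\%\ \text{(full crawl)} $$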
The next graph shows the query latency during a computer failure, where one of the columns in the search row was taken offline. Query load is around 10 QPS, with latency plotted as a function of time. The primary row automatically provides failover query matching for the failed column, but with a performance impact because it also has to serve ongoing indexing. During this degraded period, query latency averaged around 1.4 seconds, recovering to around 1 second halfway through the plot when the failed computer was brought back online.
Disk usage
The following table shows the combined disk usage on all nodes in the deployment.
Nodes | FiXML data size | Index data size | Other data* size | Total disk usage |
---|---|---|---|---|
Administration node | 0 | 0 | 33 gigabytes (GB) | 33 gigabytes (GB) |
Row 0 (across 12 computers) | 6.2 terabytes | 15.8 terabytes | 40 gigabytes (GB) | 22.0 terabytes |
Row 1 (across 12 computers) | 6.4 terabytes | 15.8 terabytes | 40 gigabytes (GB) | 22.2 terabytes |
Total | 12.6 terabytes | 31.6 terabytes | 113 gigabytes (GB) | 44.2 terabytes |
* Logs and intermediate data for handling fault recovery on computer failures
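To put the totals in perspective (our own arithmetic, based on the table above and the Dataset section):

$$ \frac{22\ \text{TB}}{12\ \text{columns}} \approx 1.8\ \text{TB per server}, \qquad \frac{44.2\ \text{TB}}{503 \times 10^6\ \text{items}} \approx 88\ \text{KB per item} $$

Each column server thus used about a third of its 5.5 terabyte formatted capacity, and the stored search data per item was well below the 233 KB average raw item size.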