Manage crawl rules (FAST Search database connector)
Applies to: FAST Search Server 2010
Specify which content should be indexed
The FAST Search database connector crawls content by following SQL select statements. These SQL select statements can be specified either in the JDBCSQL parameter or in a file that you point to by specifying the JDBCSQLFile parameter. Specify one of these parameters in the Input group of the connector configuration file. All selected columns will be indexed by default.
To manage which content should be indexed, select specific columns in the select statements. For example:
SELECT Product.rowguid ID, Product.name Name, Product.weight weight
FROM Product
Specify which content should be transformed
More advanced result sets may require more advanced result set handling. The following parameters from the configuration file can be used to indicate what content should be transformed, and how.
Filter category | Parameter | Short description |
---|---|---|
Transformation |
UseMultiRowNormalizer |
If set to true, values for columns of records with identical identifiers are merged. In order for this feature to work, the SELECT statement must have an ORDER BY clause on the primary key/identifier field. |
Transformation |
JDBCMultiValueIncludeFields |
Semicolon (;) separated list of column names that should be concatenated in the output. This field overrides any settings to JDBCMultiValueSkipFields.. Do not use this setting together with JDBCAttributeColumn and JDBCValueColumn |
Transformation |
JDBCMultiValueSkipFields |
Semicolon (;) separated list of column names that have the same values in each row, and that should not be repeated in the output. Do not use this setting together with JDBCMultiValueIncludeFields. |
Transformation |
JDBCAttributeColumn |
For normalized tables, specifies the column that contains attribute names. Do not use this setting together with JDBCMultiValueIncludeFields and JDBCMultiValueSkipFields. |
Transformation |
JDBCValueColumn |
For normalized tables, specifies the column that contains attribute values. Do not use this setting together with JDBCMultiValueIncludeFields and JDBCMultiValueSkipFields. |
How to use the multi row normalizer
When joining tables, item IDs may be duplicated over several rows. This leads to several rows containing the same identifier. Use the multi row normalizer to collapse these rows into one item.
When you set the parameter UseMultiRowNormalizer to true, text from multiple rows with the same JDBCPrimaryKeyField value are concatenated. If a column contains a non-value, like NULL for example, this sometimes causes the string “NULL” to appear in the result set. To prevent this, use the isnull function when you use SQL Server (or a similar function for a different database type) in the parameter JDBCSQL. For example: SELECT ID, intro, body, isnull( comments, ‘’ ) comments FROM cms.articles;
How to use the parameters JDBCAttributeColumn and JDBCValueColumn
Items extracted from a database may have the desired item attribute names in one column, and the column values in another column. Use these two parameters to specify which column contains the attribute names, and which column contains the attribute value.
How to connect to the source database
Configuring the FAST Search database connector to connect to the source database requires a JDBC connection string to be created for the source database. Different databases use different connection strings. Specify the connection string to the source database in the parameter JDBCURL.
Examples of JDBC connection strings for common databases:
Microsoft Microsoft SQL Server 2000:
jdbc:microsoft:sqlserver://<YourServerHere>:1433;DatabaseName=<YourDatabaseNameHere>
Where:
<YourServerHere> is the SQL Server host name
<YourDatabaseNameHere> is the name of the source database that you are connecting to
Microsoft Microsoft SQL Server 2005 and later versions
jdbc:sqlserver://<YourServerHere>:1433;database=<YourDatabaseNameHere>
Where:
<YourServerHere> is the SQL Server host name
<YourDatabaseNameHere> is the name of the source database that you are connecting to
IBM DB2
jdbc:db2://<YourServerHere>:50000/<YourDatabaseNameHere>
Where:
<YourServerHere> is the IBM DB2 server host name
<YourDatabaseNameHere> is the name of the source database that you are connecting to
Oracle
jdbc:oracle:thin:@<YourServerHere>:1521:<YourDataBaseNameHere>
Where:
<YourServerHere> is the Oracle server host name
<YourDatabaseNameHere> is the name of the source database that you are connecting to
Connecting to databases from providers other than Microsoft also requires the JDBC driver from the specific provider to be put in the /lib directory of your FAST Search Server 2010 for SharePoint installation folder. Refer to the database provider to find out how to obtain the JDBC driver.