Semantic Search (SQL Server)
Statistical Semantic Search provides deep insight into unstructured documents stored in SQL Server databases by extracting and indexing statistically relevant key phrases. It then uses these key phrases to identify and index documents that are similar or related.
You query these semantic indexes by using three Transact-SQL rowset functions (SEMANTICKEYPHRASETABLE, SEMANTICSIMILARITYTABLE, and SEMANTICSIMILARITYDETAILSTABLE), which return the results as structured data.
What Can I Do with Semantic Search?
Semantic search builds upon the existing full-text search feature in SQL Server, but enables new scenarios that extend beyond keyword searches. While full-text search lets you query the words in a document, semantic search lets you query the meaning of the document. Solutions that are now possible include automatic tag extraction, related content discovery, and hierarchical navigation across similar content. For example, you can query the index of key phrases to build the taxonomy for an organization, or for a corpus of documents. Or, you can query the document similarity index to identify resumes that match a job description.
The following examples demonstrate the capabilities of Semantic Search.
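The examples query a table named Documents that already has a semantic index. As a rough sketch of how such a table might be prepared, the script below creates a table and enables statistical semantic indexing on one column. The column types, the catalog name DocumentsCatalog, the constraint name PK_Documents, and the choice of language 1033 (English) are assumptions made for illustration; only the table and column names come from the examples. It also assumes that full-text search and the semantic language statistics database are installed.

```sql
-- Hypothetical schema: the column names match the examples in this topic,
-- but the data types are assumptions.
CREATE TABLE Documents
(
    DocumentID      int           NOT NULL,
    DocumentTitle   nvarchar(255) NOT NULL,
    DocumentContent nvarchar(max) NULL,
    CONSTRAINT PK_Documents PRIMARY KEY (DocumentID)
);

-- A full-text catalog must exist before a full-text index can be created.
CREATE FULLTEXT CATALOG DocumentsCatalog AS DEFAULT;

-- STATISTICAL_SEMANTICS adds key phrase and document similarity indexing
-- on top of the regular full-text index for the column.
CREATE FULLTEXT INDEX ON Documents
(
    DocumentContent LANGUAGE 1033 STATISTICAL_SEMANTICS
)
KEY INDEX PK_Documents
ON DocumentsCatalog;
```

If the documents are stored as binary files (for example, in a varbinary(max) column), the column definition would also need a TYPE COLUMN clause that names the column holding the file extension.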
Find the Key Phrases in a Document
The following query gets the key phrases that were identified in the sample document. It returns the results in descending order by score, which ranks the statistical significance of each key phrase. This query calls the SEMANTICKEYPHRASETABLE (Transact-SQL) function.
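-- Assumed types; adjust them to match the Documents table.
DECLARE @Title nvarchar(255), @DocID int;
SET @Title = N'Sample Document.docx';
-- Look up the key of the sample document.
SELECT @DocID = DocumentID
FROM Documents
WHERE DocumentTitle = @Title;
-- Return its key phrases, most significant first.
SELECT @Title AS Title, keyphrase, score
FROM SEMANTICKEYPHRASETABLE(Documents, *, @DocID)
ORDER BY score DESC;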
Find Similar or Related Documents
The following query gets the documents that were identified as similar or related to the sample document. It returns the results in descending order by score, which ranks the similarity of the two documents. This query calls the SEMANTICSIMILARITYTABLE (Transact-SQL) function.
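-- Assumed types; adjust them to match the Documents table.
DECLARE @Title nvarchar(255), @DocID int;
SET @Title = N'Sample Document.docx';
-- Look up the key of the sample document.
SELECT @DocID = DocumentID
FROM Documents
WHERE DocumentTitle = @Title;
-- Join the matches back to Documents to get their titles, most similar first.
SELECT @Title AS SourceTitle, DocumentTitle AS MatchedTitle,
    DocumentID, score
FROM SEMANTICSIMILARITYTABLE(Documents, *, @DocID)
INNER JOIN Documents ON DocumentID = matched_document_key
ORDER BY score DESC;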
Find the Key Phrases That Make Documents Similar or Related
The following query gets the key phrases that make the two sample documents similar or related to one another. It returns the results in descending order by score, which ranks the weight of each key phrase. This query calls the SEMANTICSIMILARITYDETAILSTABLE (Transact-SQL) function.
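-- Assumed types; adjust them to match the Documents table.
DECLARE @SourceTitle nvarchar(255), @MatchedTitle nvarchar(255);
DECLARE @SourceDocID int, @MatchedDocID int;
SET @SourceTitle = N'first.docx';
SET @MatchedTitle = N'second.docx';
-- Look up the keys of the two documents.
SELECT @SourceDocID = DocumentID FROM Documents WHERE DocumentTitle = @SourceTitle;
SELECT @MatchedDocID = DocumentID FROM Documents WHERE DocumentTitle = @MatchedTitle;
-- Return the key phrases that the two documents share, heaviest first.
SELECT @SourceTitle AS SourceTitle, @MatchedTitle AS MatchedTitle, keyphrase, score
FROM SEMANTICSIMILARITYDETAILSTABLE(Documents, DocumentContent,
    @SourceDocID, DocumentContent, @MatchedDocID)
ORDER BY score DESC;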
Storing Documents in SQL Server
Before you can index documents with Semantic Search, you have to store the documents in a SQL Server database.
The FileTable feature in SQL Server 2014 makes unstructured files and documents first-class citizens of the relational database. As a result, database developers can manipulate documents together with structured data in Transact-SQL set-based operations.
For more information about the FileTable feature, see FileTables (SQL Server). For information about the FILESTREAM feature, which is another option for storing documents in the database, see FILESTREAM (SQL Server).
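As an illustration of the FileTable option, the following sketch creates a FileTable and lists the files stored in it. The table and directory name DocumentStore are hypothetical. The sketch assumes that FILESTREAM is enabled on the instance and that the database already has a FILESTREAM filegroup and a directory configured for non-transacted access.

```sql
-- Hypothetical FileTable; a FileTable has a fixed schema, so no columns are listed.
CREATE TABLE DocumentStore AS FILETABLE
WITH
(
    FILETABLE_DIRECTORY = 'DocumentStore',
    FILETABLE_COLLATE_FILENAME = database_default
);

-- Files copied into the FileTable directory appear as rows.
-- name, file_type, cached_file_size, and is_directory are built-in FileTable columns.
SELECT name, file_type, cached_file_size
FROM DocumentStore
WHERE is_directory = 0;
```

Because each file is a row, the same table can be joined to structured data or given a full-text and semantic index like any other table.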
Related Tasks
Install and Configure Semantic Search
Describes the prerequisites for statistical semantic search and how to install or check them.
Enable Semantic Search on Tables and Columns
Describes how to enable or disable statistical semantic indexing on selected columns that contain documents or text.
Find Key Phrases in Documents with Semantic Search
Describes how to find the key phrases in documents or text columns that are configured for statistical semantic indexing.
Find Similar and Related Documents with Semantic Search
Describes how to find similar or related documents or text values, and information about how they are similar or related, in columns that are configured for statistical semantic indexing.
Manage and Monitor Semantic Search
Describes the process of semantic indexing and the tasks related to monitoring and managing the indexes.
Related Content
Semantic Search DDL, Functions, Stored Procedures, and Views
Lists the Transact-SQL statements and the SQL Server database objects added or changed to support statistical semantic search.