com.azure.search.documents

Azure AI Search, formerly known as "Azure Cognitive Search", provides secure information retrieval at scale over user-owned content in traditional and conversational search applications.

The Azure AI Search service provides:

  • A search engine for vector search, full text, and hybrid search over a search index.
  • Rich indexing with integrated data chunking and vectorization (preview), lexical analysis for text, and optional AI enrichment for content extraction and transformation.
  • Rich query syntax for vector queries, text search, hybrid queries, fuzzy search, autocomplete, geo-search and others.
  • Azure scale, security, and reach.
  • Azure integration at the data layer, machine learning layer, Azure AI services, and Azure OpenAI.

The Azure AI Search service is well suited for the following application scenarios:

  • Consolidate varied content types into a single searchable index. To populate an index, you can push JSON documents that contain your content, or if your data is already in Azure, create an indexer to pull in data automatically.
  • Attach skillsets to an indexer to create searchable content from images and large text documents. A skillset leverages AI from Cognitive Services for built-in OCR, entity recognition, key phrase extraction, language detection, text translation, and sentiment analysis. You can also add custom skills to integrate external processing of your content during data ingestion.
  • In a search client application, implement query logic and user experiences similar to commercial web search engines.

This is the Java client library for Azure AI Search. The Azure AI Search service is a search-as-a-service cloud solution that gives developers APIs and tools for adding a rich search experience over private, heterogeneous content in web, mobile, and enterprise applications.

The Azure Search Documents client library lets Java developers interact with the Azure AI Search service from their Java applications. It provides a set of APIs that abstract the low-level details of working with the service and let developers perform common operations such as:

  • Submit queries for simple and advanced query forms that include fuzzy search, wildcard search, and regular expressions.
  • Implement filtered queries for faceted navigation, geospatial search, or to narrow results based on filter criteria.
  • Create and manage search indexes.
  • Upload and update documents in the search index.
  • Create and manage indexers that pull data from Azure into an index.
  • Create and manage skillsets that add AI enrichment to data ingestion.
  • Create and manage analyzers for advanced text analysis or multi-lingual content.
  • Optimize results through scoring profiles to factor in business logic or freshness.

Getting Started

Prerequisites

The client library package requires the following:

To create a new Search service, you can use the Azure portal, Azure Powershell, or the Azure CLI.

Authenticate the client

To interact with the Search service, you'll need to create an instance of the appropriate client class: SearchClient for searching indexed documents, SearchIndexClient for managing indexes, or SearchIndexerClient for crawling data sources and loading search documents into an index. To instantiate a client object, you'll need an endpoint and an API key. Refer to the documentation for more information on the authentication approaches the Search service supports.

Get an API Key

You can get the endpoint and an API key from the Search service in the Azure portal. Refer to the documentation for instructions on how to get an API key.
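
Applications often read the endpoint and key from configuration rather than hard-coding them. The following is a minimal sketch of that idea using only the Java standard library. The environment variable names `AZURE_SEARCH_ENDPOINT` and `AZURE_SEARCH_API_KEY` and the `SearchSettings` class are assumptions made for this illustration and are not defined by the client library.

```java
import java.util.Map;

// Hypothetical helper for this sketch; not part of azure-search-documents.
class SearchSettings {
    // Assumed variable names; pick whatever fits your deployment.
    static final String ENDPOINT_VAR = "AZURE_SEARCH_ENDPOINT";
    static final String API_KEY_VAR = "AZURE_SEARCH_API_KEY";

    final String endpoint;
    final String apiKey;

    SearchSettings(String endpoint, String apiKey) {
        this.endpoint = endpoint;
        this.apiKey = apiKey;
    }

    // Reads both settings from the given map (pass System.getenv() in real code).
    static SearchSettings fromEnvironment(Map<String, String> env) {
        return new SearchSettings(require(env, ENDPOINT_VAR), require(env, API_KEY_VAR));
    }

    // Fails fast with a clear message when a setting is missing or blank.
    private static String require(Map<String, String> env, String name) {
        String value = env.get(name);
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalStateException("Missing required setting: " + name);
        }
        return value.trim();
    }

    public static void main(String[] args) {
        try {
            SearchSettings settings = fromEnvironment(System.getenv());
            System.out.println("Endpoint: " + settings.endpoint);
        } catch (IllegalStateException ex) {
            System.out.println(ex.getMessage());
        }
    }
}
```

The resulting `endpoint` and `apiKey` values can then be supplied wherever the samples on this page use the `{endpoint}` and `{key}` placeholders.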

The SDK provides three clients.

  • SearchIndexClient for CRUD operations on indexes and synonym maps.
  • SearchIndexerClient for CRUD operations on indexers, data sources, and skillsets.
  • SearchClient for all document operations.

Create a SearchIndexClient

To create a SearchIndexClient, you need the Azure AI Search service URL endpoint and an admin key. The following sample creates a SearchIndexClient using the endpoint and an Azure Key Credential (API key).

SearchIndexClient searchIndexClient = new SearchIndexClientBuilder()
     .endpoint("{endpoint}")
     .credential(new AzureKeyCredential("{key}"))
     .buildClient();

Create a SearchIndexerClient

To create a SearchIndexerClient, you need the Azure AI Search service URL endpoint and an admin key. The following sample creates a SearchIndexerClient using the endpoint and an Azure Key Credential (API key).

SearchIndexerClient searchIndexerClient = new SearchIndexerClientBuilder()
     .endpoint("{endpoint}")
     .credential(new AzureKeyCredential("{key}"))
     .buildClient();

Create a SearchClient

To create a SearchClient, you need the Azure AI Search service URL endpoint, an admin key, and an index name. The following sample creates a SearchClient.

SearchClient searchClient = new SearchClientBuilder()
     .endpoint("{endpoint}")
     .credential(new AzureKeyCredential("{key}"))
     .indexName("{indexName}")
     .buildClient();

Key Concepts

An Azure AI Search service contains one or more indexes that provide persistent storage of searchable data in the form of JSON documents. (If you're new to search, you can make a very rough analogy between indexes and database tables.) The azure-search-documents client library exposes operations on these resources through three main client types.

SearchClient helps with:

SearchIndexClient allows you to:

  • Create, delete, update, or configure a search index
  • Declare custom synonym maps to expand or rewrite queries
  • Most of the SearchServiceClient functionality is not yet available in our current preview

SearchIndexerClient allows you to:

  • Start indexers to automatically crawl data sources
  • Define AI-powered skillsets to transform and enrich your data

Azure AI Search provides two powerful features: Semantic Search and Vector Search.

Semantic search enhances the quality of search results for text-based queries. By enabling Semantic Search on your search service, you can improve the relevance of search results in two ways:

  • It applies secondary ranking to the initial result set, promoting the most semantically relevant results to the top.
  • It extracts and returns captions and answers in the response, which can be displayed on a search page to enhance the user's search experience.

To learn more about Semantic Search, you can refer to the documentation.

Vector Search is an information retrieval technique that overcomes the limitations of traditional keyword-based search. Instead of relying solely on lexical analysis and matching individual query terms, Vector Search utilizes machine learning models to capture the contextual meaning of words and phrases. It represents documents and queries as vectors in a high-dimensional space called an embedding. By understanding the intent behind the query, Vector Search can deliver more relevant results that align with the user's requirements, even if the exact terms are not present in the document. Moreover, Vector Search can be applied to various types of content, including images and videos, not just text.
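
To make the idea of comparing a query and documents as vectors concrete, the following sketch computes cosine similarity between small embedding vectors using only the Java standard library. It is an illustration under stated assumptions: the three-dimensional vectors are invented toy values, the `CosineSimilarityDemo` class is hypothetical, and the service itself chooses its similarity metric and search algorithm from the index configuration rather than from code like this.

```java
// Hypothetical illustration; not part of azure-search-documents.
class CosineSimilarityDemo {
    // Cosine similarity = dot(a, b) / (|a| * |b|).
    // Values near 1.0 mean the vectors point in nearly the same direction.
    static double cosineSimilarity(double[] a, double[] b) {
        if (a.length != b.length || a.length == 0) {
            throw new IllegalArgumentException("Vectors must be non-empty and the same length");
        }
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            throw new IllegalArgumentException("Cosine similarity is undefined for a zero vector");
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        // Toy embeddings; real embeddings come from a model and have many more dimensions.
        double[] query = {0.9, 0.1, 0.0};
        double[] luxuryHotel = {0.8, 0.2, 0.1};
        double[] budgetHostel = {0.1, 0.2, 0.9};

        System.out.printf("query vs luxuryHotel:  %.3f%n", cosineSimilarity(query, luxuryHotel));
        System.out.printf("query vs budgetHostel: %.3f%n", cosineSimilarity(query, budgetHostel));
    }
}
```

A document whose vector scores higher against the query vector would rank earlier, even when it shares no exact terms with the query text.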

To learn how to index vector fields and perform vector search, you can refer to the sample. This sample provides detailed guidance on indexing vector fields and demonstrates how to perform vector search.

Additionally, for more comprehensive information about Vector Search, including its concepts and usage, you can refer to the documentation. The documentation provides in-depth explanations and guidance on leveraging the power of Vector Search in Azure AI Search.

Examples

The following examples all use a sample Hotel data set that you can import into your own index from the Azure portal. These are just a few of the basics - please check out our Samples for much more.

Querying

There are two ways to interact with the data returned from a search query.

Use SearchDocument like a dictionary for search results

SearchDocument is the default type returned from queries when you don't provide your own. The following sample performs the search, iterates over the results, and extracts data using SearchDocument's map-style get method.

for (SearchResult result : searchClient.search("luxury")) {
     SearchDocument document = result.getDocument(SearchDocument.class);
     System.out.printf("Hotel ID: %s%n", document.get("hotelId"));
     System.out.printf("Hotel Name: %s%n", document.get("hotelName"));
 }

Use Java model class for search results

Define a `Hotel` class.

public static class Hotel {
     private String hotelId;
     private String hotelName;

     @SimpleField(isKey = true)
     public String getHotelId() {
         return this.hotelId;
     }

     public String getHotelName() {
         return this.hotelName;
     }

     public Hotel setHotelId(String hotelId) {
         this.hotelId = hotelId;
         return this;
     }

     public Hotel setHotelName(String hotelName) {
         this.hotelName = hotelName;
         return this;
     }
 }

Use it in place of SearchDocument when querying.

for (SearchResult result : searchClient.search("luxury")) {
     Hotel hotel = result.getDocument(Hotel.class);
     System.out.printf("Hotel ID: %s%n", hotel.getHotelId());
     System.out.printf("Hotel Name: %s%n", hotel.getHotelName());
 }

Search Options

SearchOptions provides fine-grained control over the behavior of queries.

The following sample uses SearchOptions to search for the top 5 luxury hotels with a good rating (4 or above).

SearchOptions options = new SearchOptions()
     .setFilter("rating ge 4")
     .setOrderBy("rating desc")
     .setTop(5);
 SearchPagedIterable searchResultsIterable = searchClient.search("luxury", options, Context.NONE);
 searchResultsIterable.forEach(result -> {
     System.out.printf("Hotel ID: %s%n", result.getDocument(Hotel.class).getHotelId());
     System.out.printf("Hotel Name: %s%n", result.getDocument(Hotel.class).getHotelName());
 });
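
The string passed to setFilter is an OData expression, so string values embedded in it need quoting. In OData, a string literal is wrapped in single quotes and any single quote inside it is doubled. The following is a minimal standard-library sketch of that rule. The `ODataFilterSketch` class and its methods are hypothetical names for this illustration, and the `address/city` and `rating` fields are assumed to exist in the hotels index; the library's own SearchFilter class, listed under Classes, is described as handling this quoting and escaping for you.

```java
// Hypothetical illustration; not part of azure-search-documents.
class ODataFilterSketch {
    // Wraps a value in single quotes and doubles any embedded single quote,
    // which is how OData string literals are escaped.
    static String quote(String value) {
        return "'" + value.replace("'", "''") + "'";
    }

    // Builds a filter such as: address/city eq 'Seattle' and rating ge 4
    static String hotelFilter(String city, int minRating) {
        return "address/city eq " + quote(city) + " and rating ge " + minRating;
    }

    public static void main(String[] args) {
        // A city name containing an apostrophe must not break the expression.
        System.out.println(hotelFilter("Coeur d'Alene", 4));
    }
}
```

Building filters this way keeps user-supplied text from changing the structure of the filter expression.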

Creating an index

You can use the SearchIndexClient to create a search index. Indexes can also define suggesters, lexical analyzers, and more.

There are multiple ways of preparing search fields for a search index. For basic needs, there is a static helper method buildSearchFields in SearchIndexClient and SearchIndexAsyncClient. Three annotations, SimpleField, SearchableField and FieldBuilderIgnore, configure the fields of the model class.

// Create a new search index structure that matches the properties of the Hotel class.
 List<SearchField> searchFields = SearchIndexClient.buildSearchFields(Hotel.class, null);
 searchIndexClient.createIndex(new SearchIndex("hotels", searchFields));

For advanced scenarios, you can build search fields using SearchField directly. The following sample shows how to build search fields with SearchField.

// Create a new search index structure that matches the properties of the Hotel class.
 List<SearchField> searchFieldList = new ArrayList<>();
 searchFieldList.add(new SearchField("hotelId", SearchFieldDataType.STRING)
         .setKey(true)
         .setFilterable(true)
         .setSortable(true));

 searchFieldList.add(new SearchField("hotelName", SearchFieldDataType.STRING)
         .setSearchable(true)
         .setFilterable(true)
         .setSortable(true));
 searchFieldList.add(new SearchField("description", SearchFieldDataType.STRING)
     .setSearchable(true)
     .setAnalyzerName(LexicalAnalyzerName.EU_LUCENE));
 searchFieldList.add(new SearchField("tags", SearchFieldDataType.collection(SearchFieldDataType.STRING))
     .setSearchable(true)
     .setFilterable(true)
     .setFacetable(true));
 searchFieldList.add(new SearchField("address", SearchFieldDataType.COMPLEX)
     .setFields(new SearchField("streetAddress", SearchFieldDataType.STRING).setSearchable(true),
         new SearchField("city", SearchFieldDataType.STRING)
             .setSearchable(true)
             .setFilterable(true)
             .setFacetable(true)
             .setSortable(true),
         new SearchField("stateProvince", SearchFieldDataType.STRING)
             .setSearchable(true)
             .setFilterable(true)
             .setFacetable(true)
             .setSortable(true),
         new SearchField("country", SearchFieldDataType.STRING)
             .setSearchable(true)
             .setFilterable(true)
             .setFacetable(true)
             .setSortable(true),
         new SearchField("postalCode", SearchFieldDataType.STRING)
             .setSearchable(true)
             .setFilterable(true)
             .setFacetable(true)
             .setSortable(true)
     ));

 // Prepare suggester.
 SearchSuggester suggester = new SearchSuggester("sg", Collections.singletonList("hotelName"));
 // Prepare SearchIndex with index name and search fields.
 SearchIndex index = new SearchIndex("hotels").setFields(searchFieldList).setSuggesters(suggester);
 // Create an index
 searchIndexClient.createIndex(index);

Retrieving a specific document from your index

In addition to querying for documents using keywords and optional filters, you can retrieve a specific document from your index if you already know the key.

The following example retrieves a document using the document's key.

Hotel hotel = searchClient.getDocument("1", Hotel.class);
 System.out.printf("Hotel ID: %s%n", hotel.getHotelId());
 System.out.printf("Hotel Name: %s%n", hotel.getHotelName());

Adding documents to your index

You can Upload, Merge, MergeOrUpload, and Delete multiple documents from an index in a single batched request. There are a few special rules for merging to be aware of.

The following sample uses a single batch request to upload one document and merge another.

IndexDocumentsBatch<Hotel> batch = new IndexDocumentsBatch<>();
 batch.addUploadActions(Collections.singletonList(
         new Hotel().setHotelId("783").setHotelName("Upload Inn")));
 batch.addMergeActions(Collections.singletonList(
         new Hotel().setHotelId("12").setHotelName("Renovated Ranch")));
 searchClient.indexDocuments(batch);
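
When there are more documents than fit comfortably in one request, a common approach is to split them into fixed-size groups and send one batch per group. The following sketch shows only the splitting step, using the Java standard library. The `BatchPartitioner` class is a hypothetical name, and the batch size of 1000 used in `main` is an assumption for illustration; check the service limits documentation for the actual per-request limits. The SearchIndexingBufferedSender class listed under Classes is described as a more convenient option for indexing documents.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; not part of azure-search-documents.
class BatchPartitioner {
    // Splits items into consecutive groups of at most batchSize elements.
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            // Copy the slice so each batch is independent of the source list.
            batches.add(new ArrayList<>(items.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<String> hotelIds = new ArrayList<>();
        for (int i = 1; i <= 2500; i++) {
            hotelIds.add(String.valueOf(i));
        }
        // Assumed batch size for this sketch.
        for (List<String> batch : partition(hotelIds, 1000)) {
            System.out.println("Would index a batch of " + batch.size() + " documents");
        }
    }
}
```

Each group could then be added to its own IndexDocumentsBatch and sent with indexDocuments as in the sample above.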

Async APIs

The examples so far have been using synchronous APIs. For asynchronous support and examples, please see our asynchronous clients:

  • SearchIndexAsyncClient
  • SearchIndexerAsyncClient
  • SearchAsyncClient

Authenticate in a National Cloud

To authenticate in a National Cloud, you will need to make the following additions to your client configuration:

  • Set `AuthorityHost` in the credential options or via the `AZURE_AUTHORITY_HOST` environment variable
  • Set the `audience` in SearchClientBuilder, SearchIndexClientBuilder, SearchIndexerClientBuilder

SearchClient searchClient = new SearchClientBuilder()
     .endpoint("{endpoint}")
     .credential(new DefaultAzureCredentialBuilder()
         .authorityHost("{national cloud endpoint}")
         .build())
     .audience(SearchAudience.AZURE_PUBLIC_CLOUD) // Replace with the audience for your cloud
     .buildClient();

Troubleshooting

See our troubleshooting guide for details on how to diagnose various failure scenarios.

General

When you interact with Azure AI Search using this Java client library, errors returned by the service correspond to the same HTTP status codes returned for REST API requests. For example, the service will return a 404 error if you try to retrieve a document that doesn't exist in your index.

Handling Search Error Response

Any Search API operation that fails will throw an HttpResponseException that carries the HTTP status code. Many of these errors are recoverable.

try {
     Iterable<SearchResult> results = searchClient.search("hotel");
     results.forEach(result -> {
         System.out.println(result.getDocument(Hotel.class).getHotelName());
     });
 } catch (HttpResponseException ex) {
     // The exception contains the HTTP status code and the detailed message
     // returned from the search service
     HttpResponse response = ex.getResponse();
     System.out.println("Status Code: " + response.getStatusCode());
     System.out.println("Message: " + ex.getMessage());
 }
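
Because many of these errors are recoverable, callers sometimes retry an operation after a short delay. The following is a minimal standard-library sketch of retrying with exponential backoff. Everything in it is an assumption for illustration: `RetrySketch` and `StatusException` are hypothetical stand-ins (a real caller would inspect HttpResponseException instead), and treating 429 and 503 as the retryable status codes is a choice made for this sketch. The client's HTTP pipeline may already retry some failures on its own, so a wrapper like this is not necessarily needed.

```java
import java.util.function.Supplier;

// Hypothetical illustration; not part of azure-search-documents.
class RetrySketch {
    // Stand-in for an exception that carries an HTTP status code.
    static class StatusException extends RuntimeException {
        final int statusCode;

        StatusException(int statusCode) {
            super("HTTP " + statusCode);
            this.statusCode = statusCode;
        }
    }

    // Assumption for this sketch: throttling and temporary unavailability are retryable.
    static boolean isRetryable(int statusCode) {
        return statusCode == 429 || statusCode == 503;
    }

    // Runs the operation, retrying retryable failures up to maxAttempts times in total.
    // The delay doubles after each failed attempt: base, 2 * base, 4 * base, ...
    static <T> T withRetry(Supplier<T> operation, int maxAttempts, long baseDelayMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        for (int attempt = 1; ; attempt++) {
            try {
                return operation.get();
            } catch (StatusException ex) {
                if (attempt >= maxAttempts || !isRetryable(ex.statusCode)) {
                    throw ex; // Out of attempts, or not worth retrying.
                }
                try {
                    Thread.sleep(baseDelayMillis << (attempt - 1));
                } catch (InterruptedException interrupted) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("Interrupted while waiting to retry", interrupted);
                }
            }
        }
    }

    public static void main(String[] args) {
        int[] calls = {0};
        String result = withRetry(() -> {
            calls[0]++;
            if (calls[0] < 3) {
                throw new StatusException(503); // Simulated transient failure.
            }
            return "search completed";
        }, 5, 10L);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

Errors such as 404 for a missing document are not retried here, since repeating the same request would not change the outcome.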

Classes

SearchAsyncClient

This class provides an asynchronous client that contains the operations for querying an index and uploading, merging, or deleting documents in an Azure AI Search service.

SearchClient

This class provides a synchronous client that contains the operations for querying an index and uploading, merging, or deleting documents in an Azure AI Search service.

SearchClientBuilder

This class provides a fluent builder API to aid the configuration and instantiation of SearchClient and SearchAsyncClient.

SearchClientBuilder.SearchIndexingBufferedSenderBuilder<T>

This class provides a fluent builder API to aid the configuration and instantiation of SearchIndexingBufferedSender<T> and SearchIndexingBufferedAsyncSender<T>.

SearchDocument

Represents an untyped document returned from a search or document lookup.

SearchFilter

This class is used to help construct valid OData filter expressions by automatically replacing, quoting, and escaping string parameters.

SearchIndexingBufferedAsyncSender<T>

This class provides an asynchronous buffered sender that contains operations for conveniently indexing documents to an Azure AI Search index.

SearchIndexingBufferedSender<T>

This class provides a synchronous buffered sender that contains operations for conveniently indexing documents to an Azure AI Search index.

Enums

SearchServiceVersion

The versions of Azure AI Search supported by this client library.