IndexingParametersConfiguration Class
- java.lang.Object
  - com.azure.search.documents.indexes.models.IndexingParametersConfiguration
Implements
JsonSerializable<IndexingParametersConfiguration>
public final class IndexingParametersConfiguration
implements JsonSerializable<IndexingParametersConfiguration>
A dictionary of indexer-specific configuration properties. Each name is the name of a specific property. Each value must be of a primitive type.
Constructor Summary
Constructor | Description |
---|---|
IndexingParametersConfiguration() | Creates an instance of IndexingParametersConfiguration class. |
Method Summary
Modifier and Type | Method and Description |
---|---|
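static IndexingParametersConfiguration | fromJson(JsonReader jsonReader) Reads an instance of IndexingParametersConfiguration from the JsonReader. |
Map<String,Object> | getAdditionalProperties() Get the additionalProperties property. |
BlobIndexerDataToExtract | getDataToExtract() Get the dataToExtract property. |
String | getDelimitedTextDelimiter() Get the delimitedTextDelimiter property. |
String | getDelimitedTextHeaders() Get the delimitedTextHeaders property. |
String | getDocumentRoot() Get the documentRoot property. |
String | getExcludedFileNameExtensions() Get the excludedFileNameExtensions property. |
IndexerExecutionEnvironment | getExecutionEnvironment() Get the executionEnvironment property. |
BlobIndexerImageAction | getImageAction() Get the imageAction property. |
String | getIndexedFileNameExtensions() Get the indexedFileNameExtensions property. |
BlobIndexerParsingMode | getParsingMode() Get the parsingMode property. |
BlobIndexerPdfTextRotationAlgorithm | getPdfTextRotationAlgorithm() Get the pdfTextRotationAlgorithm property. |
String | getQueryTimeout() Get the queryTimeout property. |
Boolean | isAllowSkillsetToReadFileData() Get the allowSkillsetToReadFileData property. |
Boolean | isFailOnUnprocessableDocument() Get the failOnUnprocessableDocument property. |
Boolean | isFailOnUnsupportedContentType() Get the failOnUnsupportedContentType property. |
Boolean | isFirstLineContainsHeaders() Get the firstLineContainsHeaders property. |
Boolean | isIndexStorageMetadataOnlyForOversizedDocuments() Get the indexStorageMetadataOnlyForOversizedDocuments property. |
IndexingParametersConfiguration | setAdditionalProperties(Map<String,Object> additionalProperties) Set the additionalProperties property. |
IndexingParametersConfiguration | setAllowSkillsetToReadFileData(Boolean allowSkillsetToReadFileData) Set the allowSkillsetToReadFileData property. |
IndexingParametersConfiguration | setDataToExtract(BlobIndexerDataToExtract dataToExtract) Set the dataToExtract property. |
IndexingParametersConfiguration | setDelimitedTextDelimiter(String delimitedTextDelimiter) Set the delimitedTextDelimiter property. |
IndexingParametersConfiguration | setDelimitedTextHeaders(String delimitedTextHeaders) Set the delimitedTextHeaders property. |
IndexingParametersConfiguration | setDocumentRoot(String documentRoot) Set the documentRoot property. |
IndexingParametersConfiguration | setExcludedFileNameExtensions(String excludedFileNameExtensions) Set the excludedFileNameExtensions property. |
IndexingParametersConfiguration | setExecutionEnvironment(IndexerExecutionEnvironment executionEnvironment) Set the executionEnvironment property. |
IndexingParametersConfiguration | setFailOnUnprocessableDocument(Boolean failOnUnprocessableDocument) Set the failOnUnprocessableDocument property. |
IndexingParametersConfiguration | setFailOnUnsupportedContentType(Boolean failOnUnsupportedContentType) Set the failOnUnsupportedContentType property. |
IndexingParametersConfiguration | setFirstLineContainsHeaders(Boolean firstLineContainsHeaders) Set the firstLineContainsHeaders property. |
IndexingParametersConfiguration | setImageAction(BlobIndexerImageAction imageAction) Set the imageAction property. |
IndexingParametersConfiguration | setIndexStorageMetadataOnlyForOversizedDocuments(Boolean indexStorageMetadataOnlyForOversizedDocuments) Set the indexStorageMetadataOnlyForOversizedDocuments property. |
IndexingParametersConfiguration | setIndexedFileNameExtensions(String indexedFileNameExtensions) Set the indexedFileNameExtensions property. |
IndexingParametersConfiguration | setParsingMode(BlobIndexerParsingMode parsingMode) Set the parsingMode property. |
IndexingParametersConfiguration | setPdfTextRotationAlgorithm(BlobIndexerPdfTextRotationAlgorithm pdfTextRotationAlgorithm) Set the pdfTextRotationAlgorithm property. |
IndexingParametersConfiguration | setQueryTimeout(String queryTimeout) Set the queryTimeout property. |
JsonWriter | toJson(JsonWriter jsonWriter) |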
Methods inherited from java.lang.Object
Constructor Details
IndexingParametersConfiguration
public IndexingParametersConfiguration()
Creates an instance of IndexingParametersConfiguration class.
Method Details
fromJson
public static IndexingParametersConfiguration fromJson(JsonReader jsonReader)
Reads an instance of IndexingParametersConfiguration from the JsonReader.
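Parameters:
jsonReader - The JsonReader being read.
Returns:
An instance of IndexingParametersConfiguration if the JsonReader was pointing to an instance of it, or null if it was pointing to JSON null.
Throws:
IOException - If an error occurs while reading the IndexingParametersConfiguration.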
getAdditionalProperties
public Map<String,Object> getAdditionalProperties()
Get the additionalProperties property: A dictionary of indexer-specific configuration properties. Each name is the name of a specific property. Each value must be of a primitive type.
Returns:
the additionalProperties value.
getDataToExtract
public BlobIndexerDataToExtract getDataToExtract()
Get the dataToExtract property: Specifies the data to extract from Azure blob storage and tells the indexer which data to extract from image content when "imageAction" is set to a value other than "none". This applies to embedded image content in a .PDF or other application, or image files such as .jpg and .png, in Azure blobs.
Returns:
the dataToExtract value.
getDelimitedTextDelimiter
public String getDelimitedTextDelimiter()
Get the delimitedTextDelimiter property: For CSV blobs, specifies the end-of-line single-character delimiter for CSV files where each line starts a new document (for example, "|").
Returns:
the delimitedTextDelimiter value.
getDelimitedTextHeaders
public String getDelimitedTextHeaders()
Get the delimitedTextHeaders property: For CSV blobs, specifies a comma-delimited list of column headers, useful for mapping source fields to destination fields in an index.
Returns:
the delimitedTextHeaders value.
getDocumentRoot
public String getDocumentRoot()
Get the documentRoot property: For JSON arrays, given a structured or semi-structured document, you can specify a path to the array using this property.
Returns:
the documentRoot value.
getExcludedFileNameExtensions
public String getExcludedFileNameExtensions()
Get the excludedFileNameExtensions property: Comma-delimited list of filename extensions to ignore when processing from Azure blob storage. For example, you could exclude ".png, .mp4" to skip over those files during indexing.
Returns:
the excludedFileNameExtensions value.
getExecutionEnvironment
public IndexerExecutionEnvironment getExecutionEnvironment()
Get the executionEnvironment property: Specifies the environment in which the indexer should execute.
Returns:
the executionEnvironment value.
getImageAction
public BlobIndexerImageAction getImageAction()
Get the imageAction property: Determines how to process embedded images and image files in Azure blob storage. Setting the "imageAction" configuration to any value other than "none" requires that a skillset also be attached to that indexer.
Returns:
the imageAction value.
getIndexedFileNameExtensions
public String getIndexedFileNameExtensions()
Get the indexedFileNameExtensions property: Comma-delimited list of filename extensions to select when processing from Azure blob storage. For example, you could focus indexing on specific application files ".docx, .pptx, .msg" to specifically include those file types.
Returns:
the indexedFileNameExtensions value.
getParsingMode
public BlobIndexerParsingMode getParsingMode()
Get the parsingMode property: Represents the parsing mode for indexing from an Azure blob data source.
Returns:
the parsingMode value.
getPdfTextRotationAlgorithm
public BlobIndexerPdfTextRotationAlgorithm getPdfTextRotationAlgorithm()
Get the pdfTextRotationAlgorithm property: Determines the algorithm for text extraction from PDF files in Azure blob storage.
Returns:
the pdfTextRotationAlgorithm value.
getQueryTimeout
public String getQueryTimeout()
Get the queryTimeout property: Increases the timeout beyond the 5-minute default for Azure SQL database data sources, specified in the format "hh:mm:ss".
Returns:
the queryTimeout value.
isAllowSkillsetToReadFileData
public Boolean isAllowSkillsetToReadFileData()
Get the allowSkillsetToReadFileData property: If true, will create a path /document/file_data that is an object representing the original file data downloaded from your blob data source. This allows you to pass the original file data to a custom skill for processing within the enrichment pipeline, or to the Document Extraction skill.
Returns:
the allowSkillsetToReadFileData value.
isFailOnUnprocessableDocument
public Boolean isFailOnUnprocessableDocument()
Get the failOnUnprocessableDocument property: For Azure blobs, set to false if you want to continue indexing if a document fails indexing.
Returns:
the failOnUnprocessableDocument value.
isFailOnUnsupportedContentType
public Boolean isFailOnUnsupportedContentType()
Get the failOnUnsupportedContentType property: For Azure blobs, set to false if you want to continue indexing when an unsupported content type is encountered, and you don't know all the content types (file extensions) in advance.
Returns:
the failOnUnsupportedContentType value.
isFirstLineContainsHeaders
public Boolean isFirstLineContainsHeaders()
Get the firstLineContainsHeaders property: For CSV blobs, indicates that the first (non-blank) line of each blob contains headers.
Returns:
the firstLineContainsHeaders value.
isIndexStorageMetadataOnlyForOversizedDocuments
public Boolean isIndexStorageMetadataOnlyForOversizedDocuments()
Get the indexStorageMetadataOnlyForOversizedDocuments property: For Azure blobs, set this property to true to still index storage metadata for blob content that is too large to process. Oversized blobs are treated as errors by default. For limits on blob size, see https://learn.microsoft.com/azure/search/search-limits-quotas-capacity.
Returns:
the indexStorageMetadataOnlyForOversizedDocuments value.
setAdditionalProperties
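public IndexingParametersConfiguration setAdditionalProperties(Map<String,Object> additionalProperties)
Set the additionalProperties property: A dictionary of indexer-specific configuration properties. Each name is the name of a specific property. Each value must be of a primitive type.
Parameters:
additionalProperties - the additionalProperties value to set.
Returns:
the IndexingParametersConfiguration object itself.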
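The requirement that each value be of a primitive type can be checked before the map is passed to setAdditionalProperties. The following is a minimal sketch, not part of the SDK: the class PrimitiveProperties, its methods, and the property names are hypothetical, and it assumes that "primitive" means a String, Number, or Boolean value.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of the Azure SDK.
final class PrimitiveProperties {

    // Assumption: "primitive" is read here as String, Number, or Boolean.
    static boolean isPrimitiveValue(Object value) {
        return value instanceof String || value instanceof Number || value instanceof Boolean;
    }

    // Returns an ordered copy of the map, or throws if any value
    // (including null) is not primitive by the rule above.
    static Map<String, Object> requirePrimitiveValues(Map<String, Object> properties) {
        Map<String, Object> checked = new LinkedHashMap<>();
        for (Map.Entry<String, Object> entry : properties.entrySet()) {
            if (!isPrimitiveValue(entry.getValue())) {
                throw new IllegalArgumentException(
                    "Property '" + entry.getKey() + "' does not have a primitive value");
            }
            checked.put(entry.getKey(), entry.getValue());
        }
        return checked;
    }

    public static void main(String[] args) {
        Map<String, Object> properties = new LinkedHashMap<>();
        properties.put("customSetting", "value"); // hypothetical property name
        properties.put("batchHint", 50);          // hypothetical property name
        System.out.println(requirePrimitiveValues(properties));
    }
}
```

The checked map could then be passed to setAdditionalProperties; nested maps or lists are rejected early instead of surfacing as a service-side error.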
setAllowSkillsetToReadFileData
public IndexingParametersConfiguration setAllowSkillsetToReadFileData(Boolean allowSkillsetToReadFileData)
Set the allowSkillsetToReadFileData property: If true, will create a path /document/file_data that is an object representing the original file data downloaded from your blob data source. This allows you to pass the original file data to a custom skill for processing within the enrichment pipeline, or to the Document Extraction skill.
Parameters:
allowSkillsetToReadFileData - the allowSkillsetToReadFileData value to set.
Returns:
the IndexingParametersConfiguration object itself.
setDataToExtract
public IndexingParametersConfiguration setDataToExtract(BlobIndexerDataToExtract dataToExtract)
Set the dataToExtract property: Specifies the data to extract from Azure blob storage and tells the indexer which data to extract from image content when "imageAction" is set to a value other than "none". This applies to embedded image content in a .PDF or other application, or image files such as .jpg and .png, in Azure blobs.
Parameters:
dataToExtract - the dataToExtract value to set.
Returns:
the IndexingParametersConfiguration object itself.
setDelimitedTextDelimiter
public IndexingParametersConfiguration setDelimitedTextDelimiter(String delimitedTextDelimiter)
Set the delimitedTextDelimiter property: For CSV blobs, specifies the end-of-line single-character delimiter for CSV files where each line starts a new document (for example, "|").
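Parameters:
delimitedTextDelimiter - the delimitedTextDelimiter value to set.
Returns:
the IndexingParametersConfiguration object itself.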
setDelimitedTextHeaders
public IndexingParametersConfiguration setDelimitedTextHeaders(String delimitedTextHeaders)
Set the delimitedTextHeaders property: For CSV blobs, specifies a comma-delimited list of column headers, useful for mapping source fields to destination fields in an index.
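Parameters:
delimitedTextHeaders - the delimitedTextHeaders value to set.
Returns:
the IndexingParametersConfiguration object itself.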
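Both delimited-text settings are plain strings with a format constraint: the delimiter is a single character and the headers are a comma-delimited list. The sketch below shows one way to build such values before calling setDelimitedTextDelimiter and setDelimitedTextHeaders. The class DelimitedTextSettings, its methods, and the column names are hypothetical and not part of the SDK.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the Azure SDK.
final class DelimitedTextSettings {

    // Rejects anything that is not exactly one character (one Unicode code point).
    static String requireSingleCharacter(String delimiter) {
        if (delimiter == null || delimiter.codePointCount(0, delimiter.length()) != 1) {
            throw new IllegalArgumentException("Delimiter must be exactly one character");
        }
        return delimiter;
    }

    // Joins column names into a comma-delimited list.
    // A name that is empty or contains a comma would corrupt the list, so it is rejected.
    static String joinHeaders(List<String> headers) {
        List<String> cleaned = new ArrayList<>();
        for (String header : headers) {
            String trimmed = header.trim();
            if (trimmed.isEmpty() || trimmed.contains(",")) {
                throw new IllegalArgumentException("Invalid header name: '" + header + "'");
            }
            cleaned.add(trimmed);
        }
        return String.join(",", cleaned);
    }

    public static void main(String[] args) {
        // Hypothetical column names for a pipe-delimited file.
        System.out.println(requireSingleCharacter("|"));
        System.out.println(joinHeaders(Arrays.asList("id", "name", "price")));
    }
}
```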
setDocumentRoot
public IndexingParametersConfiguration setDocumentRoot(String documentRoot)
Set the documentRoot property: For JSON arrays, given a structured or semi-structured document, you can specify a path to the array using this property.
Parameters:
documentRoot - the documentRoot value to set.
Returns:
the IndexingParametersConfiguration object itself.
setExcludedFileNameExtensions
public IndexingParametersConfiguration setExcludedFileNameExtensions(String excludedFileNameExtensions)
Set the excludedFileNameExtensions property: Comma-delimited list of filename extensions to ignore when processing from Azure blob storage. For example, you could exclude ".png, .mp4" to skip over those files during indexing.
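Parameters:
excludedFileNameExtensions - the excludedFileNameExtensions value to set.
Returns:
the IndexingParametersConfiguration object itself.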
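The extension list is a single comma-delimited string, so a collection of extensions has to be flattened before it is passed to setExcludedFileNameExtensions (or to setIndexedFileNameExtensions, which uses the same format). A minimal sketch follows; the class FileNameExtensions and its method are hypothetical, and the ", " separator simply mirrors the ".png, .mp4" example in the description above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the Azure SDK.
final class FileNameExtensions {

    // Builds a comma-delimited extension list such as ".png, .mp4".
    // Blank entries are skipped and a leading dot is added when missing.
    static String toCommaDelimited(List<String> extensions) {
        List<String> normalized = new ArrayList<>();
        for (String extension : extensions) {
            String trimmed = extension.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            normalized.add(trimmed.startsWith(".") ? trimmed : "." + trimmed);
        }
        return String.join(", ", normalized);
    }

    public static void main(String[] args) {
        // prints ".png, .mp4"
        System.out.println(toCommaDelimited(Arrays.asList("png", ".mp4")));
    }
}
```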
setExecutionEnvironment
public IndexingParametersConfiguration setExecutionEnvironment(IndexerExecutionEnvironment executionEnvironment)
Set the executionEnvironment property: Specifies the environment in which the indexer should execute.
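Parameters:
executionEnvironment - the executionEnvironment value to set.
Returns:
the IndexingParametersConfiguration object itself.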
setFailOnUnprocessableDocument
public IndexingParametersConfiguration setFailOnUnprocessableDocument(Boolean failOnUnprocessableDocument)
Set the failOnUnprocessableDocument property: For Azure blobs, set to false if you want to continue indexing if a document fails indexing.
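Parameters:
failOnUnprocessableDocument - the failOnUnprocessableDocument value to set.
Returns:
the IndexingParametersConfiguration object itself.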
setFailOnUnsupportedContentType
public IndexingParametersConfiguration setFailOnUnsupportedContentType(Boolean failOnUnsupportedContentType)
Set the failOnUnsupportedContentType property: For Azure blobs, set to false if you want to continue indexing when an unsupported content type is encountered, and you don't know all the content types (file extensions) in advance.
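Parameters:
failOnUnsupportedContentType - the failOnUnsupportedContentType value to set.
Returns:
the IndexingParametersConfiguration object itself.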
setFirstLineContainsHeaders
public IndexingParametersConfiguration setFirstLineContainsHeaders(Boolean firstLineContainsHeaders)
Set the firstLineContainsHeaders property: For CSV blobs, indicates that the first (non-blank) line of each blob contains headers.
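Parameters:
firstLineContainsHeaders - the firstLineContainsHeaders value to set.
Returns:
the IndexingParametersConfiguration object itself.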
setImageAction
public IndexingParametersConfiguration setImageAction(BlobIndexerImageAction imageAction)
Set the imageAction property: Determines how to process embedded images and image files in Azure blob storage. Setting the "imageAction" configuration to any value other than "none" requires that a skillset also be attached to that indexer.
Parameters:
imageAction - the imageAction value to set.
Returns:
the IndexingParametersConfiguration object itself.
setIndexStorageMetadataOnlyForOversizedDocuments
public IndexingParametersConfiguration setIndexStorageMetadataOnlyForOversizedDocuments(Boolean indexStorageMetadataOnlyForOversizedDocuments)
Set the indexStorageMetadataOnlyForOversizedDocuments property: For Azure blobs, set this property to true to still index storage metadata for blob content that is too large to process. Oversized blobs are treated as errors by default. For limits on blob size, see https://learn.microsoft.com/azure/search/search-limits-quotas-capacity.
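Parameters:
indexStorageMetadataOnlyForOversizedDocuments - the indexStorageMetadataOnlyForOversizedDocuments value to set.
Returns:
the IndexingParametersConfiguration object itself.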
setIndexedFileNameExtensions
public IndexingParametersConfiguration setIndexedFileNameExtensions(String indexedFileNameExtensions)
Set the indexedFileNameExtensions property: Comma-delimited list of filename extensions to select when processing from Azure blob storage. For example, you could focus indexing on specific application files ".docx, .pptx, .msg" to specifically include those file types.
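Parameters:
indexedFileNameExtensions - the indexedFileNameExtensions value to set.
Returns:
the IndexingParametersConfiguration object itself.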
setParsingMode
public IndexingParametersConfiguration setParsingMode(BlobIndexerParsingMode parsingMode)
Set the parsingMode property: Represents the parsing mode for indexing from an Azure blob data source.
Parameters:
parsingMode - the parsingMode value to set.
Returns:
the IndexingParametersConfiguration object itself.
setPdfTextRotationAlgorithm
public IndexingParametersConfiguration setPdfTextRotationAlgorithm(BlobIndexerPdfTextRotationAlgorithm pdfTextRotationAlgorithm)
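Set the pdfTextRotationAlgorithm property: Determines the algorithm for text extraction from PDF files in Azure blob storage.
Parameters:
pdfTextRotationAlgorithm - the pdfTextRotationAlgorithm value to set.
Returns:
the IndexingParametersConfiguration object itself.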
setQueryTimeout
public IndexingParametersConfiguration setQueryTimeout(String queryTimeout)
Set the queryTimeout property: Increases the timeout beyond the 5-minute default for Azure SQL database data sources, specified in the format "hh:mm:ss".
Parameters:
queryTimeout - the queryTimeout value to set.
Returns:
the IndexingParametersConfiguration object itself.
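Because the timeout is passed as a string in "hh:mm:ss" format rather than as a duration type, it can help to derive the string from a java.time.Duration. The following is a minimal sketch under that assumption; the class QueryTimeoutFormat and its method are hypothetical and not part of the SDK.

```java
import java.time.Duration;
import java.util.Locale;

// Hypothetical helper, not part of the Azure SDK.
final class QueryTimeoutFormat {

    // Formats a non-negative duration as "hh:mm:ss", dropping any sub-second part.
    static String toHhMmSs(Duration timeout) {
        if (timeout.isNegative()) {
            throw new IllegalArgumentException("Timeout must not be negative");
        }
        long totalSeconds = timeout.getSeconds();
        long hours = totalSeconds / 3600;
        long minutes = (totalSeconds % 3600) / 60;
        long seconds = totalSeconds % 60;
        // Locale.ROOT keeps the digits ASCII regardless of the default locale.
        return String.format(Locale.ROOT, "%02d:%02d:%02d", hours, minutes, seconds);
    }

    public static void main(String[] args) {
        // prints "00:10:00", a timeout above the 5-minute default
        System.out.println(toHhMmSs(Duration.ofMinutes(10)));
    }
}
```

The resulting string could then be passed to setQueryTimeout.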
toJson
Applies to
Azure SDK for Java