SpeechRecognizer Class

public final class SpeechRecognizer
extends Recognizer

Performs speech recognition on audio from a microphone, file, or other input stream, and returns the transcribed text as the result. Note: close() must be called to release the underlying resources held by the object.

Field Summary

Modifier and Type Field and Description
final EventHandlerImpl<SpeechRecognitionCanceledEventArgs> canceled

The event canceled signals that the recognition was canceled.

final EventHandlerImpl<SpeechRecognitionEventArgs> recognized

The event recognized signals that a final recognition result is received.

final EventHandlerImpl<SpeechRecognitionEventArgs> recognizing

The event recognizing signals that an intermediate recognition result is received.

Constructor Summary

Constructor Description
SpeechRecognizer(EmbeddedSpeechConfig embeddedSpeechConfig)

Initializes a new instance of Speech Recognizer for embedded speech recognition.

SpeechRecognizer(EmbeddedSpeechConfig embeddedSpeechConfig, AutoDetectSourceLanguageConfig autoDetectSourceLangConfig)

Initializes a new instance of Speech Recognizer for embedded speech recognition.

SpeechRecognizer(EmbeddedSpeechConfig embeddedSpeechConfig, AutoDetectSourceLanguageConfig autoDetectSourceLangConfig, AudioConfig audioConfig)

Initializes a new instance of Speech Recognizer for embedded speech recognition.

SpeechRecognizer(EmbeddedSpeechConfig embeddedSpeechConfig, AudioConfig audioConfig)

Initializes a new instance of Speech Recognizer for embedded speech recognition.

SpeechRecognizer(HybridSpeechConfig hybridSpeechConfig)

Initializes a new instance of Speech Recognizer for hybrid speech recognition.

SpeechRecognizer(HybridSpeechConfig hybridSpeechConfig, AutoDetectSourceLanguageConfig autoDetectSourceLangConfig)

Initializes a new instance of Speech Recognizer for hybrid speech recognition.

SpeechRecognizer(HybridSpeechConfig hybridSpeechConfig, AutoDetectSourceLanguageConfig autoDetectSourceLangConfig, AudioConfig audioConfig)

Initializes a new instance of Speech Recognizer for hybrid speech recognition.

SpeechRecognizer(HybridSpeechConfig hybridSpeechConfig, AudioConfig audioConfig)

Initializes a new instance of Speech Recognizer for hybrid speech recognition.

SpeechRecognizer(SpeechConfig speechConfig)

Initializes a new instance of Speech Recognizer.

SpeechRecognizer(SpeechConfig speechConfig, AutoDetectSourceLanguageConfig autoDetectSourceLangConfig)

Initializes a new instance of Speech Recognizer.

SpeechRecognizer(SpeechConfig speechConfig, AutoDetectSourceLanguageConfig autoDetectSourceLangConfig, AudioConfig audioConfig)

Initializes a new instance of Speech Recognizer.

SpeechRecognizer(SpeechConfig speechConfig, SourceLanguageConfig sourceLanguageConfig)

Initializes a new instance of Speech Recognizer.

SpeechRecognizer(SpeechConfig speechConfig, SourceLanguageConfig sourceLanguageConfig, AudioConfig audioConfig)

Initializes a new instance of Speech Recognizer.

SpeechRecognizer(SpeechConfig speechConfig, AudioConfig audioConfig)

Initializes a new instance of Speech Recognizer.

SpeechRecognizer(SpeechConfig speechConfig, String sourceLanguage)

Initializes a new instance of Speech Recognizer.

SpeechRecognizer(SpeechConfig speechConfig, String sourceLanguage, AudioConfig audioConfig)

Initializes a new instance of Speech Recognizer.

Method Summary

Modifier and Type Method and Description
protected void dispose(boolean disposing)

This method performs cleanup of resources.

java.lang.String getAuthorizationToken()

Gets the authorization token used to communicate with the service.

java.lang.String getEndpointId()

Gets the endpoint ID of a customized speech model that is used for speech recognition.

OutputFormat getOutputFormat()

Gets the output format of recognition.

PropertyCollection getProperties()

The collection of properties and their values defined for this SpeechRecognizer.

java.lang.String getSpeechRecognitionLanguage()

Gets the spoken language of recognition.

java.util.concurrent.Future<SpeechRecognitionResult> recognizeOnceAsync()

Starts speech recognition, and returns after a single utterance is recognized.

void setAuthorizationToken(String token)

Sets the authorization token used to communicate with the service.

java.util.concurrent.Future<java.lang.Void> startContinuousRecognitionAsync()

Starts speech recognition on a continuous audio stream, until stopContinuousRecognitionAsync() is called.

java.util.concurrent.Future<java.lang.Void> startKeywordRecognitionAsync(KeywordRecognitionModel model)

Configures the recognizer with the given keyword model.

java.util.concurrent.Future<java.lang.Void> stopContinuousRecognitionAsync()

Stops a running recognition operation as soon as possible and immediately requests a result based on the input that has been processed so far.

java.util.concurrent.Future<java.lang.Void> stopKeywordRecognitionAsync()

Ends the keyword initiated recognition.

Methods inherited from Recognizer

Methods inherited from java.lang.Object

java.lang.Object.clone java.lang.Object.equals java.lang.Object.finalize java.lang.Object.getClass java.lang.Object.hashCode java.lang.Object.notify java.lang.Object.notifyAll java.lang.Object.toString java.lang.Object.wait java.lang.Object.wait java.lang.Object.wait

Field Details

canceled

public final EventHandlerImpl<SpeechRecognitionCanceledEventArgs> canceled

The event canceled signals that the recognition was canceled.

recognized

public final EventHandlerImpl<SpeechRecognitionEventArgs> recognized

The event recognized signals that a final recognition result is received.

recognizing

public final EventHandlerImpl<SpeechRecognitionEventArgs> recognizing

The event recognizing signals that an intermediate recognition result is received.

Constructor Details

SpeechRecognizer

public SpeechRecognizer(EmbeddedSpeechConfig embeddedSpeechConfig)

Initializes a new instance of Speech Recognizer for embedded speech recognition. Added in version 1.19.0

Parameters:

embeddedSpeechConfig - embedded speech configuration.

SpeechRecognizer

public SpeechRecognizer(EmbeddedSpeechConfig embeddedSpeechConfig, AutoDetectSourceLanguageConfig autoDetectSourceLangConfig)

Initializes a new instance of Speech Recognizer for embedded speech recognition. Added in version 1.20.0

Parameters:

embeddedSpeechConfig - embedded speech configuration.
autoDetectSourceLangConfig - configuration for auto detecting the source language.

SpeechRecognizer

public SpeechRecognizer(EmbeddedSpeechConfig embeddedSpeechConfig, AutoDetectSourceLanguageConfig autoDetectSourceLangConfig, AudioConfig audioConfig)

Initializes a new instance of Speech Recognizer for embedded speech recognition. Added in version 1.20.0

Parameters:

embeddedSpeechConfig - embedded speech configuration.
autoDetectSourceLangConfig - configuration for auto detecting the source language.
audioConfig - audio configuration.

SpeechRecognizer

public SpeechRecognizer(EmbeddedSpeechConfig embeddedSpeechConfig, AudioConfig audioConfig)

Initializes a new instance of Speech Recognizer for embedded speech recognition. Added in version 1.19.0

Parameters:

embeddedSpeechConfig - embedded speech configuration.
audioConfig - audio configuration.

SpeechRecognizer

public SpeechRecognizer(HybridSpeechConfig hybridSpeechConfig)

Initializes a new instance of Speech Recognizer for hybrid speech recognition.

Parameters:

hybridSpeechConfig - hybrid speech configuration.

SpeechRecognizer

public SpeechRecognizer(HybridSpeechConfig hybridSpeechConfig, AutoDetectSourceLanguageConfig autoDetectSourceLangConfig)

Initializes a new instance of Speech Recognizer for hybrid speech recognition.

Parameters:

hybridSpeechConfig - hybrid speech configuration.
autoDetectSourceLangConfig - the configuration for auto detecting source language

SpeechRecognizer

public SpeechRecognizer(HybridSpeechConfig hybridSpeechConfig, AutoDetectSourceLanguageConfig autoDetectSourceLangConfig, AudioConfig audioConfig)

Initializes a new instance of Speech Recognizer for hybrid speech recognition.

Parameters:

hybridSpeechConfig - hybrid speech configuration.
autoDetectSourceLangConfig - the configuration for auto detecting source language
audioConfig - audio configuration.

SpeechRecognizer

public SpeechRecognizer(HybridSpeechConfig hybridSpeechConfig, AudioConfig audioConfig)

Initializes a new instance of Speech Recognizer for hybrid speech recognition.

Parameters:

hybridSpeechConfig - hybrid speech configuration.
audioConfig - audio configuration.

SpeechRecognizer

public SpeechRecognizer(SpeechConfig speechConfig)

Initializes a new instance of Speech Recognizer.

Parameters:

speechConfig - speech configuration.

SpeechRecognizer

public SpeechRecognizer(SpeechConfig speechConfig, AutoDetectSourceLanguageConfig autoDetectSourceLangConfig)

Initializes a new instance of Speech Recognizer.

Parameters:

speechConfig - speech configuration.
autoDetectSourceLangConfig - the configuration for auto detecting source language

SpeechRecognizer

public SpeechRecognizer(SpeechConfig speechConfig, AutoDetectSourceLanguageConfig autoDetectSourceLangConfig, AudioConfig audioConfig)

Initializes a new instance of Speech Recognizer.

Parameters:

speechConfig - speech configuration.
autoDetectSourceLangConfig - the configuration for auto detecting source language
audioConfig - audio configuration.

SpeechRecognizer

public SpeechRecognizer(SpeechConfig speechConfig, SourceLanguageConfig sourceLanguageConfig)

Initializes a new instance of Speech Recognizer.

Parameters:

speechConfig - speech configuration.
sourceLanguageConfig - the configuration for source language

SpeechRecognizer

public SpeechRecognizer(SpeechConfig speechConfig, SourceLanguageConfig sourceLanguageConfig, AudioConfig audioConfig)

Initializes a new instance of Speech Recognizer.

Parameters:

speechConfig - speech configuration.
sourceLanguageConfig - the configuration for source language
audioConfig - audio configuration.

SpeechRecognizer

public SpeechRecognizer(SpeechConfig speechConfig, AudioConfig audioConfig)

Initializes a new instance of Speech Recognizer.

Parameters:

speechConfig - speech configuration.
audioConfig - audio configuration.

SpeechRecognizer

public SpeechRecognizer(SpeechConfig speechConfig, String sourceLanguage)

Initializes a new instance of Speech Recognizer.

Parameters:

speechConfig - speech configuration.
sourceLanguage - the recognition source language

SpeechRecognizer

public SpeechRecognizer(SpeechConfig speechConfig, String sourceLanguage, AudioConfig audioConfig)

Initializes a new instance of Speech Recognizer.

Parameters:

speechConfig - speech configuration.
sourceLanguage - the recognition source language
audioConfig - audio configuration.

Method Details

dispose

protected void dispose(boolean disposing)

This method performs cleanup of resources. The boolean parameter disposing indicates whether the method is called from Dispose (if disposing is true) or from the finalizer (if disposing is false). Derived classes should override this method to dispose of resources if needed.

Overrides:

Recognizer.dispose(boolean disposing)

Parameters:

disposing - true if called from Dispose, false if called from the finalizer.

getAuthorizationToken

public String getAuthorizationToken()

Gets the authorization token used to communicate with the service.

Returns:

Authorization token.

getEndpointId

public String getEndpointId()

Gets the endpoint ID of a customized speech model that is used for speech recognition.

Returns:

the endpoint ID of a customized speech model that is used for speech recognition.

getOutputFormat

public OutputFormat getOutputFormat()

Gets the output format of recognition.

Returns:

The output format of recognition.

getProperties

public PropertyCollection getProperties()

The collection of properties and their values defined for this SpeechRecognizer.

Returns:

The collection of properties and their values defined for this SpeechRecognizer.

getSpeechRecognitionLanguage

public String getSpeechRecognitionLanguage()

Gets the spoken language of recognition.

Returns:

The spoken language of recognition.

recognizeOnceAsync

public Future<SpeechRecognitionResult> recognizeOnceAsync()

Starts speech recognition, and returns after a single utterance is recognized. The end of a single utterance is determined by listening for silence at the end or until a maximum of about 30 seconds of audio is processed. The task returns the recognition text as result. Note: Since recognizeOnceAsync() returns only a single utterance, it is suitable only for single-shot recognition such as a command or query. For long-running multi-utterance recognition, use startContinuousRecognitionAsync() instead.

Returns:

A task representing the recognition operation. The task returns a value of SpeechRecognitionResult
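
Because recognizeOnceAsync() returns a java.util.concurrent.Future, the caller decides how long to block for the result. The following is a minimal sketch of one way to wait with an upper bound, using only the standard library. The class FutureAwait, the method awaitOrFallback, and the 35-second timeout are hypothetical names and values chosen for illustration, and a String stands in for SpeechRecognitionResult so the sketch runs without the SDK.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class FutureAwait {
    // Waits up to timeoutSeconds for the future to complete.
    // On timeout the future is cancelled and the fallback value is returned.
    static <T> T awaitOrFallback(Future<T> future, long timeoutSeconds, T fallback) {
        try {
            return future.get(timeoutSeconds, TimeUnit.SECONDS);
        } catch (TimeoutException e) {
            future.cancel(true);
            return fallback;
        } catch (InterruptedException e) {
            // Preserve the interrupt flag for the caller and give up waiting.
            Thread.currentThread().interrupt();
            return fallback;
        } catch (ExecutionException e) {
            // The asynchronous operation itself failed; surface the cause unchecked.
            throw new IllegalStateException(e.getCause());
        }
    }

    public static void main(String[] args) {
        // Stand-in for recognizer.recognizeOnceAsync(); an already completed future
        // carrying a String instead of a SpeechRecognitionResult.
        Future<String> pending = CompletableFuture.completedFuture("turn on the lights");
        System.out.println(awaitOrFallback(pending, 35, ""));
    }
}
```

With a real recognizer, the first argument would be the Future returned by recognizeOnceAsync() and the fallback could be null. A timeout somewhat above the roughly 30-second utterance limit leaves room for the final result to arrive.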

setAuthorizationToken

public void setAuthorizationToken(String token)

Sets the authorization token used to communicate with the service. Note: The caller needs to ensure that the authorization token is valid. Before the authorization token expires, the caller needs to refresh it by calling this setter with a new valid token. Otherwise, the recognizer will encounter errors during recognition.

Parameters:

token - Authorization token.
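
Since the caller is responsible for refreshing the token before it expires, a periodic refresh task is one possible approach. The sketch below uses only the standard library and makes several assumptions: TokenRefresher, tokenSource, and tokenSetter are hypothetical names, the token source is whatever mechanism the application uses to obtain a fresh token, and the refresh period must be chosen to be shorter than the actual token lifetime, which is not stated here.

```java
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;
import java.util.function.Supplier;

class TokenRefresher {
    private final Supplier<String> tokenSource; // hypothetical: fetches a fresh token
    private final Consumer<String> tokenSetter; // e.g. recognizer::setAuthorizationToken

    TokenRefresher(Supplier<String> tokenSource, Consumer<String> tokenSetter) {
        this.tokenSource = tokenSource;
        this.tokenSetter = tokenSetter;
    }

    // Fetches one token and passes it to the setter.
    // Returns false, without calling the setter, if the token is null or empty.
    boolean refreshOnce() {
        String token = tokenSource.get();
        if (token == null || token.isEmpty()) {
            return false;
        }
        tokenSetter.accept(token);
        return true;
    }

    // Refreshes immediately and then every periodMinutes on the given scheduler.
    ScheduledFuture<?> start(ScheduledExecutorService scheduler, long periodMinutes) {
        return scheduler.scheduleAtFixedRate(() -> {
            try {
                refreshOnce();
            } catch (RuntimeException e) {
                // Swallow the failure so the schedule stays alive; the next run retries.
            }
        }, 0, periodMinutes, TimeUnit.MINUTES);
    }
}
```

A caller would typically construct it as new TokenRefresher(fetchToken, recognizer::setAuthorizationToken), where fetchToken is application code, and cancel the returned ScheduledFuture before closing the recognizer.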

startContinuousRecognitionAsync

public Future<Void> startContinuousRecognitionAsync()

Starts speech recognition on a continuous audio stream, until stopContinuousRecognitionAsync() is called. The user must subscribe to events to receive recognition results.

Returns:

A task representing the asynchronous operation that starts the recognition.
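
In continuous mode, results arrive through the recognized, recognizing, and canceled events rather than through the returned Future. One way to gather them is a small thread-safe collector that event handlers feed. The sketch below is standard-library only; TranscriptCollector and its methods are hypothetical names, and the handler wiring shown in the comments (addEventListener, getResult().getText()) is an assumption about the event API rather than something documented on this page.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

class TranscriptCollector {
    private final List<String> finals = new CopyOnWriteArrayList<>();
    private final CountDownLatch stopped = new CountDownLatch(1);

    // Call from a recognized handler with the final text of one utterance.
    // Assumed wiring: recognizer.recognized.addEventListener(
    //     (s, e) -> collector.onRecognized(e.getResult().getText()));
    void onRecognized(String text) {
        if (text != null && !text.isEmpty()) {
            finals.add(text);
        }
    }

    // Call when recognition ends, for example from a canceled handler.
    void onStopped() {
        stopped.countDown();
    }

    boolean isStopped() {
        return stopped.getCount() == 0;
    }

    // Blocks until onStopped() has been called or the timeout elapses.
    boolean awaitStopped(long timeout, TimeUnit unit) throws InterruptedException {
        return stopped.await(timeout, unit);
    }

    // Joins all final utterances received so far with single spaces.
    String transcript() {
        return String.join(" ", finals);
    }
}
```

The handlers should be registered before startContinuousRecognitionAsync() is called so that no early results are missed. Intermediate recognizing results are ignored here on purpose, since each one is superseded by a later final result.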

startKeywordRecognitionAsync

public Future<Void> startKeywordRecognitionAsync(KeywordRecognitionModel model)

Configures the recognizer with the given keyword model. After this method is called, the recognizer listens for the keyword to start the recognition. Call stopKeywordRecognitionAsync() to end the keyword-initiated recognition. The user must subscribe to events to receive recognition results.

Parameters:

model - The keyword recognition model that specifies the keyword to be recognized.

Returns:

A task representing the asynchronous operation that starts the recognition.

stopContinuousRecognitionAsync

public Future<Void> stopContinuousRecognitionAsync()

Stops a running recognition operation as soon as possible and immediately requests a result based on the input that has been processed so far. This works for all recognition operations, not just continuous ones, and facilitates the use of push-to-talk or "finish now" buttons for manual audio endpointing.

Returns:

A future that will complete when input processing has been stopped. Result generation, if applicable for the input provided, may happen after this task completes and should be handled with the appropriate event.

stopKeywordRecognitionAsync

public Future<Void> stopKeywordRecognitionAsync()

Ends the keyword initiated recognition.

Returns:

A task representing the asynchronous operation that stops the recognition.
