speech Package

Reference

Microsoft Speech SDK for Python

Modules

audio	Classes that are concerned with the handling of audio input to the various recognizers, and audio output from the speech synthesizer.
dialog	Classes related to dialog service connector.
enums
intent	Classes related to intent recognition from speech.
interop
languageconfig	Classes that are concerned with the handling of language configurations.
properties
speech	Classes related to recognizing text from speech, synthesizing speech from text, and general classes used in the various recognizers.
transcription	Classes related to conversation transcription.
translation	Classes related to translation of speech to other languages.
version

Classes

AudioDataStream	Represents an audio data stream, used to operate on audio data as a stream. Generates an audio data stream from a speech synthesis result (type SpeechSynthesisResult) or a keyword recognition result (type KeywordRecognitionResult).
AutoDetectSourceLanguageResult	Represents the result of automatic source language detection. The result can be initialized from a speech recognition result.
CancellationDetails
Connection	Proxy class for managing the connection to the speech service of the specified Recognizer. By default, a Recognizer autonomously manages its connection to the service when needed. The Connection class provides additional methods for users to explicitly open or close a connection and to subscribe to connection status changes. The use of Connection is optional. It is intended for scenarios where fine-tuning of application behavior based on connection status is needed. Users can optionally call open to manually initiate a service connection before starting recognition on the Recognizer associated with this Connection. After recognition has started, calling open or close might fail. This does not affect the Recognizer or the ongoing recognition. The connection might drop for various reasons; the Recognizer always tries to re-establish it as required to guarantee ongoing operations. In all these cases, connected/disconnected events indicate the change of connection status. Note: Updated in version 1.17.0. Constructor for internal use.
ConnectionEventArgs	Provides data for the ConnectionEvent. Note: Added in version 1.2.0. Constructor for internal use.
EventSignal	Clients can connect to the event signal to receive events, or disconnect from the event signal to stop receiving events. Constructor for internal use.
KeywordRecognitionEventArgs	Class for keyword recognition event arguments. Constructor for internal use.
KeywordRecognitionModel	Represents a keyword recognition model.
KeywordRecognitionResult	Result of a keyword recognition operation. Constructor for internal use.
KeywordRecognizer	A keyword recognizer.
NoMatchDetails
PhraseListGrammar	Class that allows runtime addition of phrase hints to aid in speech recognition. Phrases added to the recognizer take effect at the start of the next recognition, or the next time the speech recognizer must reconnect to the speech service. Note: Added in version 1.5.0. Constructor for internal use.
PronunciationAssessmentConfig	Represents pronunciation assessment configuration. Note: Added in version 1.14.0. The configuration can be initialized in two ways: from parameters (pass the reference text, grading system, granularity, enable miscue, and scenario id) or from JSON (pass a JSON string). For parameter details, see https://docs.microsoft.com/azure/cognitive-services/speech-service/rest-speech-to-text#pronunciation-assessment-parameters
PronunciationAssessmentPhonemeResult	Contains phoneme-level pronunciation assessment result. Note: Added in version 1.14.0.
PronunciationAssessmentResult	Represents pronunciation assessment result. Note: Added in version 1.14.0. The result can be initialized from a speech recognition result.
PronunciationAssessmentWordResult	Contains word-level pronunciation assessment result. Note: Added in version 1.14.0.
PropertyCollection	Class to retrieve or set a property value from a property collection.
RecognitionEventArgs	Provides data for the RecognitionEvent. Constructor for internal use.
RecognitionResult	Detailed information about the result of a recognition operation. Constructor for internal use.
Recognizer	Base class for different recognizers.
ResultFuture	The result of an asynchronous operation. Private constructor.
SessionEventArgs	Base class for session event arguments. Constructor for internal use.
SourceLanguageRecognizer	A standalone source language recognizer that can be used for single-language or continuous language detection. Note: Added in version 1.18.0.
SpeechConfig	Class that defines configurations for speech / intent recognition and speech synthesis. The configuration can be initialized in different ways: from subscription (pass a subscription key and a region); from endpoint (pass an endpoint; a subscription key or authorization token is optional); from host (pass a host address; a subscription key or authorization token is optional); from authorization token (pass an authorization token and a region).
SpeechRecognitionCanceledEventArgs	Class for speech recognition canceled event arguments. Constructor for internal use.
SpeechRecognitionEventArgs	Class for speech recognition event arguments. Constructor for internal use.
SpeechRecognitionResult	Base class for speech recognition results. Constructor for internal use.
SpeechRecognizer	A speech recognizer. To specify source language information, pass only one of these three parameters: language, source_language_config, or auto_detect_source_language_config.
SpeechSynthesisBookmarkEventArgs	Class for speech synthesis bookmark event arguments. Note: Added in version 1.16.0. Constructor for internal use.
SpeechSynthesisCancellationDetails	Contains detailed information about why a result was canceled.
SpeechSynthesisEventArgs	Class for speech synthesis event arguments. Constructor for internal use.
SpeechSynthesisResult	Result of a speech synthesis operation. Constructor for internal use.
SpeechSynthesisVisemeEventArgs	Class for speech synthesis viseme event arguments. Note: Added in version 1.16.0. Constructor for internal use.
SpeechSynthesisWordBoundaryEventArgs	Class for speech synthesis word boundary event arguments. Note: Updated in version 1.21.0. Constructor for internal use.
SpeechSynthesizer	A speech synthesizer.
SyllableLevelTimingResult	Contains syllable-level timing result. Note: Added in version 1.20.0.
SynthesisVoicesResult	Contains detailed information about the retrieved synthesis voices list. Note: Added in version 1.16.0. Constructor for internal use.
VoiceInfo	Contains detailed information about a synthesis voice. Note: Updated in version 1.17.0. Constructor for internal use.
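
The EventSignal entry above says that clients connect to a signal to receive events and disconnect to stop receiving them. The sketch below pictures that pattern in plain Python. ToySignal and its method names are hypothetical stand-ins written for this illustration, not the SDK's EventSignal, whose constructor is for internal use.

```python
from typing import Any, Callable, List


class ToySignal:
    """Hypothetical stand-in for an event signal; not the SDK class."""

    def __init__(self) -> None:
        self._callbacks: List[Callable[[Any], None]] = []

    def connect(self, callback: Callable[[Any], None]) -> None:
        """Register a callback to receive future events."""
        self._callbacks.append(callback)

    def disconnect_all(self) -> None:
        """Drop every registered callback so no further events are delivered."""
        self._callbacks.clear()

    def is_connected(self) -> bool:
        """Report whether at least one callback is registered."""
        return bool(self._callbacks)

    def signal(self, payload: Any) -> None:
        """Deliver one event to each connected callback, in connection order."""
        for callback in list(self._callbacks):
            callback(payload)


received: List[str] = []
demo = ToySignal()
demo.connect(received.append)   # start receiving events
demo.signal("connected")
demo.disconnect_all()           # stop receiving events
demo.signal("disconnected")     # no callbacks remain, so nothing is recorded
```

After this runs, only the first event has reached the callback, which is the behavior the connect/disconnect description implies.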

Enums

AudioStreamContainerFormat	Defines supported audio stream container format.
AudioStreamWaveFormat	Represents the format specified inside WAV container.
CancellationErrorCode	Defines the error code when CancellationReason is Error.
CancellationReason	Defines the possible reasons a recognition result might be canceled.
NoMatchReason	Defines the possible reasons a recognition result might not be recognized.
OutputFormat	Output format.
ProfanityOption	Removes profanity (swearing), or replaces letters of profane words with stars.
PronunciationAssessmentGradingSystem	Defines the point system for pronunciation score calibration; default value is FivePoint.
PronunciationAssessmentGranularity	Defines the pronunciation evaluation granularity; default value is Phoneme.
PropertyId	Defines speech property ids.
ResultReason	Specifies the possible reasons a recognition result might be generated.
ServicePropertyChannel	Defines channels used to pass property settings to service.
SpeechSynthesisOutputFormat	Defines the possible speech synthesis output audio formats.
StreamStatus	Defines the possible statuses of an audio data stream.
SynthesisVoiceGender	Defines the gender of synthesis voices.
SynthesisVoiceType	Defines the type of synthesis voices.
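
ResultReason, CancellationReason, and CancellationErrorCode in the table above are typically read together: the error code only matters once a result is canceled with reason Error. The sketch below shows that branching. To stay runnable without the SDK installed, it defines small local stand-in enums with a few member names only; the describe_outcome helper is hypothetical and not part of the package.

```python
from enum import Enum, auto
from typing import Optional


# Local stand-ins (assumption: a small subset of members, for illustration).
class ResultReason(Enum):
    NoMatch = auto()
    Canceled = auto()
    RecognizedSpeech = auto()


class CancellationReason(Enum):
    Error = auto()
    EndOfStream = auto()


class CancellationErrorCode(Enum):
    NoError = auto()
    AuthenticationFailure = auto()
    ConnectionFailure = auto()


def describe_outcome(
    reason: ResultReason,
    cancellation_reason: Optional[CancellationReason] = None,
    error_code: Optional[CancellationErrorCode] = None,
) -> str:
    """Hypothetical helper: summarize a recognition outcome as text."""
    if reason is ResultReason.RecognizedSpeech:
        return "recognized"
    if reason is ResultReason.NoMatch:
        return "no match"
    if reason is ResultReason.Canceled:
        # The error code is only meaningful when the cancellation is an error.
        if cancellation_reason is CancellationReason.Error:
            code = error_code.name if error_code else "unknown"
            return f"canceled with error: {code}"
        name = cancellation_reason.name if cancellation_reason else "unknown"
        return f"canceled: {name}"
    return "unhandled reason"


summary = describe_outcome(
    ResultReason.Canceled,
    CancellationReason.Error,
    CancellationErrorCode.AuthenticationFailure,
)
```

With the SDK installed, the same checks would presumably be written against the package's own enums and the details objects listed under Classes, rather than these local copies.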
