Note

Please see Azure Cognitive Services for Speech documentation for the latest supported speech solutions.

Speech Recognition (Microsoft.Speech)

The Microsoft.Speech.Recognition namespace provides functionality with which you can acquire and monitor speech input, create speech recognition grammars that produce both literal and semantic recognition results, capture information from events generated by the speech recognition engine, and configure and manage speech recognition engines.

Speech Input

With the input functionality of speech recognition, your application can monitor the state, level, and format of the input signal, and receive notification about problems that might interfere with successful recognition. See Audio Input for Recognition (Microsoft.Speech).
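As a minimal sketch of input monitoring, the helper below subscribes to the audio-related events of a SpeechRecognitionEngine instance. The class name AudioMonitorSketch and the method name Attach are hypothetical, and the sketch assumes that the Microsoft Speech Platform SDK and Runtime are installed and that the project references Microsoft.Speech.dll.

```csharp
using System;
using Microsoft.Speech.Recognition;

// Hypothetical helper, for illustration only.
static class AudioMonitorSketch
{
    // Writes audio state, level, and signal problems to the console.
    public static void Attach(SpeechRecognitionEngine engine)
    {
        // Raised when the input changes between states such as silence and speech.
        engine.AudioStateChanged += (sender, e) =>
            Console.WriteLine("Audio state: " + e.AudioState);

        // Raised as the level of the input signal changes.
        engine.AudioLevelUpdated += (sender, e) =>
            Console.WriteLine("Audio level: " + e.AudioLevel);

        // Raised when the engine detects a problem, such as too much noise.
        engine.AudioSignalProblemOccurred += (sender, e) =>
            Console.WriteLine("Signal problem: " + e.AudioSignalProblem);
    }
}
```

Call AudioMonitorSketch.Attach(engine) before you start recognition so that the handlers are in place when audio begins to arrive.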

Grammars

You can create grammars programmatically using constructors and methods on the GrammarBuilder and Choices classes. Your application can dynamically modify programmatically created grammars while it is running. The structure of grammars authored using these classes is independent of the Speech Recognition Grammar Specification (SRGS) 1.0. See Create Grammars Using GrammarBuilder (Microsoft.Speech).
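The following sketch shows one way to combine GrammarBuilder and Choices. The phrase, the color list, and the names GrammarBuilderSketch and BuildColorGrammar are assumptions made for illustration. The sketch assumes a project reference to Microsoft.Speech.dll.

```csharp
using System;
using Microsoft.Speech.Recognition;

class GrammarBuilderSketch
{
    // Builds a grammar that matches "set background to red", "... green", or "... blue".
    static Grammar BuildColorGrammar()
    {
        // The alternatives that can appear at one position in the phrase.
        Choices colors = new Choices(new string[] { "red", "green", "blue" });

        // A fixed phrase followed by exactly one of the alternatives.
        GrammarBuilder builder = new GrammarBuilder();
        builder.Append("set background to");
        builder.Append(colors);

        // Wrap the builder in a Grammar object that an engine can load.
        Grammar grammar = new Grammar(builder);
        grammar.Name = "colors";
        return grammar;
    }

    static void Main()
    {
        Grammar grammar = BuildColorGrammar();
        Console.WriteLine("Created grammar: " + grammar.Name);
    }
}
```

The Culture property of the GrammarBuilder must match the language of the recognition engine that loads the grammar. This sketch leaves it at its default value.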

To create dynamic grammars programmatically that conform to the SRGS specification, see Create Grammars Using SrgsGrammar (Microsoft.Speech). The classes of the Microsoft.Speech.Recognition.SrgsGrammar namespace map closely to the elements and attributes of the SRGS specification.

You can also create grammars as static files, using SRGS-compliant XML markup, that your application can load at runtime. See Create Grammars Using SRGS XML (Microsoft.Speech).

Use the constructors on the Grammar class to compile grammars created with any of the above methods into Grammar objects that the speech recognition engine can load and use to perform recognition.

Semantics

Recognition engines use the semantic information in grammars to interpret recognition results. To add semantic information to programmatically created grammars, see Add Semantics to a GrammarBuilder Grammar (Microsoft.Speech). You can add semantic information to XML-format grammars using ECMAScript (JavaScript, JScript) in the tag elements. See Semantic Interpretation Markup (Microsoft.Speech). For information about semantic results returned by speech recognition engines, see Create and Access Semantic Content (Microsoft.Speech).
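As an illustration of semantics in a programmatically created grammar, the sketch below attaches a value to each spoken alternative and reads that value back in a SpeechRecognized handler. The city names, airport codes, the key name "destination", and the class and method names are all assumptions chosen for the example.

```csharp
using System;
using Microsoft.Speech.Recognition;

class SemanticsSketch
{
    // Builds a grammar for "fly to <city>" in which each city carries an airport code.
    static Grammar BuildDestinationGrammar()
    {
        Choices cities = new Choices();
        // SemanticResultValue pairs the spoken phrase with a semantic value.
        cities.Add(new GrammarBuilder(new SemanticResultValue("Chicago", "ORD")));
        cities.Add(new GrammarBuilder(new SemanticResultValue("Boston", "BOS")));

        GrammarBuilder builder = new GrammarBuilder("fly to");
        // SemanticResultKey names the slot under which the chosen value is stored.
        builder.Append(new SemanticResultKey("destination", new GrammarBuilder(cities)));
        return new Grammar(builder);
    }

    // Intended to be attached to the SpeechRecognized event of an engine.
    static void OnSpeechRecognized(object sender, SpeechRecognizedEventArgs e)
    {
        if (e.Result.Semantics.ContainsKey("destination"))
        {
            Console.WriteLine("Airport code: " + e.Result.Semantics["destination"].Value);
        }
    }
}
```

With this design, your application acts on the semantic value rather than parsing the recognized text. You can then change the spoken phrases without changing the handling code.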

Debugging Tools

The Microsoft Speech Platform SDK 11 includes a comprehensive suite of command-line tools with which you can test, debug, and optimize your voice applications without first deploying them to a live service. See Microsoft Grammar Development Tools for more information.

In addition, Microsoft.Speech has special emulated recognition modes that allow you to provide text instead of audio to the speech recognition engine. You can use the simulated recognition results for debugging and optimizing speech recognition grammars. See Emulate Spoken Commands (Microsoft.Speech).
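The following sketch submits text to the engine in place of audio. It assumes that an en-US Runtime Language is installed. The test phrases and the class name EmulationSketch are hypothetical.

```csharp
using System;
using System.Globalization;
using Microsoft.Speech.Recognition;

class EmulationSketch
{
    static void Main()
    {
        // Assumes that an en-US Runtime Language is installed on this machine.
        using (SpeechRecognitionEngine engine =
            new SpeechRecognitionEngine(new CultureInfo("en-US")))
        {
            GrammarBuilder builder = new GrammarBuilder("open file");
            // Match the grammar language to the engine language.
            builder.Culture = engine.RecognizerInfo.Culture;
            engine.LoadGrammar(new Grammar(builder));

            // Submit text instead of audio, and report whether a result came back.
            foreach (string phrase in new string[] { "open file", "close window" })
            {
                RecognitionResult result = engine.EmulateRecognize(phrase);
                Console.WriteLine(phrase + ": " +
                    (result != null ? "matched \"" + result.Text + "\"" : "no match"));
            }
        }
    }
}
```

Because emulation needs no microphone, a loop like this can run as part of an automated test pass over your grammars.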

Events

Your application can register for events that the speech recognition engine generates when it completes important stages of speech processing, such as loading grammars or recognizing speech. See Use Speech Recognition Events (Microsoft.Speech).

Recognition Engines

Using the members of the SpeechRecognitionEngine class, you can initialize a speech recognition engine instance, select an installed Runtime Language to use for recognition, load and unload grammars, subscribe to speech recognition events, configure the audio input, start and stop recognition, and modify properties of the speech recognition engine that affect recognition. See Initialize and Manage a Speech Recognition Engine (Microsoft.Speech).
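The tasks listed above can be sketched end to end in a small console application. The sketch assumes that the Microsoft Speech Platform Runtime, an en-US Runtime Language, and a default audio input device are available. The phrase "hello computer" and the class name EngineSketch are hypothetical.

```csharp
using System;
using System.Globalization;
using Microsoft.Speech.Recognition;

class EngineSketch
{
    static void Main()
    {
        // List the installed recognizers (Runtime Languages).
        foreach (RecognizerInfo info in SpeechRecognitionEngine.InstalledRecognizers())
        {
            Console.WriteLine(info.Name + " (" + info.Culture.Name + ")");
        }

        // Assumes that an en-US Runtime Language is installed.
        using (SpeechRecognitionEngine engine =
            new SpeechRecognitionEngine(new CultureInfo("en-US")))
        {
            // Build and load a one-phrase grammar in the engine's language.
            GrammarBuilder builder = new GrammarBuilder("hello computer");
            builder.Culture = engine.RecognizerInfo.Culture;
            engine.LoadGrammar(new Grammar(builder));

            // Subscribe to the event raised for each successful recognition.
            engine.SpeechRecognized += (sender, e) =>
                Console.WriteLine("Recognized: " + e.Result.Text);

            // Configure the audio input and start continuous recognition.
            engine.SetInputToDefaultAudioDevice();
            engine.RecognizeAsync(RecognizeMode.Multiple);

            Console.WriteLine("Listening. Press Enter to stop.");
            Console.ReadLine();

            // Stop after the current recognition operation completes.
            engine.RecognizeAsyncStop();
        }
    }
}
```

RecognizeMode.Multiple keeps the engine listening after each result. Use RecognizeMode.Single if your application needs only one recognition per call.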

In addition, the speech engines for Microsoft.Speech will recognize the dual-tone multi-frequency (DTMF) tones that are important for developing applications that interact with users by telephone. See DTMF Recognition.

In This Section