New managed Speech API

I'm delighted to announce that our new managed Speech API is in the Avalon & Indigo Beta 1 RC!

With the System.Speech namespace you can incorporate both speech recognition and speech synthesis in your applications.

Recognition:

The main classes for speech recognition are:

  • DesktopRecognizer: abstracts the recognizer shared by apps on the desktop.

  • SpeechRecognizer: abstracts a recognition engine for exclusive use by your app.

  • RecognitionResult: exposes the text and semantics returned by a recognizer.

  • SrgsDocument: used to build recognition grammars (the rules for which phrases a recognizer should listen for in your app).

For example, to load a grammar containing your app’s commands into the shared desktop recognizer:

// Attach to the recognizer shared by apps on the desktop
DesktopRecognizer desktopRecognizer = new DesktopRecognizer();

// Load the grammar that defines your app's commands
desktopRecognizer.LoadGrammar(new Grammar(new Uri(grammarPath)));

desktopRecognizer.SpeechRecognized += delegate(object sender, RecognitionEventArgs e)
{
    // Do appropriate handling when we get a recognition
    // Console.WriteLine("User said {0}", e.Result.Text);
};
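
The file at grammarPath in the snippet above is a recognition grammar. As a rough illustration only, here is a minimal sketch of what such a file might look like in the XML form of W3C SRGS 1.0. The rule name "command" and the three phrases are hypothetical placeholders, not something shipped with the API, and I'm assuming the grammar is authored by hand rather than built with SrgsDocument:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical command grammar; rule id and phrases are made up for illustration -->
<grammar version="1.0" xml:lang="en-US" root="command"
         xmlns="http://www.w3.org/2001/06/grammar">
  <!-- The root rule: the recognizer listens for exactly one of these phrases -->
  <rule id="command" scope="public">
    <one-of>
      <item>open file</item>
      <item>save file</item>
      <item>close window</item>
    </one-of>
  </rule>
</grammar>
```

With a grammar like this loaded, the recognizer would only report one of the listed phrases, which keeps the handling code in the SpeechRecognized callback simple.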

You’ll also need a speech recognition (SR) engine installed. There are several ways to get one: Tablets already have an engine, a recent version of Office installs one, and you can also download one from the SAPI web site: https://www.microsoft.com/speech/download/sdk51/.

Synthesis:

The main classes for speech synthesis are:

  • SpeechSynthesizer: abstracts a synthesis engine.

  • PromptBuilder: builds a prompt containing emphasis, loudness, pre-recorded sounds, and other characteristics.

For example, if you want your app to say “hello world”, just write:

SpeechSynthesizer synth = new SpeechSynthesizer();
synth.Speak("Hello world!");

You can easily splice this with a “ding” wave file by using the PromptBuilder:

PromptBuilder builder = new PromptBuilder();

// Play the "ding" sound first, then speak the text
builder.AddAudio(new Uri(@"file://\windows\media\ding.wav"));
builder.AddText("Hello world!");

SpeechSynthesizer synth = new SpeechSynthesizer();
synth.Speak(builder);

Windows comes with a synthesis engine.

The API uses the W3C standard formats for recognition grammars (SRGS) and synthesis (SSML).
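
To give a feel for the synthesis format, here is a minimal sketch of an SSML 1.0 document that roughly corresponds to the "ding" plus "Hello world!" prompt. This is hand-written markup for illustration, not output captured from the API; the audio file name is a hypothetical placeholder, and the emphasis element is added only to show the kind of characteristic a prompt can carry:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical prompt; "ding.wav" is a placeholder path -->
<speak version="1.0" xml:lang="en-US"
       xmlns="http://www.w3.org/2001/10/synthesis">
  <!-- Play a pre-recorded sound before the spoken text -->
  <audio src="ding.wav"/>
  Hello <emphasis level="strong">world</emphasis>!
</speak>
```

Because both formats are W3C standards, grammars and prompts written this way are not tied to one particular engine.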
