Follow-up to speech processing of audio files

I was asked some questions about my Speech processing of audio files post.

Jonathan Tregear asks "Is there a way to get SAPI to convert and recognize any of the WMA audio formats or other compressed formats beyond the Windows OS standard formats?"

No, SAPI doesn't work directly with WMA audio formats.  You'll need to decode to PCM first.  A word of warning here: this applies to any lossy encoding that's optimized for music, not just WMA.  The encoder may discard data that's relevant to speech recognition, and decoding may introduce artifacts that confuse the recognizer.  In other words, accuracy will suffer.  If you want accurate recognition, use a lossless format.
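Before handing a file to the recognizer, it can help to confirm that it really is uncompressed PCM.  Here is a minimal sketch in Python, using only the standard `wave` module, which reads PCM WAV files and raises an error for other encodings.  The function name `describe_pcm_wav` is made up for this example, and the check is independent of SAPI itself.

```python
import wave


def describe_pcm_wav(path):
    """Return basic format info if `path` is a PCM WAV file, else None.

    The standard `wave` module rejects compressed WAV payloads and
    non-WAV files, which we treat here as "not usable as-is".
    """
    try:
        with wave.open(path, "rb") as wav:
            rate = wav.getframerate()
            return {
                "channels": wav.getnchannels(),
                "sample_rate_hz": rate,
                "bits_per_sample": wav.getsampwidth() * 8,
                # Guard against a malformed header that reports a zero rate.
                "duration_s": wav.getnframes() / rate if rate else 0.0,
            }
    except (wave.Error, EOFError):
        # Not a RIFF/WAVE file, truncated, or not PCM-encoded.
        return None
```

A result of `None` means the file needs to be decoded to PCM by some other tool before recognition is attempted.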

Jason Nadal asks "how do I make the api ask for further voice input... ie if I say "run program", I want it to try to recognize a different set of commands than if I say "play media file"..."

I'll address this in another post.  One approach is to have separate grammars for separate sets of commands, and individually enable or disable the grammars as appropriate.
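To make the idea concrete, here is a small language-neutral model of that approach, written in Python.  It is a sketch only and does not call SAPI: the `Grammar` and `CommandRecognizer` classes, the `handle` function, and the phrase lists are all hypothetical stand-ins.  In a real application the recognizer would do the matching, and you would toggle each loaded grammar's enabled state instead.

```python
class Grammar:
    """A named set of phrases that can be switched on or off (hypothetical)."""

    def __init__(self, name, phrases):
        self.name = name
        self.phrases = set(phrases)
        self.enabled = False


class CommandRecognizer:
    """Toy recognizer: matches text only against grammars that are enabled."""

    def __init__(self, grammars):
        self.grammars = {g.name: g for g in grammars}

    def enable_only(self, *names):
        # Enable the named grammars and disable every other one.
        for name, grammar in self.grammars.items():
            grammar.enabled = name in names

    def recognize(self, phrase):
        # Return the name of the enabled grammar that contains the phrase.
        for grammar in self.grammars.values():
            if grammar.enabled and phrase in grammar.phrases:
                return grammar.name
        return None


# Which grammar to switch to after a top-level command is heard.
FOLLOW_UP = {"run program": "programs", "play media file": "media"}


def handle(recognizer, phrase):
    """Recognize one phrase, then switch grammars for the next input."""
    matched = recognizer.recognize(phrase)
    if matched == "top" and phrase in FOLLOW_UP:
        recognizer.enable_only(FOLLOW_UP[phrase])
    elif matched is not None:
        # A follow-up command was handled; go back to the top level.
        recognizer.enable_only("top")
    return matched


def build_recognizer():
    recognizer = CommandRecognizer([
        Grammar("top", ["run program", "play media file"]),
        Grammar("programs", ["notepad", "calculator"]),
        Grammar("media", ["jazz playlist", "news podcast"]),
    ])
    recognizer.enable_only("top")
    return recognizer
```

With this setup, "notepad" is ignored until "run program" has been heard, because the "programs" grammar is disabled until then.  The same phrase could also appear in two grammars with different meanings, since only one of them is active at a time.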

Rex Reece asks "reco.LoadGrammar(New Speech.Recognition.DictationGrammar()) give the following error:
Grammar file not found.
I did add a reference to
C:\WINDOWS\Microsoft.NET\Windows\v6.0.4030\Speech.dll
I am running it under windows 2003 server.
What am I missing???"

I'm only guessing here.  Take a look in the Speech control panel and make sure the language is set to one of the following: "Microsoft English (U.S.) v6.0 Recognizer", "Microsoft English (U.S.) v6.1 Recognizer", or "Microsoft English Recognizer v5.1".  If you don't have any of these, either install the SAPI 5.1 SDK or install Office and configure speech.  Which version of Windows Server 2003 are you using?  The 64-bit version doesn't support recognition.