Windows Speech Recognition in Vista: Dictation Everywhere

Here's a question from a reader about the Dictation Everywhere feature in Windows Speech Recognition in Windows Vista:

This feature is thus very important to me and my future as a PC user.

I have found that in many form fields (such as the one I'm using to send this message) dictation is impossible unless I switch to "Enable dictation everywhere" (otherwise I get "what was that?" messages). There is no short command for switching, and even in that mode I have to confirm every fragment of text I dictate with "one, ok." And then I have to switch out of that mode again for navigation.

It is very important to me that I find ways around this, or at least that a way around it will soon exist. Would you be so kind as to let me know how I can stay up to date with Vista SR improvements, and whom I can contact about its development?

Thank you very much for your time; I really appreciate your efforts to communicate to the public about Vista SR.

Now, for those of you who don't already know, Windows Speech Recognition allows users to dictate into any text field that supports a few standard platform APIs. These APIs (Application Programming Interfaces) are used by the speech recognition system, the handwriting recognition system, and input method editors for foreign languages.

However, some text fields in custom applications don't support what's needed. In those cases Windows Speech Recognition has a fallback input method. It's called Dictation Everywhere. You can turn it on by doing this:

  • Say, "Show Speech Options"
  • Say, "Options"
  • Say, "Dictation Everywhere"

That will toggle it on. Doing those three steps again will turn it off.

When Dictation Everywhere is turned on, we'll listen for the user to speak dictated text even when they're in a text field that doesn't support those APIs I mentioned. But because the field doesn't support the right APIs, instead of just sticking the text in there, we first take the user through a miniature correction experience. Otherwise, the user wouldn't be able to correct the text if there was a recognition error.

So, if you're in one of those fields, use the three steps above to turn on Dictation Everywhere. Then you can say, "Hello <period> This is a test <period>". The correction dialog will pop up and allow you to pick an alternative from the list. If you don't see what you really said, simply say it again. If you still don't see what you said, you can say "Spell it", and spell what you wanted.

This is the feature that the user was asking about. They'd like to be able to turn the feature on and off easily by using a single voice command.

Unfortunately, Vista doesn't currently include an end-user feature for creating macros, and we don't offer one for download (yet!), so end users can't really get the effect of turning this feature on and off with a single voice command.

At least not easily...

If you're a programmer (or don't mind dabbling), you certainly could though. In fact, you can create a simple script (VBScript, run by Windows Script Host) that does it:

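' Connect to the shared recognizer that Windows Speech Recognition uses.
Set Recognizer = CreateObject("SAPI.SpSharedRecognizer")

' Emulate each spoken command, pausing one second between them.
Recognizer.EmulateRecognition ("show speech options")
WScript.Sleep(1000)
Recognizer.EmulateRecognition ("options")
WScript.Sleep(1000)
Recognizer.EmulateRecognition ("dictation everywhere")

' Release the recognizer.
Set Recognizer = Nothing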

When you run this, it connects to the shared recognizer that Windows Speech Recognition uses and pretends that the user said "show speech options". It then waits one second, pretends that the user said "options", waits another second, and finally pretends that the user said "dictation everywhere".
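If you want to replay other command sequences, or a fixed one-second pause turns out to be too short on your machine, the same idea can be wrapped in a small helper. This is only a sketch: the SpeakCommands name and its delayMs parameter are made up for illustration, and it relies on nothing beyond the EmulateRecognition and WScript.Sleep calls already used in the script.

```vbscript
' Sketch: replay a list of spoken commands through the shared recognizer.
Option Explicit

' Hypothetical helper (not part of SAPI): emulates each phrase in order
' and pauses delayMs milliseconds after each one.
Sub SpeakCommands(phrases, delayMs)
    Dim recognizer, phrase
    Set recognizer = CreateObject("SAPI.SpSharedRecognizer")
    For Each phrase In phrases
        ' Parentheses pass the phrase by value, as in the original script.
        recognizer.EmulateRecognition (phrase)
        WScript.Sleep delayMs
    Next
    Set recognizer = Nothing
End Sub

' Same three steps as before, with the pause length in one place.
SpeakCommands Array("show speech options", "options", "dictation everywhere"), 1000
```

Raising the second argument (say, to 2000) gives the Speech Options menu more time to appear before the next command is emulated.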

In fact, you can even save this text as a file called "Dictation Everywhere Toggle.vbs" in your Start Menu folder (e.g. "c:\documents and settings\{your user name goes here}\start menu\dictation everywhere toggle.vbs") and you'll be able to say to Windows Speech Recognition, "Start Dictation Everywhere Toggle".

Unfortunately, for all this to work, you actually have to turn User Account Control (UAC) off. Otherwise, the script can't communicate with the shared recognizer.

In the future, though, we'll have a true end-to-end macro facility to deal with this in a secure way. Stay tuned for more info on that front...
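If you'd rather not hunt for the Start Menu folder by hand, a second small script can copy the file there for you. This is a sketch under one assumption: "Dictation Everywhere Toggle.vbs" sits in the same folder as this installer script. It asks Windows Script Host for the current user's Start Menu path instead of hard-coding it.

```vbscript
' Sketch: copy the toggle script into the current user's Start Menu folder.
' Assumes "Dictation Everywhere Toggle.vbs" sits next to this installer script.
Option Explicit

Dim shell, fso, sourcePath, targetFolder

Set shell = CreateObject("WScript.Shell")
Set fso = CreateObject("Scripting.FileSystemObject")

' Build the path to the toggle script in this script's own folder.
sourcePath = fso.BuildPath(fso.GetParentFolderName(WScript.ScriptFullName), _
                           "Dictation Everywhere Toggle.vbs")

' SpecialFolders("StartMenu") resolves the per-user Start Menu path.
targetFolder = shell.SpecialFolders("StartMenu")

If fso.FileExists(sourcePath) Then
    ' The trailing backslash tells CopyFile the destination is a folder;
    ' True overwrites any existing copy.
    fso.CopyFile sourcePath, targetFolder & "\", True
    WScript.Echo "Copied to " & targetFolder
Else
    WScript.Echo "Could not find " & sourcePath
End If
```

Double-click the installer once, then check that "Dictation Everywhere Toggle" shows up in your Start Menu.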

Comments

  • Anonymous
    April 24, 2007
    Hi Rob, I have been thinking about writing a Visual Studio 2005 add-in that adds dictation support to the code editor windows and command support to the menus and buttons. Do you think it would be possible to do the dictation using Text Services Framework via System.Windows.Input? It would inform the speech recognition engine about the state of the window obtained from the add-in api. Thanks
