Best practices for concept recognition grammars

As some of you may know, one of the new features in Speech Server 2007 within Office Communications Server is concept recognition.  In this model, you create a grammar by providing training sentences for each possible answer, and these are then compiled into a binary .cfg grammar that you can use in your application.

Concept grammars are a very powerful feature and can be of great use in scenarios where you want to ask a question, such as "How may I help you?", that will elicit a free-speech response.  I have already discussed this feature in my post on detecting answering machines.  There are, however, a number of best practices that you should follow when using these grammars to get the most out of the feature.

First, to be most effective, the concept recognition engine needs several hundred sample sentences per response.  A list that long can be challenging to come up with.  For instance, when working on the answering machine demo I could not come up with the list myself, so I resorted to modifying a list of Internet answering machine jokes.  In a production system, you would need to build this list in a more scientific manner.

There are several ways you can determine the training sentences.

1) Research data - for some applications, transcriptions of sample conversations already exist.  If you can find a set of these, your work is much easier.

2) Perform "Wizard of Oz" sessions with targeted users.  In a "Wizard of Oz" session, you act as the "Wizard" and speak the prompts of the application to the user, and the user responds as if this were an actual call.  Such a session has many goals, including verifying the call flow and confirming that the grammar recognizes responses correctly.

3) If you have a live system available for test, log all audio for recognitions and transcribe the audio of users' actual responses.  You can start with a grammar with the obvious training sentences and then expand it with the actual responses from the customers.  This same technique can be used in production systems as well, especially if you are asking a question that can elicit many types of responses.
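The "start with the obvious sentences, then expand with real responses" loop from the third approach can be sketched in a few lines of Python.  This is only an illustration under assumptions: the training data is modeled as a plain dictionary from response name to sentences, not as the actual Speech Server grammar format, and the names `expand_training_set`, `undertrained` and the threshold of 200 (a stand-in for "several hundred") are hypothetical.

```python
MIN_SENTENCES = 200  # assumption: a stand-in for "several hundred" per response


def normalize(sentence):
    """Lowercase and collapse whitespace so repeated transcriptions dedupe."""
    return " ".join(sentence.lower().split())


def expand_training_set(seed, transcriptions):
    """Merge human-labelled transcriptions into the seed training sentences.

    seed:           dict mapping response name -> list of training sentences
    transcriptions: iterable of (response name, transcribed sentence) pairs
    Returns a new dict of normalized, de-duplicated sentences per response;
    the seed itself is left unmodified.
    """
    merged = {concept: [] for concept in seed}
    seen = {concept: set() for concept in seed}
    labelled = [(c, s) for c, sentences in seed.items() for s in sentences]
    for concept, sentence in labelled + list(transcriptions):
        key = normalize(sentence)
        bucket = merged.setdefault(concept, [])
        known = seen.setdefault(concept, set())
        if key and key not in known:
            known.add(key)
            bucket.append(key)
    return merged


def undertrained(training, minimum=MIN_SENTENCES):
    """Return the responses that still have fewer than `minimum` sentences."""
    return sorted(c for c, s in training.items() if len(s) < minimum)
```

Running `undertrained` after each round of transcription gives a rough idea of which responses still need more sample sentences before you recompile the grammar.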

A second important best practice for concept recognition grammars is to not combine them with another grammar in the same QA.  In general, the concept grammar should be the only rule active on a QA (though simple command grammars are OK).  A concept grammar will always return a match, which will cause problems if you have multiple grammars active, as it will trump the other grammars.  The solution to this is to include all responses within the concept grammar.  So, if you have a concept grammar with five possible responses and another grammar with two possible responses, create one concept grammar with all seven possible responses.
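As a rough sketch of that merge, again modeling the training data as plain Python dictionaries rather than the real grammar format, the two sets of responses can be folded into one before compiling.  The function name `merge_concept_sets` is hypothetical, and rejecting duplicate response names is my own assumption about what you would want here.

```python
def merge_concept_sets(*training_sets):
    """Combine several response -> sentences mappings into a single mapping.

    Raises ValueError if the same response name appears in more than one set,
    since silently mixing two sets of sentences would hide a naming clash.
    """
    merged = {}
    for training in training_sets:
        for concept, sentences in training.items():
            if concept in merged:
                raise ValueError(f"duplicate response name: {concept!r}")
            merged[concept] = list(sentences)  # copy so inputs stay untouched
    return merged
```

With a five-response set and a two-response set as input, the result holds all seven responses and becomes the single set of training sentences for the one concept grammar active on the QA.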