Creating Prompts
This topic is the fourth of seven tutorial topics covering tools in the Microsoft Speech Application SDK Version 1.1 (SASDK). This section of the tutorial uses Speech Prompt Editor, which is the tool for recording prompts in Microsoft Visual Studio .NET 2003. Prompts are utterances in audio files that the application plays to elicit feedback or action from the user.
For reference information on Speech Prompt Editor, see Prompting the User. See Designing Dialogue Flow for more information on the SpeechControlSettings control.
This sequence of tutorials demonstrates how to build a simple voice-only ASP.NET Web application using the SASDK. Specifically, the tutorials demonstrate building a Start page of an imaginary pizza ordering service, for use by telephony Speech Application Language Tags (SALT) clients.
The procedures for creating the pizza ordering service application build on each other. Performing the procedures in sequence is therefore important.
Creating a Prompt Database
All the prompts that an application uses are kept in a prompt database. Speech Web Application Wizard added a prompt database file to the project when it first created the project. To create a prompt project in a language other than English, open the prompt project's Property Pages and select one of the installed languages from the Language drop-down list. This tutorial develops a speech application and prompt project using the default language setting, U.S. English. For more information on developing prompt projects in other languages, see the topic Managing Prompt Projects.
The first step in adding prompts to the prompt database is to define the prompts the application needs. The previous section, Creating the Dialogue Framework, added the following prompts to QA controls.
Welcome to Tony's Pizza. Order a pizza now and we'll have it waiting when you arrive. Say Cancel at any time to cancel this order.
We have small, medium and large sizes. What size would you like?
Please say your telephone number, area code first.
Thanks for your order. Give us your phone number when you arrive and we'll give you your pizza.
Additional confirmation prompts, which subsequent controls will play, require portions of the prompt text shown above. For that reason, it is useful to record the prompts so that some portions can be reused in many different prompts. Those portions are called extractions. This section demonstrates how to create a database of prompts with minimal effort and recording time using extractions.
The next step is to create recordings for these prompts so they are not synthesized by the text-to-speech (TTS) engine.
To enter recording transcriptions
- In Solution Explorer, double-click Prompts.promptdb to display Speech Prompt Editor.
- In the Transcription pane, double-click the first column in the first row, type "Welcome to Tony's Pizza." in the edit box, and then press the down arrow key.
The columns Display Text, Has Wave and Has Alignments fill in automatically, and a new edit box opens on the next line. Note the X in the red circle in the Has Wave and Has Alignments columns. These notations indicate that the transcription has no audio yet.
- Type the next transcription, "Order a pizza now", and press the down arrow key.
- Type the next transcription, "and we'll have it waiting when you arrive.", and press the down arrow key.
- Continue adding the following transcriptions:
Say Cancel at any time to cancel this order.
We have small, medium and large sizes.
What size would you like?
Please say your telephone number, area code first.
Please call again.
You ordered a large pizza.
and your telephone number is
Is that correct?
Thanks for your order.
Give us your phone number when you arrive
and we'll give you your pizza.
Your order has been canceled. Goodbye.
At this point the Transcription pane looks like the following illustration.
Note: In the Transcription column, certain punctuation in the prompt text does not display: commas, periods and question marks, for instance. This punctuation, however, ensures that the recordings sound right for the context. In the preceding transcriptions, for example, the comma between small and medium indicates to the person recording the transcription that there should be a slight pause between the words to make the recording sound natural. Similarly, the apostrophe in we'll distinguishes it from well. Written text provided to the person recording the transcriptions needs to include all punctuation.
- Right-click the Prompts.promptdb tab and select Save Prompts.promptdb to save the PizzaPrompts project.
Creating Extractions
The next step is to create extractions. An extraction can be either an entire phrase or a small segment of a phrase that can be joined with other segments to create whole phrases. Extractions can therefore be combined into phrases that do not appear in any single transcription. To create an extraction, place square brackets ([]) around the word or phrase in the Transcription pane.
To create an extraction
- In the upper pane, widen the Transcription column so that the entire text of the welcome prompt is visible.
- Double-click the transcription text, and at the very start of the welcome prompt, type an opening square bracket ([).
- Move the cursor to the very end of the welcome prompt and type a closing square bracket (]).
- Move the cursor to the next transcription and press Enter.
The Extraction column in the lower pane now displays the new extraction, which is enclosed in brackets in the prompt text in the Transcription column in the Transcription pane.
Tip Each transcription can have a user-defined identifier in the Extraction ID column for maintaining and updating prompt databases. See the section Creating Extractions in Entering Transcriptions for more information.
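The bracket convention can also be illustrated outside the tool. The following Python sketch is not part of the SASDK and does not reflect how Speech Prompt Editor is implemented. It assumes that an extraction is any non-nested run of text between square brackets, and the function names parse_extractions and display_text are hypothetical.

```python
import re
from typing import List

# Assumption: an extraction is any non-nested run of text between '[' and ']'.
_EXTRACTION = re.compile(r"\[([^\[\]]+)\]")


def parse_extractions(transcription: str) -> List[str]:
    """Return the bracketed extractions in a transcription, in order."""
    return [text.strip() for text in _EXTRACTION.findall(transcription)]


def display_text(transcription: str) -> str:
    """Return the transcription with the bracket markers removed."""
    return transcription.replace("[", "").replace("]", "")


# The sizes transcription, marked up with "large" deliberately left out.
sizes = "[We have] [small], [medium] [and] large [sizes]."
print(parse_extractions(sizes))  # ['We have', 'small', 'medium', 'and', 'sizes']
print(display_text(sizes))       # We have small, medium and large sizes.
```

Because large sits outside every pair of brackets, it appears in the display text but not in the list of extractions.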
To enter the remaining extractions
Repeat the previous procedure to mark the entire transcription for the following three extractions:
Order a pizza now
and we'll have it waiting when you arrive.
Say Cancel at any time to cancel this order.
In the next transcription, create several separate extractions, starting with We have, and continuing with small, medium, and, and sizes. Enclose each of those words or phrases in square brackets. Later, the application uses those separate extractions to confirm the pizza choices to the caller.
Note The previous step deliberately omits large. The tutorial later uses that missing extraction to demonstrate how to check prompt coverage.
Similarly, create the following extractions by enclosing the prompt text in brackets.
What size would you like
say your telephone number, area code first
Please, from the transcription Please call again.
You ordered a
pizza (Skip large in the transcription here as well.)
and your telephone number is
Is that correct
Thanks for your order
Give us your phone number when you arrive
and we'll give you your pizza
Your order has been canceled. Goodbye.
Tip It is best to take the extraction for Please from the transcription Please call again. If the extraction were to come from the transcription Please say your telephone number, area code first, which has two sibilants (the s sounds in please and say) next to each other, those sibilants could run together in the recording. This technique is useful where transcriptions have a sound at the end of a word and the same sound repeated at the start of the next word.
- Right-click the Prompts.promptdb tab and select Save Prompts.promptdb from the context menu to save the PizzaPrompts project.
The Transcription and Extraction panes now look like the following illustration.
With the extractions from those transcriptions, the application can respond with prompts that were never recorded as whole phrases. In response to the caller's choice of medium, for instance, the application can prompt the caller with the phrase You ordered a medium pizza. Is that correct?
Recording Prompts
Now that all transcriptions and extractions are ready, it is possible to record the audio data for each transcription. Make sure that a microphone is connected to the computer and configured for recording.
To record the audio data for each transcription
- Select the first transcription.
- On the Speech Prompt Editor toolbar, click Record All (the red circle), or press SHIFT+ALT+E.
The Recording Tool appears, displaying the text of the first transcription.
- In the Recording Tool, click Record.
- Speak the Display Text that appears in the Recording Tool window.
- Click Stop.
- Wait for the audio to process. The speech recognition (SR) engine processes the audio and creates alignments. Alignments mark the end of each word in the audio data.
When the SR engine finishes processing, a small wave icon with an asterisk appears in the Has Wave column of the Transcription pane. Also, a green check mark appears in the Has Alignments column. If the green check mark does not appear, alignment failed. If a failure occurs, re-record. The Transcription pane now looks like the following illustration.
- Press Play to check the recording quality. If necessary, press Record to re-record the transcription.
- Click Next to continue to the next transcription. Continue recording until audio data and alignments are present for each transcription.
- Right-click the Prompts.promptdb tab and select Save Prompts.promptdb from the context menu to save the PizzaPrompts project.
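The role of alignments in the procedure above can be pictured with a small sketch. The following Python code is illustrative only: the timing values are invented, the function name extraction_span is hypothetical, and the idea that an extraction's audio is cut at word boundaries is an assumption rather than a documented description of the prompt engine. It takes the end time of each word and returns the time span that an extraction would occupy in the recording.

```python
from typing import List, Optional, Tuple


def extraction_span(words: List[str],
                    word_ends: List[float],
                    extraction: List[str]) -> Optional[Tuple[float, float]]:
    """Return (start, end) in seconds of the first occurrence of `extraction`
    within `words`, or None if it does not occur.

    word_ends[i] is the time at which words[i] ends. A word is assumed to
    start where the previous word ends (0.0 for the first word).
    """
    count = len(extraction)
    if count == 0:
        return None
    lowered = [w.lower() for w in words]
    target = [w.lower() for w in extraction]
    for i in range(len(words) - count + 1):
        if lowered[i:i + count] == target:
            start = word_ends[i - 1] if i > 0 else 0.0
            return (start, word_ends[i + count - 1])
    return None


# Hypothetical alignment for the transcription "Please call again".
words = ["Please", "call", "again"]
ends = [0.42, 0.80, 1.25]
print(extraction_span(words, ends, ["Please"]))         # (0.0, 0.42)
print(extraction_span(words, ends, ["call", "again"]))  # (0.42, 1.25)
```

Under these assumptions, a failed alignment leaves no word boundaries to cut at, which is consistent with the instruction to re-record when the green check mark does not appear.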
Building the Prompts File
Once prompts are specified and recorded, build the prompts file. Building the prompts file compiles extractions and audio data from the .promptdb file into a single file that the Microsoft prompt engine uses. This is a .prompts file, which is added to the Web location for the application.
- To build, on the Build menu choose Build Solution.
Messages appear in the Output window indicating that the audio data is processing. These messages might appear only on the first build, or when Rebuild Solution is selected on the Build menu, because the processed audio data is saved to make later builds faster. If an error occurs, an error message appears in the Output window.
Building the .prompts file makes the prompts available to the application. If the application tries to speak a prompt and cannot find the entire prompt, either as a single extraction or as a combination of several extractions, in the database, a synthesized (text-to-speech) voice is used instead. The next section demonstrates how to make sure the application has full prompt coverage.
Validating Prompt Coverage
In the Creating Extractions section, the word large was deliberately left out of the extractions. Because no extraction contains it, the prompt engine has no recording to use when the application attempts to say a prompt with the word large in it, and the prompt uses the TTS engine instead. For example, the TTS engine would play the following confirmation:
You ordered a large pizza.
The Validate Tool helps catch omissions like this before the application is deployed.
To open the Validate Tool
- On the View menu, click Prompt Validation.
The Prompt Validation window contains an empty edit box and a toolbar with several buttons. To validate prompts manually, type prompt text in the edit box, and click Do Validate on the Prompt Validation toolbar.
A Prompt Validation Example
- Type the following two lines of text in the Prompt Validation edit box:
You ordered a medium pizza.
You ordered a large pizza.
- Click Do Validate, the first button on the toolbar.
The project saves and builds, and the Prompt Validation Results window appears.
- To expand the results tree, click the "+" symbol next to Prompt Validation Results.
The tree expands into two branches. The word "large" appears in red as shown in the following illustration, because it was not found in the database.
- Click Next Error on the toolbar in the Prompt Validation Results window.
The sentence with "large" is highlighted.
- On the Prompt Validation Results toolbar, click Add to database.
This creates a new transcription in the Transcription pane and a new extraction in the Extraction pane. Now it is possible to record the transcription, and then try the validation again. See the previous section for instructions on recording prompts. If you record the transcription, be sure to save the prompt database afterwards.
- To hear a prompt with all the extractions put together, select the prompt in the Prompt Validation Results window, and click Play Output on the toolbar.
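The kind of check that prompt validation performs can be approximated in a few lines. The following Python sketch is a simplified model, not the Validate Tool's actual algorithm. It assumes that matching ignores case and punctuation and that the longest matching extraction is taken at each position. The names tokens and check_coverage are hypothetical.

```python
import re
from typing import List, Set, Tuple


def tokens(text: str) -> Tuple[str, ...]:
    """Lowercase words, ignoring punctuation other than apostrophes."""
    return tuple(re.findall(r"[a-z0-9']+", text.lower()))


def check_coverage(prompt: str, extractions: List[str]) -> List[Tuple[str, bool]]:
    """Split a prompt into (text, recorded) segments.

    At each position the longest matching extraction is taken. A word that
    starts no extraction is reported on its own as not recorded.
    """
    known: Set[Tuple[str, ...]] = {tokens(e) for e in extractions if tokens(e)}
    longest = max((len(k) for k in known), default=0)
    words = tokens(prompt)
    segments: List[Tuple[str, bool]] = []
    i = 0
    while i < len(words):
        for size in range(min(longest, len(words) - i), 0, -1):
            if words[i:i + size] in known:
                segments.append((" ".join(words[i:i + size]), True))
                i += size
                break
        else:
            # No extraction starts here: this word would need TTS.
            segments.append((words[i], False))
            i += 1
    return segments


extractions = ["You ordered a", "small", "medium", "pizza", "Is that correct"]
print(check_coverage("You ordered a medium pizza.", extractions))
# [('you ordered a', True), ('medium', True), ('pizza', True)]
print(check_coverage("You ordered a large pizza.", extractions))
# [('you ordered a', True), ('large', False), ('pizza', True)]
```

The segment marked False corresponds to the word shown in red in the Prompt Validation Results window. Greedy longest-match is a simplification: an exhaustive search could find a full combination in cases where the greedy choice blocks a later match.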
Specifying Common Speech Control Settings
In the tutorial application, all the prompts are in one database, Prompts.promptdb, which creates a single prompts file in the Web directory, PizzaPrompts.prompts. Each Speech QA control in the application needs to indicate the source of its prompts, in this case PizzaPrompts.prompts. You specify this common source with the SpeechControlSettings control, which specifies common settings for the controls in an application.
To specify PizzaPrompts.prompts for all QA controls
- Open Default.aspx.
- On the Toolbox, click the Speech tab, and drag a SpeechControlSettings control onto the canvas.
- In the Properties pane, change the ID of the control to QASpeechControlSettings in the ID edit box.
- In the Properties pane for the control, select Items.
- Click the ellipsis button (...) to display the SpeechControlSettingsItem Collection Editor.
- Click Add to add a SpeechControlSettingsItem.
- Click the + button next to QA to expand the QA items.
- In the QA items, click the + button next to Prompt to expand the Prompt items.
- Select the PromptDatabases item and click the ellipsis button (...) to display the PromptDatabase Collection Editor.
- Click Add to add a PromptDatabase.
- Under Speech in the PromptDatabase Properties, click Source and insert the URL for the PizzaPrompts.prompts database in your application.
By default in the tutorial, the URL is Prompts/PizzaPrompts.prompts.
- Click OK twice to exit the collection editors.
Now specify this Speech Control setting for all the QA controls in the application.
To specify the Speech Control setting in the QA controls
- Right-click WelcomeQA, and select Property Builder.
- In the treeview select QA, then select SpeechControlSettingsItem1 from the Speech Settings Item drop-down list.
- Click OK.
- Repeat steps 1 through 3 for all the QA controls.
- Right-click the Default.aspx tab, and select Save Default.aspx from the context menu.
The application now gets the audio for all prompts for all QA controls from the recordings in PizzaPrompts.prompts. You can check the prompt playback now.
To check the prompt playback
- On the Debug menu select Start. SASDK first builds the application, and then opens Speech Debugging Console and Telephony Application Simulator (TASim), with TASim on top.
- On TASim, click Dial. Through your headset or speakers, the application plays back the prompts in all the QA controls, one after the other, from the prompt database.
- In TASim, click Hang Up, then click Exit in the warning dialog box.
- Click the Close button (x) at the top right corner of the Speech Debugging Console to close it.
This playback with recorded prompts was probably a more satisfactory experience than the TTS prompts earlier.
Note If you did not record a transcription containing an extraction with "large," which was deliberately omitted in the previous section, the TTS engine will play some prompts. If you did not record it, you should record it now.
To | See |
---|---|
Go to the next step | Adding Semantic Information |
Get more information on Speech Prompt Editor | Prompting the User |
Get more information on entering transcriptions | Entering Transcriptions |
Get more information on validating prompts | Validating Prompts |