Menus and Lists


This topic discusses how menus and lists should be handled in voice user interface (VUI) design.

Menus provide users with a list of choices. In a VUI, these choices are usually limited to navigation between areas of functionality or choices between services.

Because speech is linear, the number of items in a VUI menu should be limited. Callers generally cannot hold more than three or four options in mind at one time. Lengthy descriptions of menu items may reduce that number, because human short-term memory is also limited in duration: if the description of each choice is too long, it can push other details out of the caller's short-term memory.

For this reason, an average of three choices is advisable. Grouping choices in some memorable way can increase the usable number to four or five. For example, in the five-item list "Breakfast, Lunch, Dinner, Sunday Brunch, or Midnight Snack," the three items "Breakfast, Lunch, Dinner" form a familiar group that can be recalled as a single memory item.

An example of a three-item menu:

SYSTEM: Thanks for calling the Tailspin Toys customer hotline, where you can PLACE AN ORDER, CHECK ON YOUR ORDER, or FIND A LOCATION for the Tailspin store nearest you. Tell me which you'd like to do.
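The menu guidance above can be expressed as a small prompt builder. This is only a minimal sketch: the function name `build_menu_prompt`, the constant `MAX_MENU_ITEMS`, and the exact wording of the generated sentence are assumptions made for illustration, not part of any speech platform API.

```python
MAX_MENU_ITEMS = 4  # assumed ceiling, based on the "three or four options" guidance


def build_menu_prompt(greeting: str, choices: list[str]) -> str:
    """Join menu choices into one spoken prompt, rejecting over-long menus."""
    if not choices:
        raise ValueError("a menu needs at least one choice")
    if len(choices) > MAX_MENU_ITEMS:
        raise ValueError(
            f"{len(choices)} choices exceeds the limit of {MAX_MENU_ITEMS}"
        )
    # Upper case marks the phrases the caller is expected to say back.
    spoken = [choice.upper() for choice in choices]
    if len(spoken) == 1:
        listing = spoken[0]
    elif len(spoken) == 2:
        listing = f"{spoken[0]} or {spoken[1]}"
    else:
        listing = ", ".join(spoken[:-1]) + f", or {spoken[-1]}"
    return f"{greeting} You can {listing}. Tell me which you'd like to do."


prompt = build_menu_prompt(
    "Thanks for calling the Tailspin Toys customer hotline.",
    ["place an order", "check on your order", "find a location"],
)
```

Rejecting a fifth choice outright is one possible design; a real application might instead split a long menu into grouped submenus.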

Lists

To date, there are no set rules about how lists should be treated in VUI design. There are a handful of strategies, each with its own advantages and drawbacks. The following sections contain a brief summary of some commonly used approaches.

Coasting Lists

A coasting list uses a machine-driven approach to move automatically from item to item without relying on the user to issue commands. Coasting lists are the audio equivalent of a slide show. Typically, a header introduces the list and tells the listener how many items to expect. In most cases, each item begins with a marker or signposting phrase that tells the user where the item falls in the list.

Here is an example of a coasting list:

SYSTEM: I found three items. If you hear the one you want, say YES.
Here's the first one:
[Item 1]
Here's the next one:
[Item 2]
This is the last one:
[Item 3]

Usually, additional navigational instructions follow the list.

SYSTEM: To hear the list again, say START OVER. You can stop the list at any time by saying STOP. You can also say NEXT, PREVIOUS, FIRST or LAST. If you hear the item you want, say YES.
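The header, marker phrases, and trailing instructions of a coasting list can be assembled mechanically. The sketch below is illustrative only: `coasting_prompts` is a hypothetical helper, the item count is rendered as a digit on the assumption that a text-to-speech engine will read it aloud, and the empty-list and single-item wordings are invented for completeness.

```python
def coasting_prompts(items: list[str]) -> list[str]:
    """Return the prompts of a coasting list in playback order."""
    count = len(items)
    if count == 0:
        return ["I didn't find any items."]  # assumed wording
    plural = "s" if count != 1 else ""
    prompts = [
        f"I found {count} item{plural}. If you hear the one you want, say YES."
    ]
    for index, item in enumerate(items):
        # The marker phrase tells the listener where the item sits in the list.
        if count == 1:
            marker = "Here it is:"  # assumed wording
        elif index == 0:
            marker = "Here's the first one:"
        elif index == count - 1:
            marker = "This is the last one:"
        else:
            marker = "Here's the next one:"
        prompts.append(marker)
        prompts.append(item)
    # Navigational instructions follow the list itself.
    prompts.append("To hear the list again, say START OVER.")
    return prompts


sequence = coasting_prompts(["Item 1", "Item 2", "Item 3"])
```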

This type of list is best for users who want to casually browse a larger list, hearing all the items before making a choice.

One drawback of this scheme is that users may say YES only after the item that caught their interest has finished playing. In other words, by the time the user makes a selection, the list has already moved on to the next item. For this reason, the length of the pause between spoken items is important.
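Besides tuning the pause between items, an application could attribute a slightly late YES to the previous item. The following is a sketch of that idea under stated assumptions: the `resolve_yes` function, the timestamp representation, and the 0.75-second grace window are all hypothetical and would need tuning against real caller behavior.

```python
def resolve_yes(
    item_start_times: list[float], yes_time: float, grace: float = 0.75
) -> int:
    """Return the index of the item a YES most likely refers to.

    item_start_times[i] is the time, in seconds, at which item i began
    playing (ascending order). A YES heard less than `grace` seconds after
    an item starts is attributed to the item before it.
    """
    if not item_start_times or yes_time < item_start_times[0]:
        raise ValueError("YES was heard before any item started playing")
    current = 0
    for index, start in enumerate(item_start_times):
        if start <= yes_time:
            current = index  # latest item that had started by yes_time
    if current > 0 and yes_time - item_start_times[current] < grace:
        return current - 1  # late reaction to the previous item
    return current


starts = [0.0, 4.0, 8.0]
late_choice = resolve_yes(starts, 4.3)    # just after item 1 started
normal_choice = resolve_yes(starts, 5.5)  # well into item 1
```

Here `late_choice` resolves to index 0 and `normal_choice` to index 1.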

User-driven Lists

User-driven lists rely on navigational commands from the user to advance from one item to the next. Typically, user-driven lists also begin with a description of the number of items on the list followed by a summary of the first item. Note the placement of the navigational commands after the first item has been read.

SYSTEM: I found three items. If you hear the one you want, say YES.
Here's the first one:
[Item 1]
[Brief Pause]
Say NEXT, PREVIOUS, FIRST or LAST to move through the list. If you hear the item you want, say YES.

User-driven lists are best suited to short lists in which users are likely to decide on each item before moving on to the next one. A list of e-mails might be a good candidate for this form of list management.

An advantage to this design is the placement of the help prompt after the first list item. Expert users can barge in and need not wait for instructions before working with the list items.
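The navigation commands of a user-driven list map naturally onto a small state machine. This is a minimal sketch, assuming recognized commands arrive as plain strings; the `ListNavigator` class and its boundary messages ("That was the last item.") are hypothetical and not taken from any speech platform.

```python
HELP_PROMPT = (
    "Say NEXT, PREVIOUS, FIRST or LAST to move through the list. "
    "If you hear the item you want, say YES."
)


class ListNavigator:
    """Tracks the caller's position in a user-driven list."""

    def __init__(self, items: list[str]) -> None:
        if not items:
            raise ValueError("the list must contain at least one item")
        self.items = list(items)
        self.position = 0
        self.selected = None

    def handle(self, command: str) -> str:
        """Apply one recognized command and return the next thing to say."""
        command = command.strip().upper()
        last = len(self.items) - 1
        if command == "YES":
            self.selected = self.items[self.position]
            return f"Selected: {self.selected}"
        if command == "NEXT":
            if self.position == last:
                return "That was the last item."  # assumed wording
            self.position += 1
        elif command == "PREVIOUS":
            if self.position == 0:
                return "That was the first item."  # assumed wording
            self.position -= 1
        elif command == "FIRST":
            self.position = 0
        elif command == "LAST":
            self.position = last
        else:
            return HELP_PROMPT  # unrecognized command: replay the instructions
        return self.items[self.position]


nav = ListNavigator(["Item 1", "Item 2", "Item 3"])
heard = [nav.handle(c) for c in ("next", "last", "next", "first", "yes")]
```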

Numbered Lists

Numbered lists present a group of items, each preceded by a number. This scheme is suited to items that may share identical titles or labels. Imagine browsing for a DVD of a popular film. There are three versions available: the basic disc, the director's cut, and a boxed set. A number precedes each item in the list to differentiate it from the others.

For example:

SYSTEM: I found three releases that match the title you're looking for. If you recognize the one you want, say the number. Here's the list:
Number one:
[MovieTitle]
Number two:
[MovieTitle—The Director's Cut]
Number three:
[MovieTitle—The Trilogy, Boxed Set]
If you heard the one you want, say the number. To hear the list again, say START OVER.
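A numbered list needs two pieces: prompts that pair each item with a spoken number, and a way to map the caller's reply back to an item. The sketch below assumes the recognizer returns the reply as text; `numbered_prompts`, `pick_by_number`, and the five-word `NUMBER_WORDS` table are hypothetical names introduced for illustration.

```python
NUMBER_WORDS = ["one", "two", "three", "four", "five"]  # assumed upper bound


def numbered_prompts(items: list[str]) -> list[str]:
    """Return alternating "Number N:" markers and item titles."""
    if len(items) > len(NUMBER_WORDS):
        raise ValueError("too many items for a spoken numbered list")
    prompts = []
    for word, item in zip(NUMBER_WORDS, items):
        prompts.append(f"Number {word}:")
        prompts.append(item)
    return prompts


def pick_by_number(items: list[str], utterance: str):
    """Return the item matching a reply such as "two" or "number two"."""
    word = utterance.strip().lower()
    if word.startswith("number "):
        word = word[len("number "):]
    if word in NUMBER_WORDS:
        index = NUMBER_WORDS.index(word)
        if index < len(items):
            return items[index]
    return None  # not a valid number for this list


releases = ["MovieTitle", "MovieTitle, The Director's Cut", "MovieTitle, Boxed Set"]
spoken = numbered_prompts(releases)
choice = pick_by_number(releases, "Number two")
```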

Adaptation

Adaptive systems change their behavior over time, based on observed values. Typically, tracked data from user interactions causes the system to adapt the prompting style. Adaptation can occur throughout the system, or on a more localized node-by-node basis.

The adaptive opportunities vary depending on the nature of the tracking data and how much is known about individual users. The two key dimensions for adaptation are in-session and cross-session tracking.

In-Session Adaptation

In-session adaptation is the most common form of adaptation in systems today. It can add context to error handling by reflecting the number of times a user has remained silent or said something that is out of grammar. When systems adapt in this way, prompts become progressively more directed as the user repeatedly fails to negotiate a particular dialogue.
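Progressively more directed prompting can be modeled as a per-error counter that indexes into a list of escalating prompts. This is a sketch only: the `ErrorTracker` class, the error type names, and every prompt string here are assumptions made for illustration.

```python
# Hypothetical prompts, ordered from least to most directed.
ESCALATING_PROMPTS = {
    "silence": [
        "Sorry, I didn't hear you. Who should I invite?",
        "I still didn't hear anything. Please say the name of one person.",
        "Say a first and last name, for example, 'Pat Smith'.",
    ],
    "no_match": [
        "Sorry, I didn't catch that. Who should I invite?",
        "I still didn't understand. Please say the name of one person.",
        "Say a first and last name, for example, 'Pat Smith'.",
    ],
}


class ErrorTracker:
    """Counts errors within one session and picks the matching prompt."""

    def __init__(self) -> None:
        self.counts = {error_type: 0 for error_type in ESCALATING_PROMPTS}

    def next_prompt(self, error_type: str) -> str:
        prompts = ESCALATING_PROMPTS[error_type]
        # Stay on the most directed prompt once the list is exhausted.
        level = min(self.counts[error_type], len(prompts) - 1)
        self.counts[error_type] += 1
        return prompts[level]


tracker = ErrorTracker()
silence_prompts = [tracker.next_prompt("silence") for _ in range(4)]
```

Silence and out-of-grammar errors are counted separately here; a design that escalates on the combined count would be equally plausible.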

In addition, prompts may adapt as the user circles back over the same dialogue. This type of adaptation is called tracked entry. Introductory and instructional prompts are not played the second time through a dialogue or task unless the user explicitly asks for them, so context is maintained.

First time through:

SYSTEM: I can notify potential attendees about your meeting proposal. Who should I invite?

Subsequent times through:

SYSTEM: Who else should be there?
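Tracked entry amounts to remembering whether a dialogue has already been entered during the current session. The sketch below uses the two prompts from the example above; the `TrackedEntry` class and its `want_instructions` flag are hypothetical names.

```python
class TrackedEntry:
    """Chooses between a full prompt and a shortened one for a dialogue."""

    def __init__(self, first_prompt: str, repeat_prompt: str) -> None:
        self.first_prompt = first_prompt
        self.repeat_prompt = repeat_prompt
        self.entries = 0  # times this dialogue was entered in the session

    def prompt(self, want_instructions: bool = False) -> str:
        first_time = self.entries == 0
        self.entries += 1
        # The full prompt plays on first entry or when explicitly requested.
        if first_time or want_instructions:
            return self.first_prompt
        return self.repeat_prompt


invite = TrackedEntry(
    "I can notify potential attendees about your meeting proposal. "
    "Who should I invite?",
    "Who else should be there?",
)
first = invite.prompt()
second = invite.prompt()
on_request = invite.prompt(want_instructions=True)
```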

Cross-Session Adaptation

Cross-session adaptation is used in personalized systems to shorten prompts as a known user becomes more familiar with the system. Speech systems might tailor prompts and tips based on the number of times a user traverses a particular node. Tips or help prompts are suppressed once the user has traversed a particular area of the application repeatedly. Conversely, they can be designed to come back to life if the user does not visit the node for a long time.
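The suppress-then-revive behavior can be reduced to a single decision based on stored per-user data. This is a minimal sketch, assuming the application persists a visit count and a last-visit timestamp for each node; `should_play_tip` and both thresholds are hypothetical values chosen for illustration.

```python
from datetime import datetime, timedelta
from typing import Optional

TIP_VISIT_LIMIT = 3                # assumed: visits after which tips stop
REVIVE_AFTER = timedelta(days=90)  # assumed: absence that brings tips back


def should_play_tip(
    visit_count: int, last_visit: Optional[datetime], now: datetime
) -> bool:
    """Decide whether to play the tip for a node on this visit."""
    if last_visit is None or visit_count < TIP_VISIT_LIMIT:
        return True  # new or still-inexperienced user
    # Experienced user: revive the tip only after a long absence.
    return now - last_visit >= REVIVE_AFTER


now = datetime(2024, 6, 1)
new_user = should_play_tip(0, None, now)
regular_user = should_play_tip(5, now - timedelta(days=1), now)
returning_user = should_play_tip(5, now - timedelta(days=120), now)
```

With these thresholds, the new user and the long-absent user hear the tip, while the regular user does not.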

Other uses of this strategy can aid in discoverability. The system can deliver prompts such as "You've been using X for six months and you haven't tried Y. Would you like to know more about it?"

See Also

Dialogue Organization