Disclosure design patterns for synthetic voices

Now that you've determined the right level of disclosure for your text to speech avatar experience, it's a good time to explore potential design patterns.

Overview

There's a spectrum of disclosure design patterns you can apply to your synthetic voice experience. If the outcome of your disclosure assessment was 'High Disclosure', we recommend explicit disclosure, which means communicating the origins of the synthetic voice outright. Implicit disclosure includes cues and interaction patterns that benefit voice experiences whether required disclosure levels are high or low.

Spectrum of disclosure patterns

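Category Examples
Explicit disclosure patterns
  • Transparent Introduction
  • Verbal transparent introduction
  • Explicit Byline
  • Customization and calibration
  • Parental Disclosure
  • Providing opportunities to learn more about how the voice was made
Implicit disclosure patterns
  • Implicit cues and feedback
  • Capability disclosure
  • Conversational Transparency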

Use the following chart to refer directly to the patterns that apply to your synthetic voice. Some of the other conditions in this chart may also apply to your scenario:

If your synthetic voice experience… | Recommendations
Requires High Disclosure | Use at least one explicit pattern and implicit cues up front to help users build associations.
Requires Low Disclosure | Disclosure may be minimal or unnecessary, but could benefit from some implicit patterns.
Has a high level of engagement | Build for the long term and offer multiple entry points to disclosure along the user journey. It is highly recommended to have an onboarding experience.
Includes children as the primary intended audience | Target parents as the primary disclosure audience and ensure that they can effectively communicate disclosure to children.
Includes blind users or people with low vision as the primary intended audience | Be inclusive of all users and ensure that any form of visual disclosure has associated alternative text or sound effects. Adhere to accessibility standards for contrast ratio and display size. Use auditory cues to communicate disclosure.
Is screen-less, device-less or uses voice as the primary or only mode of interaction | Use auditory cues to communicate disclosure.
Potentially includes multiple users/listeners (e.g., a personal assistant in a multi-person household) | Be mindful of various user contexts and levels of understanding and offer multiple opportunities for disclosure in the user journey.
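
The chart above can also be read as a simple lookup from scenario traits to recommendations. The following is a minimal Python sketch of that reading, not an official tool: the `Scenario` class, its field names, and the recommendation strings are all hypothetical and only paraphrase the rows of the chart.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Scenario:
    """Traits of a synthetic voice experience (hypothetical field names)."""
    high_disclosure: bool = False
    high_engagement: bool = False
    child_audience: bool = False
    low_vision_audience: bool = False
    voice_only: bool = False
    multiple_listeners: bool = False


def recommendations(s: Scenario) -> List[str]:
    """Collect the chart rows that apply to the given scenario."""
    recs = []
    if s.high_disclosure:
        recs.append("at least one explicit pattern plus implicit cues up front")
    else:
        recs.append("implicit patterns; explicit disclosure may be minimal")
    if s.high_engagement:
        recs.append("onboarding experience with multiple entry points to disclosure")
    if s.child_audience:
        recs.append("parental disclosure")
    if s.low_vision_audience or s.voice_only:
        recs.append("auditory cues")
    if s.low_vision_audience:
        recs.append("alternative text or sound effects for visual disclosure")
    if s.multiple_listeners:
        recs.append("multiple opportunities for disclosure along the user journey")
    return recs


# A voice-only experience that requires High Disclosure.
for rec in recommendations(Scenario(high_disclosure=True, voice_only=True)):
    print(rec)
```

Several rows can apply at once, which is why the function accumulates recommendations instead of returning the first match.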

Explicit disclosure

If your synthetic voice experience requires High Disclosure, it's best to use at least one of the following explicit patterns to clearly state the synthetic nature.

Transparent Introduction

Before the voice experience begins, introduce the digital assistant by being fully transparent about the origins of its voice and its capabilities. The optimal moment to use this pattern is when onboarding a new user or when introducing new features to a returning user. Implementing implicit cues during an introduction helps users form a mental model about the synthetic nature of the digital agent.

First-time user experience

Transparent introduction during first run experience
The synthetic voice is introduced while onboarding a new user.

Recommendations

  • Describe that the voice is artificial (e.g., "digital")
  • Describe what the agent is capable of doing
  • Explicitly state the voice's origins
  • Offer an entry point to learn more about the synthetic voice

Returning user experience

If a user skips the onboarding experience, continue to offer entry points to the Transparent Introduction experience until the user triggers the voice for the first time.

Transparent introduction during return user experience
Provide a consistent entry point to the synthetic voice experience. Allow the user to return to the onboarding experience when they trigger the voice for the first time at any point in the user journey.
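
One way to reason about the returning user flow is as two small checks on per-user state. The sketch below is an illustration under assumptions: `UserState` and both function names are hypothetical, and a real product would persist this state in its own user profile store.

```python
from dataclasses import dataclass


@dataclass
class UserState:
    """Hypothetical per-user flags tracked by the application."""
    intro_seen: bool = False           # Transparent Introduction was shown
    has_triggered_voice: bool = False  # user has used the voice at least once


def should_offer_intro_entry_point(user: UserState) -> bool:
    # Keep surfacing an entry point until the user has either seen the
    # introduction or triggered the voice for the first time.
    return not user.intro_seen and not user.has_triggered_voice


def should_play_intro_on_trigger(user: UserState) -> bool:
    # On the first voice trigger, route users who skipped onboarding
    # back through the introduction.
    return not user.intro_seen and not user.has_triggered_voice


skipped = UserState(intro_seen=False, has_triggered_voice=False)
if should_play_intro_on_trigger(skipped):
    skipped.intro_seen = True
skipped.has_triggered_voice = True
```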

Verbal transparent introduction

A spoken prompt stating the origins of the digital assistant's voice is explicit enough on its own to achieve disclosure. This pattern is best for High Disclosure scenarios where voice is the only mode of interaction available.

Verbally spoken transparent introduction
Use a transparent introduction when there are moments in the user experience where you might already introduce or attribute a person's voice.

Verbally spoken transparent introduction in first person
For additional transparency, the voice actor can disclose the origins of the synthetic voice in the first person.
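
As an illustration, a spoken introduction can be assembled as text and then wrapped in SSML for a text to speech engine. This is a sketch under assumptions: the wording of the prompts, the function names, the assistant name, and the default voice name are placeholders, not recommended or required values.

```python
from xml.sax.saxutils import escape


def disclosure_intro(assistant_name: str, voice_source: str,
                     first_person: bool = False) -> str:
    """Build a spoken disclosure prompt (the wording is hypothetical)."""
    if first_person:
        # Spoken from the perspective of the person the voice is based on.
        return (f"Hi, this is {voice_source}. The voice you are hearing "
                f"is a synthetic version of my voice.")
    return (f"Hello, I'm {assistant_name}, a digital assistant. My voice is "
            f"synthetic and was generated from recordings of {voice_source}.")


def to_ssml(text: str, voice_name: str = "en-US-JennyNeural",
            lang: str = "en-US") -> str:
    """Wrap the prompt in a minimal SSML document; the voice name is an assumption."""
    return (
        f'<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" '
        f'xml:lang="{lang}">'
        f'<voice name="{voice_name}">{escape(text)}</voice>'
        f'</speak>'
    )


ssml = to_ssml(disclosure_intro("Contoso Assistant", "a voice actor"))
print(ssml)
```

Escaping the prompt text keeps characters such as `&` or `<` from breaking the SSML markup.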

Explicit Byline

Use this pattern if the user will be interacting with an audio player or interactive component to trigger the voice.

Explicit byline in a news media scenario
An explicit byline is the attribution of where the voice came from.

Recommendations

  • Offer an entry point to learn more about the synthesized voice

Customization and calibration

Provide users with control over how the digital assistant responds to them (i.e., how the voice sounds). When a user interacts with a system on their own terms and with specific goals in mind, then by definition, they have already understood that it's not a real person.

User Control

Offer choices that have a meaningful and noticeable impact on the synthetic voice experience.

User preferences
User preferences allow users to customize and improve their experience.

Recommendations

  • Allow users to customize the voice (e.g., select language and voice type)
  • Provide users a way to teach the system to respond to their unique voice (e.g., voice calibration, custom commands)
  • Optimize for user-generated or contextual interactions (e.g., reminders)

Persona Customization

Offer ways to customize the digital assistant's voice. If the voice is based on a celebrity or a widely recognizable person, consider using both visual and spoken introductions when users preview the voice.

Voice customization
Offering the ability to select from a set of voices helps convey the artificial nature.

Recommendations

  • Allow users to preview the sound of each voice
  • Use an authentic introduction for each voice
  • Offer entry points to learn more about the synthesized voice

Parental Disclosure

In addition to complying with COPPA regulations, provide disclosure to parents if your primary intended audience is young children and your exposure level is high. For sensitive uses, consider gating the experience until an adult has acknowledged the use of the synthetic voice. Encourage parents to communicate the disclosure message to their children.

Disclosure for parents
A transparent introduction optimized for parents ensures that an adult was made aware of the synthetic nature of the voice before a child interacts with it.

Recommendations

  • Target parents as the primary audience for disclosure
  • Encourage parents to communicate disclosure to their children
  • Offer entry points to learn more about the synthesized voice
  • Gate the experience by asking parents a simple "safeguard" question to show they have read the disclosure
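
The gating recommendation above can be sketched as a small guard object that blocks the voice experience until the safeguard question is answered correctly. The `ParentalGate` class, its method names, and the sample question are hypothetical; this is one possible shape, not a prescribed implementation.

```python
class ParentalGate:
    """Blocks the voice experience until an adult answers a safeguard question."""

    def __init__(self, question: str, expected_answer: str) -> None:
        self.question = question
        self._expected = expected_answer.strip().lower()
        self.acknowledged = False

    def submit(self, answer: str) -> bool:
        # Compare case-insensitively and ignore surrounding whitespace.
        if answer.strip().lower() == self._expected:
            self.acknowledged = True
        return self.acknowledged

    def start_voice_experience(self) -> str:
        if not self.acknowledged:
            raise PermissionError(
                "An adult must acknowledge the disclosure first.")
        return "voice experience started"


# Hypothetical safeguard question shown after the disclosure text.
gate = ParentalGate("Is the voice in this app a real person? (yes/no)", "no")
gate.submit("No")
print(gate.start_voice_experience())
```

The question should only be answerable by someone who has read the disclosure, so its answer should come from the disclosure text itself.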

Providing opportunities to learn more about how the voice was made

Offer context-sensitive entry points to a page, pop-up, or external site that provides more information about synthetic voice technology. For example, you could surface a link to learn more during onboarding or when the user prompts for more information during conversation.

Entry point to learn more
Example of an entry point to offer the opportunity to learn more about the synthesized voice.

Once a user requests more information about the synthetic voice, the primary goal is to educate them about the origins of the synthetic voice and to be transparent about the technology.

Provide users more information about synthetic voice
More information can be offered on an external help site.

Recommendations

  • Simplify complex concepts and avoid using legalese and technical jargon
  • Don't bury this content in privacy and terms of use statements
  • Keep content concise and use imagery when available

Implicit disclosure

Consistency is the key to achieving disclosure implicitly throughout the user journey. Consistent use of visual and auditory cues across devices and modes of interaction can help build associations between implicit patterns and explicit disclosure.

Consistency of implicit cues

Implicit cues and feedback

Anthropomorphism can manifest in different ways, from the actual visual representation of the agent to the voice, sounds, patterns of light, bouncing shapes, or even the vibration of a device. When defining your persona, leverage implicit cues and feedback patterns rather than aim for a very human-like avatar. This is one way to minimize the need for more explicit disclosure.

Visual cues and feedback
These cues help anthropomorphize the agent without being too human-like. They can also become effective disclosure mechanisms on their own when used consistently over time.

Consider the different modes of interactions of your experience when incorporating the following types of cues:

Category Examples
Visual Cues
  • Avatar
  • Responsive real-time cues (e.g., animations)
  • Non-screen cues (e.g., lights and patterns on a device)
Auditory Cues
  • Sonicon (e.g., a brief distinctive sound, series of musical notes)
Haptic Cues
  • Vibration

Capability disclosure

Disclosure can be achieved implicitly by setting accurate expectations for what the digital assistant is capable of. Provide sample commands so that users can learn how to interact with the digital assistant and offer contextual help to learn more about the synthetic voice during the early stages of the experience.

Example of default responses that you can craft for a conversation.

Conversational Transparency

When conversations take unexpected paths, consider crafting default responses that can help reset expectations, reinforce transparency, and steer users towards successful paths. There are opportunities to use explicit disclosure in conversation as well.

Handling unexpected paths
Off-task or "personal" questions directed to the agent are a good time to remind users of the synthetic nature of the agent and steer them to engage with it appropriately or to redirect them to a real person.

Handling off-task questions
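
A default response for off-task questions might look like the following sketch. It uses simple phrase matching purely for illustration; the phrase list, the reminder wording, and the function name are assumptions, and a production assistant would more likely rely on its intent recognition system.

```python
from typing import Optional

# Hypothetical examples of off-task or "personal" questions.
OFF_TASK_PHRASES = (
    "are you real",
    "are you human",
    "are you a person",
    "how old are you",
)

# Hypothetical wording that restates the synthetic nature and redirects.
REMINDER = (
    "I'm a digital assistant with a synthetic voice, so I can't answer that. "
    "I can help with reminders, or connect you with a person."
)


def default_response(utterance: str) -> Optional[str]:
    """Return a transparency reminder for off-task input, else None."""
    text = utterance.lower()
    if any(phrase in text for phrase in OFF_TASK_PHRASES):
        return REMINDER
    return None  # On-task input continues to normal handling.


print(default_response("Wait, are you human?"))
```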

When to disclose

There are many opportunities for disclosure throughout the user journey. Design for the first use, second use, nth use…, but also embrace moments of "failure" to highlight transparency—like when the system makes a mistake or when the user discovers a limitation of the agent's capabilities.

Disclosure opportunities throughout a user journey
Example of a standard digital assistant user journey highlighting various disclosure opportunities.

Up-front

The optimal moment for disclosure is the first time a person interacts with the synthetic voice. In a personal voice assistant scenario, this would be during onboarding, or the first time the user virtually unboxes the experience. In other scenarios, it could be the first time a synthetic voice reads content on a website or the first time a user interacts with a virtual character.

Upon request

Users should be able to easily access additional information, control preferences, and receive transparent communication at any point during the user journey when requested.

Continuously

Continuously use the implicit design patterns that enhance the user experience.

When the system fails

Use disclosure as an opportunity to fail gracefully.
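
The four moments above can be summarized as a mapping from a point in the user journey to a disclosure action. The sketch below is illustrative only: the `Moment` enum, the action strings, and the function name are hypothetical and paraphrase this section.

```python
from enum import Enum


class Moment(Enum):
    """Points in the user journey where disclosure can happen."""
    UP_FRONT = "up_front"
    UPON_REQUEST = "upon_request"
    CONTINUOUS = "continuous"
    SYSTEM_FAILURE = "system_failure"


# Hypothetical action descriptions, one per moment.
DISCLOSURE_ACTIONS = {
    Moment.UP_FRONT: "play the transparent introduction",
    Moment.UPON_REQUEST: "show more information and preference controls",
    Moment.CONTINUOUS: "keep implicit cues consistent across interactions",
    Moment.SYSTEM_FAILURE: "acknowledge the error and restate that the agent is synthetic",
}


def disclosure_action(moment: Moment) -> str:
    """Look up the disclosure action for a given moment."""
    return DISCLOSURE_ACTIONS[moment]


for m in Moment:
    print(m.name, "->", disclosure_action(m))
```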
