
Disclosure design patterns for synthetic voices

Article
12/20/2023

Now that you've determined the right level of disclosure for your synthetic voice experience, it's a good time to explore potential design patterns.

Overview

There's a spectrum of disclosure design patterns you can apply to your synthetic voice experience. If the outcome of your disclosure assessment was 'High Disclosure', we recommend explicit disclosure, which means communicating the origins of the synthetic voice outright. Implicit disclosure includes cues and interaction patterns that benefit voice experiences whether required disclosure levels are high or low.

Spectrum of disclosure patterns

Category	Examples
Explicit disclosure patterns	Transparent Introduction; Verbal Transparent Introduction; Explicit Byline; Customization and Calibration; Parental Disclosure; Providing opportunities to learn more about how the voice was made
Implicit disclosure patterns	Capability Disclosure; Implicit Cues and Feedback; Conversational Transparency

Use the following chart to refer directly to the patterns that apply to your synthetic voice. Some of the other conditions in this chart may also apply to your scenario:

If your synthetic voice experience…	Recommendations	Design patterns
Requires High Disclosure	Use at least one explicit pattern and implicit cues up front to help users build associations.	Explicit Disclosure; Implicit Disclosure
Requires Low Disclosure	Disclosure may be minimal or unnecessary, but the experience could benefit from some implicit patterns.	Capability Disclosure; Conversational Transparency
Has a high level of engagement	Build for the long term and offer multiple entry points to disclosure along the user journey. An onboarding experience is highly recommended.	Transparent Introduction; Customization and Calibration; Capability Disclosure
Includes children as the primary intended audience	Target parents as the primary disclosure audience and ensure that they can effectively communicate disclosure to children.	Parental Disclosure; Verbal Transparent Introduction; Implicit Disclosure; Conversational Transparency
Includes blind users or people with low vision as the primary intended audience	Be inclusive of all users and ensure that any form of visual disclosure has associated alternative text or sound effects. Adhere to accessibility standards for contrast ratio and display size. Use auditory cues to communicate disclosure.	Verbal Transparent Introduction; Auditory Cues; Haptic Cues; Conversational Transparency; Accessibility Standards
Is screen-less, device-less, or uses voice as the primary or only mode of interaction	Use auditory cues to communicate disclosure.	Verbal Transparent Introduction; Auditory Cues
Potentially includes multiple users or listeners (e.g., a personal assistant in a multi-person household)	Be mindful of various user contexts and levels of understanding, and offer multiple opportunities for disclosure in the user journey.	Transparent Introduction (Return User); Providing opportunities to learn more about how the voice was made; Conversational Transparency
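
If you want to apply the chart programmatically, for example in a design checklist tool, it can be treated as a lookup from scenario traits to pattern names. The following is a minimal sketch, not an official API: the trait keys (`high_disclosure`, `voice_only`, and so on) and the function name are assumptions made for illustration, while the pattern names are taken from the chart.

```python
# Hypothetical trait keys; the pattern names mirror the chart above.
PATTERNS_BY_TRAIT = {
    "high_disclosure": ["Explicit Disclosure", "Implicit Disclosure"],
    "low_disclosure": ["Capability Disclosure", "Conversational Transparency"],
    "high_engagement": [
        "Transparent Introduction",
        "Customization and Calibration",
        "Capability Disclosure",
    ],
    "children_primary_audience": [
        "Parental Disclosure",
        "Verbal Transparent Introduction",
        "Implicit Disclosure",
        "Conversational Transparency",
    ],
    "blind_or_low_vision_audience": [
        "Verbal Transparent Introduction",
        "Auditory Cues",
        "Haptic Cues",
        "Conversational Transparency",
        "Accessibility Standards",
    ],
    "voice_only": ["Verbal Transparent Introduction", "Auditory Cues"],
    "multiple_listeners": [
        "Transparent Introduction (Return User)",
        "Providing opportunities to learn more about how the voice was made",
        "Conversational Transparency",
    ],
}


def recommended_patterns(traits):
    """Return the patterns for every trait that applies, without duplicates.

    Order follows the order of `traits`, then the order within each chart row.
    An unknown trait raises KeyError so typos are not silently ignored.
    """
    result = []
    for trait in traits:
        for pattern in PATTERNS_BY_TRAIT[trait]:
            if pattern not in result:
                result.append(pattern)
    return result
```

Because several rows can apply to one scenario, the function merges rows rather than picking a single one. For example, `recommended_patterns(["low_disclosure", "voice_only"])` combines the two implicit patterns with the two voice-only patterns.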

Explicit disclosure

If your synthetic voice experience requires High Disclosure, it's best to use at least one of the following explicit patterns to clearly state the synthetic nature.

Transparent Introduction

Before the voice experience begins, introduce the digital assistant by being fully transparent about the origins of its voice and its capabilities. The optimal moment to use this pattern is when onboarding a new user or when introducing new features to a returning user. Implementing implicit cues during an introduction helps users form a mental model about the synthetic nature of the digital agent.

First-time user experience

Transparent introduction during first run experience
The synthetic voice is introduced while onboarding a new user.

Recommendations

Describe that the voice is artificial (e.g., "digital")
Describe what the agent is capable of doing
Explicitly state the voice's origins
Offer an entry point to learn more about the synthetic voice
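
As an illustration of how the four recommendations above might fit into a single onboarding message, here is a small sketch. The `IntroContent` fields, the function name, and the wording of the generated message are all assumptions for illustration; adapt the copy to your own product and voice.

```python
from dataclasses import dataclass


@dataclass
class IntroContent:
    # All field names are hypothetical and exist only for this sketch.
    assistant_name: str
    voice_origin: str        # plain-language statement of where the voice came from
    capabilities: list[str]  # what the agent can do
    learn_more_hint: str     # entry point to more information


def build_transparent_intro(content: IntroContent) -> str:
    """Compose an introduction that covers all four recommendations."""
    capabilities = ", ".join(content.capabilities)
    return (
        # 1. Describe that the voice is artificial.
        f"Hi, I'm {content.assistant_name}, a digital assistant. "
        # 2. Explicitly state the voice's origins.
        f"{content.voice_origin} "
        # 3. Describe what the agent is capable of doing.
        f"I can help you with {capabilities}. "
        # 4. Offer an entry point to learn more.
        f"{content.learn_more_hint}"
    )
```

The same string could be shown on screen during onboarding or passed to a text to speech engine for a spoken introduction.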

Returning user experience

If a user skips the onboarding experience, continue to offer entry points to the Transparent Introduction experience until the user triggers the voice for the first time.

Transparent introduction during return user experience
Provide a consistent entry point to the synthetic voice experience. Allow the user to return to the onboarding experience when they trigger the voice for the first time at any point in the user journey.
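
The returning-user behavior described above amounts to tracking two facts per user: whether they completed onboarding and whether they have triggered the voice before. The sketch below shows one possible way to model that. The `UserState` class and both function names are hypothetical, and a real product would persist this state per user.

```python
from dataclasses import dataclass


@dataclass
class UserState:
    # Hypothetical per-user flags for this sketch.
    completed_onboarding: bool = False
    has_triggered_voice: bool = False


def should_offer_intro_entry_point(state: UserState) -> bool:
    """Keep offering an entry point to the Transparent Introduction until the
    user has either completed onboarding or triggered the voice once."""
    return not state.completed_onboarding and not state.has_triggered_voice


def on_voice_triggered(state: UserState) -> bool:
    """Record a voice trigger and report whether the introduction should run
    first. It runs only on the first trigger for a user who skipped onboarding."""
    play_intro = should_offer_intro_entry_point(state)
    state.has_triggered_voice = True
    return play_intro
```

A user who skipped onboarding sees the introduction the first time they trigger the voice and not again, while a user who completed onboarding never sees it a second time.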

Verbal transparent introduction

A spoken prompt stating the origins of the digital assistant's voice is explicit enough on its own to achieve disclosure. This pattern is best for High Disclosure scenarios where voice is the only mode of interaction available.

Verbally spoken transparent introduction
Use a transparent introduction when there are moments in the user experience where you might already introduce or attribute a person's voice.

Verbally spoken transparent introduction in first person
For additional transparency, the voice actor can disclose the origins of the synthetic voice in the first person.

Explicit Byline

Use this pattern if the user will be interacting with an audio player or interactive component to trigger the voice.

Explicit byline in a news media scenario
An explicit byline is the attribution of where the voice came from.

Recommendations

Offer an entry point to learn more about the synthesized voice

Customization and calibration

Provide users with control over how the digital assistant responds to them (i.e., how the voice sounds). When a user interacts with a system on their own terms and with specific goals in mind, then by definition, they have already understood that it's not a real person.

User Control

Offer choices that have a meaningful and noticeable impact on the synthetic voice experience.

User preferences allow users to customize and improve their experience.

Recommendations

Allow users to customize the voice (e.g., select language and voice type)
Provide users a way to teach the system to respond to their unique voice (e.g., voice calibration, custom commands)
Optimize for user-generated or contextual interactions (e.g., reminders)

Persona Customization

Offer ways to customize the digital assistant's voice. If the voice is based on a celebrity or a widely recognizable person, consider using both visual and spoken introductions when users preview the voice.

Voice customization
Offering the ability to select from a set of voices helps convey the artificial nature.

Recommendations

Allow users to preview the sound of each voice
Use an authentic introduction for each voice
Offer entry points to learn more about the synthesized voice

Parental Disclosure

In addition to complying with COPPA regulations, provide disclosure to parents if your primary intended audience is young children and your exposure level is high. For sensitive uses, consider gating the experience until an adult has acknowledged the use of the synthetic voice. Encourage parents to communicate the disclosure to their children.

Disclosure for parents
A transparent introduction optimized for parents ensures that an adult was made aware of the synthetic nature of the voice before a child interacts with it.

Recommendations

Target parents as the primary audience for disclosure
Encourage parents to communicate disclosure to their children
Offer entry points to learn more about the synthesized voice
Gate the experience by asking parents a simple "safeguard" question to show they have read the disclosure
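
The gating recommendation can be sketched as a small state object that unlocks the voice experience only after the safeguard question is answered correctly. This is an illustrative sketch under stated assumptions: the `ParentalGate` class, its fields, and the matching rule (case-insensitive exact match) are hypothetical, and it does not address COPPA compliance itself.

```python
from dataclasses import dataclass


@dataclass
class ParentalGate:
    # Hypothetical structure: a question about the disclosure text and the
    # answers that show the parent has read it.
    question: str
    accepted_answers: frozenset
    passed: bool = False

    def submit(self, answer: str) -> bool:
        """Mark the gate as passed if the answer matches, ignoring case and
        surrounding whitespace. A wrong answer never revokes a passed gate."""
        if answer.strip().lower() in self.accepted_answers:
            self.passed = True
        return self.passed


def can_start_voice_experience(gate: ParentalGate) -> bool:
    """The child-facing voice experience stays locked until the gate passes."""
    return gate.passed
```

The question should be answerable only by someone who has read the disclosure, for example asking whether the voice is a live person or computer generated.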

Providing opportunities to learn more about how the voice was made

Offer context-sensitive entry points to a page, pop-up, or external site that provides more information about synthetic voice technology. For example, you could surface a link to learn more during onboarding or when the user prompts for more information during conversation.

Entry point to learn more
Example of an entry point to offer the opportunity to learn more about the synthesized voice.

Once a user requests more information about the synthetic voice, the primary goal is to educate them about the origins of the synthetic voice and to be transparent about the technology.

Provide users more information about synthetic voice
More information can be offered in an external help site.

Recommendations

Simplify complex concepts and avoid using legalese and technical jargon
Don't bury this content in privacy and terms of use statements
Keep content concise and use imagery when available

Implicit disclosure

Consistency is the key to achieving disclosure implicitly throughout the user journey. Consistent use of visual and auditory cues across devices and modes of interaction can help build associations between implicit patterns and explicit disclosure.

Consistency of implicit cues

Implicit cues and feedback

Anthropomorphism can manifest in different ways, from the actual visual representation of the agent to the voice, sounds, patterns of light, bouncing shapes, or even the vibration of a device. When defining your persona, leverage implicit cues and feedback patterns rather than aim for a very human-like avatar. This is one way to minimize the need for more explicit disclosure.

Visual cues and feedback
These cues help anthropomorphize the agent without being too human-like. They can also become effective disclosure mechanisms on their own when used consistently over time.

Consider the different modes of interactions of your experience when incorporating the following types of cues:

Category	Examples
Visual Cues	Avatar; Responsive real-time cues (e.g., animations); Non-screen cues (e.g., lights and patterns on a device)
Auditory Cues	Sonicon (e.g., a brief distinctive sound, series of musical notes)
Haptic Cues	Vibration

Capability disclosure

Disclosure can be achieved implicitly by setting accurate expectations for what the digital assistant is capable of. Provide sample commands so that users can learn how to interact with the digital assistant and offer contextual help to learn more about the synthetic voice during the early stages of the experience.

Example of default responses to a conversation that you can craft.

Conversational Transparency

When conversations fall into unexpected paths, consider crafting default responses that can help reset expectations, reinforce transparency, and steer users towards successful paths. There are opportunities to use explicit disclosure in conversation as well.

Handling unexpected paths
Off-task or "personal" questions directed to the agent are a good time to remind users of the synthetic nature of the agent and steer them to engage with it appropriately or to redirect them to a real person.

Handling off task questions
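
To make the idea of a default response concrete, the sketch below routes off-task questions to a reply that restates the synthetic nature of the agent and points back to supported tasks or to a person. Everything here is an assumption for illustration: the keyword list, the response wording, and the `handle_task` callback stand in for whatever intent detection and task handling your system already has.

```python
# Hypothetical phrases that signal an off-task or "personal" question.
OFF_TASK_PHRASES = (
    "are you real",
    "are you human",
    "are you a person",
    "how old are you",
)

# Default response: reminds the user the agent is synthetic, resets
# expectations, and offers both a successful path and a human alternative.
DEFAULT_OFF_TASK_RESPONSE = (
    "I'm a digital assistant with a synthetic voice, so I can't answer that. "
    "I can help with {capabilities}, or I can connect you with a person."
)


def respond(utterance, capabilities, handle_task):
    """Return the default disclosure response for off-task questions;
    otherwise delegate to the normal task handler."""
    text = utterance.lower()
    if any(phrase in text for phrase in OFF_TASK_PHRASES):
        return DEFAULT_OFF_TASK_RESPONSE.format(
            capabilities=", ".join(capabilities)
        )
    return handle_task(utterance)
```

Simple substring matching is used only to keep the example short; a production system would more likely rely on its existing intent classifier.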

When to disclose

There are many opportunities for disclosure throughout the user journey. Design for the first use, second use, nth use…, but also embrace moments of "failure" to highlight transparency—like when the system makes a mistake or when the user discovers a limitation of the agent's capabilities.

Disclosure opportunities throughout a user journey
Example of a standard digital assistant user journey highlighting various disclosure opportunities.

Up-front

The optimal moment for disclosure is the first time a person interacts with the synthetic voice. In a personal voice assistant scenario, this would be during onboarding, or the first time the user virtually unboxes the experience. In other scenarios, it could be the first time a synthetic voice reads content on a website or the first time a user interacts with a virtual character.

Transparent Introduction
Capability Disclosure
Customization and Calibration
Implicit Cues

Upon request

Users should be able to easily access additional information, control preferences, and receive transparent communication at any point during the user journey when requested.

Providing opportunities to learn more about how the voice was made
Customization and Calibration
Conversational Transparency

Continuously

Use the implicit design patterns that enhance the user experience continuously.

Capability Disclosure
Implicit Cues

When the system fails

Use disclosure as an opportunity to fail gracefully.

Conversational Transparency
Providing opportunities to learn more about how the voice was made
Handoff to human
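
One possible reading of "fail gracefully" is an escalation policy: on the first few failures the agent restates that it is a digital assistant and asks the user to rephrase, and after repeated failures it hands off to a human. The sketch below assumes a hypothetical `max_retries` threshold and illustrative response wording; neither comes from this guidance.

```python
def failure_response(consecutive_failures: int, max_retries: int = 2):
    """Return an (action, message) pair for a failed interaction.

    `max_retries` is an assumed threshold: up to that many consecutive
    failures the agent discloses and asks for a rephrase; beyond it, the
    agent hands off to a human.
    """
    if consecutive_failures < 1:
        raise ValueError("consecutive_failures must be at least 1")
    if consecutive_failures <= max_retries:
        return (
            "clarify",
            "Sorry, I didn't understand that. I'm a digital assistant, so "
            "there are things I can't do yet. Could you try rephrasing?",
        )
    return (
        "handoff",
        "I'm having trouble helping with this. Let me connect you with a person.",
    )
```

The caller is expected to reset the failure count after any successful turn, so the handoff triggers only on consecutive failures.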
