Note
Please see Azure Cognitive Services for Speech documentation for the latest supported speech solutions.
p and s Elements (Microsoft.Speech)
The p and s elements mark the paragraph and sentence structure of the document.
Syntax
<p xml:lang="string"> </p>
<s xml:lang="string"> </s>
Attributes
Attribute | Description |
---|---|
xml:lang | Optional. Specifies the language to be used for the enclosed text. The value may contain only a lower-case, two-letter language code (such as "en" for English or "it" for Italian), or may also include an upper-case country/region code or other variation after the language code. Examples with a country/region code include "es-US" for Spanish as spoken in the US and "fr-CA" for French as spoken in Canada. See the Remarks section for additional information. |
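The value format described for xml:lang can be checked before a document is sent to a synthesis engine. The following Python sketch is illustrative only and is not part of the Speech Platform: the function name looks_like_lang_tag is hypothetical, and the pattern is an assumed simplification that accepts a two-letter language code with an optional two-letter country/region code. It does not cover the "other variation" forms mentioned above.

```python
import re

# Assumed simplification: a two-letter lower-case language code, optionally
# followed by a hyphen and a two-letter upper-case country/region code.
LANG_TAG = re.compile(r"[a-z]{2}(-[A-Z]{2})?")


def looks_like_lang_tag(value: str) -> bool:
    """Return True if value has the shape 'en' or 'fr-CA'."""
    return LANG_TAG.fullmatch(value) is not None


# Print each sample value with the result of the shape check.
for tag in ("en", "it", "es-US", "fr-CA", "EN", "fr_ca"):
    print(tag, looks_like_lang_tag(tag))
```

The first four values match the pattern; "EN" and "fr_ca" do not, because of the upper-case language code and the underscore separator.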
Remarks
Use of these elements is optional. The speech engine automatically determines the structure of the document in the absence of these elements. A speech synthesis engine may produce changes in prosody when it encounters a p or s element.
You can use the xml:lang attribute to change the speaking voice to a particular language for the text enclosed by the p or s element. The p and s elements may declare different languages in their xml:lang attributes than the language declared in the speak element. The Speech Platform SDK 11 supports multiple languages in SSML documents.
The Microsoft Speech Platform SDK 11 accepts all valid language-country/region codes as values for the xml:lang attribute. To pronounce words correctly in the language specified by the xml:lang attribute, a Runtime Language for speech synthesis that supports that language code must be installed.
If the xml:lang attribute specifies only a language code (such as "en" for English or "es" for Spanish) and not a country/region code, then any installed speech synthesis voice that expresses support for that generic, region-independent language may produce acceptable pronunciations for words in the specified language. See Language Identifier Constants and Strings for a comprehensive list of language codes.
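The difference between a language-only value and a value with a country/region code can be sketched as a simple lookup. This Python sketch is a toy model, not the voice selection logic of the Speech Platform: the function candidate_voices and the list of installed cultures are hypothetical, and the exact-match rule for region-specific values is an assumption made for illustration.

```python
def candidate_voices(requested: str, installed: list[str]) -> list[str]:
    """Return installed voice cultures that could serve the requested xml:lang."""
    language = requested.split("-")[0]
    if "-" in requested:
        # Assumption: a value with a country/region code is matched exactly.
        return [culture for culture in installed if culture == requested]
    # Language only: any voice whose culture shares the language code qualifies.
    return [culture for culture in installed if culture.split("-")[0] == language]


installed = ["en-US", "fr-FR", "fr-CA", "es-MX"]  # hypothetical installed set

print(candidate_voices("fr", installed))     # ['fr-FR', 'fr-CA']
print(candidate_voices("fr-CA", installed))  # ['fr-CA']
print(candidate_voices("it", installed))     # []
```

An empty result corresponds to the case where no suitable Runtime Language is installed for the requested language.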
Note
A voice is an installed Runtime Language for speech synthesis (TTS, or text-to-speech). The Microsoft Speech Platform Runtime 11 and Microsoft Speech Platform SDK 11 do not include any Runtime Languages for speech synthesis. You must download and install a Runtime Language for each language in which you want to generate synthesized speech. A Runtime Language includes the language model, acoustic model, and other data necessary to provision a speech engine to perform speech synthesis in a particular language. See InstalledVoice for more information.
Example
The French numbers in the following example will be pronounced correctly only if a voice that supports French has been installed.
<?xml version="1.0" encoding="ISO-8859-1"?>
<speak version="1.0"
    xmlns="http://www.w3.org/2001/10/synthesis"
    xml:lang="en-US">
  <p>
    <s> Introducing the sentence element. </s>
    <s> It may be used to mark individual sentences. </s>
  </p>
  <p>
    Another simple paragraph.
    Sentence structure in this paragraph is not explicitly marked.
  </p>
  <p>
    Now we'll change the speaking voice and count in French.
    <s xml:lang="fr-FR"> un, deux, trois, quatre </s>
    The English-speaking voice returns.
  </p>
</speak>
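To see which languages a document like this one requests, the xml:lang values can be resolved for each element, with child elements inheriting the value of their parent unless they declare their own. The following Python sketch uses only the standard library and a shortened copy of the example without the XML declaration. The function name effective_languages is hypothetical, and the sketch does not involve the Speech Platform itself.

```python
import xml.etree.ElementTree as ET

SSML_NS = "http://www.w3.org/2001/10/synthesis"
# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

SSML = """<speak version="1.0"
    xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="en-US">
  <p>
    Now we'll change the speaking voice and count in French.
    <s xml:lang="fr-FR"> un, deux, trois, quatre </s>
    The English-speaking voice returns.
  </p>
</speak>"""


def effective_languages(element, inherited=None):
    """Yield (tag, language) for element and its descendants, in document order."""
    # An element without its own xml:lang inherits the value of its parent.
    lang = element.get(XML_LANG, inherited)
    tag = element.tag.replace("{" + SSML_NS + "}", "")
    yield tag, lang
    for child in element:
        yield from effective_languages(child, lang)


pairs = list(effective_languages(ET.fromstring(SSML)))
for tag, lang in pairs:
    print(tag, lang)

# Distinct languages the document asks for.
required = sorted({lang for _, lang in pairs})
print(required)  # ['en-US', 'fr-FR']
```

For this document the p element inherits en-US from speak, while the s element overrides it with fr-FR, so voices for both languages would need to be installed.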