

Note

Please see Azure Cognitive Services for Speech documentation for the latest supported speech solutions.

phoneme Element PLS (Microsoft.Speech)

Contains the phonetic spelling that describes how a lexeme is pronounced.

Syntax

<phoneme
   prefer = "true" | "false"
   alphabet = "ipa" | "x-microsoft-sapi" | "x-microsoft-ups">
</phoneme>

Attributes

Attribute

Description

prefer

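Optional. Indicates whether this pronunciation is the preferred one when a speech synthesis engine must choose among multiple pronunciations. The possible values are true or false. The default value is false.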

alphabet

Optional. Specifies the pronunciation alphabet used to construct the contained phonetic spelling. The only acceptable values are ipa, x-microsoft-sapi, and x-microsoft-ups. These values are case-sensitive and must be entered in lower case.

The pronunciation alphabet specified applies only to the containing phoneme, overriding the value of the alphabet attribute in the lexicon element.
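The override rule can be sketched in a few lines of Python. This is only an illustration of the rule as described here, not part of the Microsoft.Speech API: the helper name effective_alphabet and the sample lexicon are hypothetical, and raising an error when neither element supplies a valid alphabet is an assumption.

```python
import xml.etree.ElementTree as ET

PLS_NS = "http://www.w3.org/2005/01/pronunciation-lexicon"
# Values are case-sensitive and must be lower case.
ALLOWED_ALPHABETS = {"ipa", "x-microsoft-sapi", "x-microsoft-ups"}


def effective_alphabet(lexicon, phoneme):
    """Hypothetical helper: the alphabet on the phoneme element, if present,
    overrides the alphabet on the lexicon element."""
    alphabet = phoneme.get("alphabet") or lexicon.get("alphabet")
    if alphabet not in ALLOWED_ALPHABETS:
        # Assumption: treat a missing or unrecognized value as an error.
        raise ValueError(f"unsupported alphabet: {alphabet!r}")
    return alphabet


SAMPLE = """<lexicon version="1.0"
      xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
      alphabet="ipa" xml:lang="en-US">
  <lexeme>
    <grapheme>lead</grapheme>
    <phoneme alphabet="x-microsoft-sapi">1 l iy d</phoneme>
    <phoneme>li\u02d0d</phoneme>
  </lexeme>
</lexicon>"""

lexicon = ET.fromstring(SAMPLE)
phonemes = lexicon.findall(f"{{{PLS_NS}}}lexeme/{{{PLS_NS}}}phoneme")
for p in phonemes:
    print(effective_alphabet(lexicon, p))
```

In this sketch the first phoneme resolves to x-microsoft-sapi because it sets its own alphabet, while the second falls back to the ipa value declared on the lexicon element.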

Remarks

A lexeme element may contain multiple phoneme elements. A speech recognition engine will recognize all pronunciations specified in phoneme elements. A speech synthesis engine can speak only one pronunciation and must make a selection if multiple pronunciations are present, using one of the following strategies:

  • If only one pronunciation has prefer set to true, the speech synthesis engine selects it.

  • If more than one pronunciation has prefer set to true, the speech synthesis engine selects the one listed first in the document.

  • If none of multiple pronunciations has prefer set to true, the speech synthesis engine selects the one listed first in the document.

  • Alternatively, a speech synthesis engine may have an internal mechanism that selects pronunciations according to another strategy.
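The first three strategies above reduce to one rule: take the first phoneme with prefer set to true, and otherwise take the first phoneme in document order. The following Python sketch illustrates that rule only; the function name select_pronunciation is hypothetical, it is not how any particular engine is implemented, and it does not model an engine's internal alternative strategy.

```python
import xml.etree.ElementTree as ET

PLS_NS = "http://www.w3.org/2005/01/pronunciation-lexicon"


def select_pronunciation(lexeme):
    """Hypothetical helper: return the phonetic spelling chosen by the
    default document-order strategy, or None if there is no phoneme."""
    phonemes = lexeme.findall(f"{{{PLS_NS}}}phoneme")
    if not phonemes:
        return None
    # Phonemes explicitly marked as preferred, in document order.
    preferred = [p for p in phonemes if p.get("prefer") == "true"]
    # One or more preferred: use the first of them.
    # None preferred: use the first phoneme listed.
    chosen = preferred[0] if preferred else phonemes[0]
    return (chosen.text or "").strip()


SAMPLE = """<lexicon version="1.0"
      xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
      alphabet="x-microsoft-sapi" xml:lang="en-US">
  <lexeme>
    <grapheme>lead</grapheme>
    <phoneme>1 l iy d</phoneme>
    <phoneme prefer="true">1 l eh d</phoneme>
  </lexeme>
</lexicon>"""

root = ET.fromstring(SAMPLE)
lexeme = root.find(f"{{{PLS_NS}}}lexeme")
print(select_pronunciation(lexeme))  # prints: 1 l eh d
```

A speech recognition engine would not need this selection step, because it accepts every pronunciation listed in the lexeme.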

Examples

In the following example, "read" has two pronunciations. A speech recognition engine will recognize both pronunciations, whereas a speech synthesis engine will use only one. Because neither pronunciation has prefer set to true, a speech synthesis engine will use the first pronunciation in document order, in this case “S1 R EH D”, unless the engine applies a different internal strategy.

Note that the phones in the phoneme element are case-sensitive and must be space-delimited. See Phonetic Alphabet Reference (Microsoft.Speech) for more information. Each phoneme element can contain only one pronunciation.

<?xml version="1.0" encoding="UTF-8"?>
<lexicon version="1.0" 
      xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
      xsi:schemaLocation="http://www.w3.org/2005/01/pronunciation-lexicon 
        http://www.w3.org/TR/2007/CR-pronunciation-lexicon-20071212/pls.xsd"
      alphabet="x-microsoft-ups" xml:lang="en-US">

  <lexeme>
    <grapheme> read </grapheme>
    <phoneme> S1 R EH D </phoneme>
    <phoneme> S1 R I D </phoneme>
  </lexeme>

</lexicon>

The following example illustrates the use of the alphabet attribute of the phoneme element to specify two pronunciations for a single instance of the word “lead”, using a different phonetic alphabet than the one specified in the lexicon element.

<?xml version="1.0" encoding="UTF-8"?>
<lexicon version="1.0" 
      xmlns="http://www.w3.org/2005/01/pronunciation-lexicon"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
      xsi:schemaLocation="http://www.w3.org/2005/01/pronunciation-lexicon 
        http://www.w3.org/TR/2007/CR-pronunciation-lexicon-20071212/pls.xsd"
      alphabet="ipa" xml:lang="en-US">
  <lexeme>
    <grapheme>lead</grapheme>
    <phoneme alphabet="x-microsoft-sapi"> 1 l iy d</phoneme>
    <phoneme alphabet="x-microsoft-sapi" prefer="true"> 1 l eh d</phoneme>
  </lexeme>
</lexicon>

In the example above, a speech recognition engine will recognize both pronunciations. A speech synthesis engine can use only one pronunciation, in this case the second pronunciation listed because it has prefer set to true.

See Also

Concepts

Phonetic Alphabet Reference (Microsoft.Speech)

About Lexicons and Phonetic Alphabets (Microsoft.Speech)