Inconsistencies in IPA Pronunciation in Text to Speech

Chris Enzweiler
2024-11-07T16:00:21.8+00:00

Hi,

I'm using SSML to force specific pronunciations, but I'm seeing inconsistent results.

For example, here's the word 'would':

<speak version='1.0' xmlns='http://www.w3.org/2001/10/synthesis' xml:lang='en-US'>
    <voice name='en-US-AvaNeural'>
        <phoneme alphabet="ipa" ph="wʊd">would</phoneme>
    </voice>
</speak>

It pronounces the word exactly as expected.

Now, if I want to break the word down into individual sounds and pronounce only the 'ʊ' sound, I use this:

<speak version='1.0' xmlns='http://www.w3.org/2001/10/synthesis' xml:lang='en-US'>
    <voice name='en-US-AvaNeural'>
        <phoneme alphabet="ipa" ph="ʊ">oul</phoneme>
    </voice>
</speak>

However, now it sounds like it's saying the letter 'O'. I expect that 'ʊ' would be pronounced the same in both cases.
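
For anyone who wants to reproduce the comparison, here is a minimal Python sketch that generates both SSML documents from one template, so the only differences between the two requests are the `ph` value and the wrapped text. The helper name `build_phoneme_ssml` and its parameters are hypothetical (they are not part of any Azure SDK); it only builds strings using the standard library and does not call the speech service.

```python
from xml.sax.saxutils import escape, quoteattr

# SSML namespace, copied from the snippets above.
SSML_NS = "http://www.w3.org/2001/10/synthesis"


def build_phoneme_ssml(ph, text, voice="en-US-AvaNeural", lang="en-US"):
    """Return an SSML document wrapping `text` in an IPA <phoneme> element.

    Hypothetical helper for illustration only: it formats a string and
    does not send anything to the speech service.
    """
    return (
        f'<speak version="1.0" xmlns="{SSML_NS}" xml:lang={quoteattr(lang)}>\n'
        f'    <voice name={quoteattr(voice)}>\n'
        # quoteattr adds the surrounding quotes and escapes the value;
        # escape handles &, < and > in the element text.
        f'        <phoneme alphabet="ipa" ph={quoteattr(ph)}>{escape(text)}</phoneme>\n'
        f'    </voice>\n'
        f'</speak>'
    )


# The two cases from this question.
whole_word = build_phoneme_ssml("wʊd", "would")
vowel_only = build_phoneme_ssml("ʊ", "oul")

print(whole_word)
print(vowel_only)
```

Each resulting string can then be passed to whatever client is used to submit SSML for synthesis.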

Can anyone offer any insight into why this may be happening? Thank you.

Azure AI Speech