Is there a way to permanently add IPA pronunciation to each SubscriptionID without referencing Lexicon files?

Question

In the Text to speech service, I found many words that are pronounced incorrectly, and I solved this by adding phonemes and graphemes to the lexicon file. However, the problem is that each of my service users has different vocabularies and needs to reference their own Lexicon files. This causes delays in adding words and usage since files from my server are being called too frequently, resulting in inconsistency and occasional failures. It would be great if I could compile phonemes from all my users into my SubscriptionID and be able to add more later without having to call the lexicon URI in the API. Is this service currently available?

Accepted Answer

Hello i'm MariOhn,

Welcome to the Microsoft Q&A and thank you for posting your questions here.

The answer provided by @SriLakshmi C already addresses your need well; the suggestions below build on it. To manage custom pronunciations in the Azure Text-to-Speech service without referencing lexicon files so frequently, consider the following strategies:

Caching can significantly reduce delays by storing frequently accessed lexicon files locally or in a distributed cache, minimizing the need for repeated API calls - https://blog.heycoach.in/caching-for-performance-optimization
Batch Processing allows you to group multiple words into a single request, reducing the number of API calls and improving performance - https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/batch
Creating a Custom Endpoint for text pre-processing can streamline the application of phoneme adjustments before sending text to the Text-to-Speech service, enhancing efficiency - https://docs.bentoml.com/en/latest/build-with-bentoml/services.html
Establishing a Feedback Loop where users can report mispronunciations directly can help keep lexicon files updated with user-specific needs - https://www.getpronounce.com/blog/why-is-my-pronunciation-getting-worse
Finally, Monitoring and Logging the performance and usage of lexicon files can help identify bottlenecks and optimize the system - https://www.ibm.com/docs/en/dsm?topic=configuration-honeycomb-lexicon-file-integrity-monitor-fim

This is an example of how you might implement a caching strategy in Python using Redis:

import redis

# Connect to Redis; decode_responses=True makes get() return str instead of bytes
cache = redis.Redis(host='localhost', port=6379, db=0, decode_responses=True)

def get_lexicon(word):
    # Check whether the word is already in the cache
    cached_phoneme = cache.get(word)
    if cached_phoneme is not None:
        return cached_phoneme

    # If not in the cache, fetch from the lexicon file
    phoneme = fetch_from_lexicon_file(word)

    # Cache only real entries, so a word added to the lexicon later is not
    # hidden behind a stale "unknown" value
    if phoneme is not None:
        cache.set(word, phoneme)

    return phoneme

def fetch_from_lexicon_file(word):
    # Simulate fetching a phoneme from the lexicon file
    lexicon = {'hello': 'həˈloʊ', 'world': 'wɜːrld'}
    # Returns None when the word has no entry
    return lexicon.get(word)
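
The text pre-processing suggestion from the list above can be sketched in a similar way. This is a minimal sketch, not a definitive implementation: it assumes each user's pronunciations are available as a plain word-to-IPA dictionary (for example, loaded from your own database), and it wraps matching words in SSML phoneme elements so that the request does not need a lexicon URI. The function names apply_phonemes and build_ssml are hypothetical, and the voice name is only a placeholder default.

```python
import re
from xml.sax.saxutils import escape, quoteattr

def apply_phonemes(text, pronunciations):
    """Wrap known words in SSML <phoneme> tags and XML-escape everything else."""
    if not pronunciations:
        return escape(text)
    # Try longer entries first so they win over shorter entries that share a prefix
    words = sorted(pronunciations, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(w) for w in words) + r")\b",
        re.IGNORECASE,
    )
    lookup = {w.lower(): ipa for w, ipa in pronunciations.items()}

    parts = []
    last = 0
    for match in pattern.finditer(text):
        # Escape the untouched text between matches
        parts.append(escape(text[last:match.start()]))
        ipa = lookup[match.group(0).lower()]
        parts.append(
            f"<phoneme alphabet=\"ipa\" ph={quoteattr(ipa)}>"
            f"{escape(match.group(0))}</phoneme>"
        )
        last = match.end()
    parts.append(escape(text[last:]))
    return "".join(parts)

def build_ssml(text, pronunciations, voice="en-US-JennyNeural", lang="en-US"):
    """Build a complete SSML document; voice and lang are placeholder defaults."""
    body = apply_phonemes(text, pronunciations)
    return (
        f'<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" '
        f'xml:lang="{lang}"><voice name="{voice}">{body}</voice></speak>'
    )
```

The resulting string would be sent as the SSML body of the synthesis request. Matching here is case-insensitive and based on word boundaries, which may need adjusting for languages that do not separate words with spaces.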

I hope this is helpful! Do not hesitate to let me know if you have any other questions.

Please don't forget to close out the thread by upvoting and accepting this as the answer if it is helpful.

Answer

Hello i'm MariOhn,

Welcome to Microsoft Q&A! Thanks for posting the question.

I understand the challenge you're facing with managing custom pronunciations and the delays caused by frequent calls to external lexicon files. Currently, the Azure Text-to-Speech service requires referencing lexicon files for custom pronunciations, and unfortunately, there isn't a way to permanently store phonemes directly within your Subscription ID.

However, we appreciate you bringing this to our notice and it's something that Microsoft might consider for future improvements. In the meantime, optimizing how lexicon files are accessed or implementing caching strategies could help reduce the delays you're experiencing.
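
One way to optimize how lexicon files are accessed is to regenerate a user's lexicon only when their vocabulary changes and serve it as a static file, rather than building it on every request. The following is a minimal sketch under that assumption: it builds a lexicon document in the W3C Pronunciation Lexicon Specification (PLS) format from a word-to-IPA dictionary. The function name build_lexicon is hypothetical, and where the file is hosted is left to your setup.

```python
import xml.etree.ElementTree as ET

PLS_NS = "http://www.w3.org/2005/01/pronunciation-lexicon"
XML_NS = "http://www.w3.org/XML/1998/namespace"

def build_lexicon(entries, lang="en-US"):
    """Return a PLS lexicon document (UTF-8 bytes) for a grapheme -> IPA mapping."""
    # Serialize PLS elements in the default namespace instead of an ns0: prefix
    ET.register_namespace("", PLS_NS)
    root = ET.Element(
        f"{{{PLS_NS}}}lexicon",
        {"version": "1.0", "alphabet": "ipa", f"{{{XML_NS}}}lang": lang},
    )
    # Sort entries so the output is stable and only changes when the data does
    for grapheme, ipa in sorted(entries.items()):
        lexeme = ET.SubElement(root, f"{{{PLS_NS}}}lexeme")
        ET.SubElement(lexeme, f"{{{PLS_NS}}}grapheme").text = grapheme
        ET.SubElement(lexeme, f"{{{PLS_NS}}}phoneme").text = ipa
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)
```

The returned bytes can be written to a file or uploaded to static storage. Because the output is deterministic, comparing it with the previously published version is a simple way to skip unnecessary uploads.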

We encourage you to submit a feature request for this functionality through Azure's community forum for posting ideas. This helps Microsoft understand the needs of users like you and could contribute to improvements in future updates.

Hope this helps. Do let us know if you have any further queries.

If this answers your query, please click "Accept Answer" and "Yes" for "Was this answer helpful".

Thank you!
