Handling Special Characters in Azure TTS Input

Ananth Hegde (anahegde) 20 Reputation points
2024-12-16T15:13:10.15+00:00

Hello,

I’ve noticed that when sending text with special characters like \n (newline) to the Azure Text-to-Speech (TTS) engine, the output is synthesized literally as "backslash n." For now, we’re removing \n before sending the text to the TTS engine. However, we’re unsure of the complete list of special characters that might also require sanitization.

Could you provide guidance on how to handle such cases? Is there a way to configure the TTS engine to ignore or handle these characters automatically, or should we continue sanitizing the input text?

Example input: "Sure, I can help you with that. Can you please provide me with the following details: \n1. Departure city \n2. Destination city \n3. Departure date \n4. Return date (if applicable)"

Thank you!

Azure AI Speech
An Azure service that integrates speech processing into apps and services.

Accepted answer
  1. Pavankumar Purilla 1,965 Reputation points Microsoft Vendor
    2024-12-16T22:34:21.54+00:00

    Hi Ananth Hegde (anahegde),
    Greetings & Welcome to Microsoft Q&A forum! Thanks for posting your query!
    I understand that you are facing an issue with handling special characters in the Azure Text-to-Speech (TTS) engine.

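    In plain text mode, the TTS engine has no built-in option to strip or interpret escape sequences such as \n. If the output is spoken as "backslash n", the input most likely contains the two literal characters \ and n (for example, from a string that was escaped twice or never decoded) rather than an actual newline character. Sanitizing the input text before synthesis is therefore recommended: remove or replace these sequences so they are not read aloud.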

    In addition to \n, other characters that might require sanitization include \t (tab), \r (carriage return), and \b (backspace). For a complete list of escape sequences, refer to the Python documentation: String and Bytes Literals.
    To sanitize your input, you can use Python's replace() method. For instance, to replace \n with spaces:

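    # Text as it often arrives from an API: "\\n" is a literal backslash followed by "n".
    input_text = (
        "Sure, I can help you with that. Can you please provide me with the "
        "following details: \\n1. Departure city \\n2. Destination city "
        "\\n3. Departure date \\n4. Return date (if applicable)"
    )
    # Replace the literal two-character sequence first, then any real newline characters.
    sanitized_text = input_text.replace("\\n", " ").replace("\n", " ")
    # Collapse the runs of spaces left behind by the replacements.
    sanitized_text = " ".join(sanitized_text.split())
    print(sanitized_text)

    This replaces both the literal \n sequence and real newline characters with spaces, so the engine no longer reads "backslash n" aloud and the text is spoken as one continuous passage.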
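
    The same idea can be extended to the other characters mentioned above (\t, \r, \b). The following is a minimal sketch, not an official Azure utility: the function name sanitize_for_tts and the choice to replace every ASCII control character with a space are assumptions made for illustration.

```python
import re

# Literal escape sequences (a backslash followed by one of n, t, r, b).
LITERAL_ESCAPES = re.compile(r"\\[ntrb]")
# Real ASCII control characters (covers newline, tab, carriage return, backspace).
CONTROL_CHARS = re.compile(r"[\x00-\x1f\x7f]")


def sanitize_for_tts(text: str) -> str:
    """Replace literal escapes and control characters with spaces, then tidy whitespace."""
    text = LITERAL_ESCAPES.sub(" ", text)
    text = CONTROL_CHARS.sub(" ", text)
    # Collapse consecutive whitespace into single spaces and trim the ends.
    return " ".join(text.split())


if __name__ == "__main__":
    raw = "Details:\\n1. Departure city\t\\n2. Destination city\r\n3. Date\b"
    print(sanitize_for_tts(raw))
```

    Note that the literal-escape pattern also matches a backslash followed by n, t, r, or b inside ordinary text such as a Windows path, so apply it only to input where a backslash is not expected to be meaningful.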

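    Alternatively, you can use Speech Synthesis Markup Language (SSML) for more control over pauses. Azure expects the speak root element to carry the version, xmlns, and xml:lang attributes and to contain a voice element. For example:

    <speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="en-US">
        <!-- Example voice; replace it with the voice you use. -->
        <voice name="en-US-JennyNeural">
            Sure, I can help you with that. Can you please provide me with the following details:
            <break time="500ms"/>
            1. Departure city
            <break time="500ms"/>
            2. Destination city
            <break time="500ms"/>
            3. Departure date
            <break time="500ms"/>
            4. Return date (if applicable)
        </voice>
    </speak>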
    

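    With SSML, each line break becomes an explicit pause instead of a character the engine might read aloud. Because SSML is XML, reserved characters in the text such as &, <, and > must be escaped (&amp;, &lt;, &gt;) before the document is sent. For more on SSML, refer to Azure TTS SSML Documentation.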
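
    If the text is generated at runtime, the SSML can be built programmatically. The sketch below is one possible approach under stated assumptions: text_to_ssml is a hypothetical helper, the voice en-US-JennyNeural and the 500 ms pause are example values, and literal \n sequences are treated as line breaks.

```python
from xml.sax.saxutils import escape

# Assumed values for illustration; substitute the voice and language you use.
VOICE_NAME = "en-US-JennyNeural"
LANGUAGE = "en-US"


def text_to_ssml(text: str, pause_ms: int = 500) -> str:
    """Wrap text in SSML, turning each line break into a <break> element."""
    # Normalize literal "\n" sequences and CRLF/CR line endings to real newlines.
    normalized = text.replace("\\n", "\n").replace("\r\n", "\n").replace("\r", "\n")
    # Escape &, < and > so the text cannot break the XML, and drop empty lines.
    lines = [escape(line.strip()) for line in normalized.split("\n") if line.strip()]
    body = f'<break time="{pause_ms}ms"/>'.join(lines)
    return (
        '<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" '
        f'xml:lang="{LANGUAGE}">'
        f'<voice name="{VOICE_NAME}">{body}</voice>'
        "</speak>"
    )


if __name__ == "__main__":
    print(text_to_ssml("Please provide: \\n1. Departure city \\n2. Food & drink choice"))
```

    The resulting string can then be passed to the Speech SDK's SSML synthesis call (for example, speak_ssml_async on a SpeechSynthesizer in the Python SDK) instead of the plain-text call.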

    Hope this helps. Do let us know if you have any further queries.

