Issue with Media Playback in Azure Communication Services Using Python

Admin Saad 0 Reputation points
2024-11-23T23:57:11.6+00:00

Context: We are building a bot using Azure Communication Services (ACS) and Azure Speech Services to handle phone calls. The bot uses text-to-speech (TTS) to play questions during calls and captures user responses.

What We’ve Done:

  1. Created an ACS instance and acquired an active phone number.
  2. Set up an event subscription to handle the callback for incoming calls.
  3. Integrated Azure Speech Services for TTS using Python.

Achievements:

  • Successfully connected calls using ACS.
  • Generated TTS audio files for trial questions.

Challenges: Converted TTS audio files are not playing during the call. The playback method does not raise errors, but no audio is heard on the call.

Code Snippet (Python):

def play_audio(call_connection_id, audio_file_path):
    try:
        audio_url = f"http://example.com/{audio_file_path}"  # Publicly accessible URL
        call_connection = call_automation_client.get_call_connection(call_connection_id)
        file_source = FileSource(url=audio_url)
        call_connection.play_media(play_source=file_source, play_to=True)
        print(f"Playing audio: {audio_url}")
    except Exception as e:
        print(f"Error playing audio: {e}")

Help Needed:

  1. Are there specific requirements for media playback using the ACS SDK for Python?
  2. How can we debug why the audio is not playing despite being hosted on a public URL?

Additional Context:

  • Using Python 3.12.6 and the Azure Communication Services Python SDK.
  • The audio files are hosted on a local server and accessible via public URLs.

Steps Followed:

  1. Caller Initiates a Call: Someone calls the phone number linked to my ACS resource.
  2. ACS Sends an Incoming Call Event: ACS sends a Microsoft.Communication.IncomingCall event to my /calling-events endpoint.
  3. Application Answers the Call: My Flask app receives the event and answers the call using the incomingCallContext.
  4. Call Connected Event: Once the call is established, ACS sends a Microsoft.Communication.CallConnected event.
  5. Start Interaction: I start the conversation by playing a welcome message to the caller.
  6. Play Audio Messages
    1. The question text from Excel is converted to speech using the text-to-speech API of Azure Speech Services.
    2. The converted speech is stored as .wav files.
    3. These .wav files need to be hosted on a publicly accessible URL so that ACS can fetch them and play them on the call.
  7. Handle User Input: After the question is played, the bot listens for and processes the caller's speech input (if speech recognition is implemented).
  8. End the Call: After the conversation, the bot plays a goodbye message and hangs up.
  9. Clean Up: The bot handles the CallDisconnected event to clean up any resources or state.

 


1 answer

  1. Sina Salam 17,646 Reputation points
    2024-11-25T10:07:55.27+00:00

    Hello Admin Saad,

    Welcome to the Microsoft Q&A and thank you for posting your questions here.

    I understand that you are having an issue with media playback in Azure Communication Services using Python.

    Kindly follow these eight steps to resolve the issues:

    1. Azure Communication Services supports only certain audio formats, so double-check the following:
    • Format: use WAV files with PCM encoding, or MP3.
    • Encoding: make sure the audio file uses a supported codec. For WAV, use mono Linear PCM with 16-bit depth.
    • Sample rate: the recommended sample rate is 16 kHz.
    • Use a tool like FFmpeg - https://ffmpeg.org/ to verify or convert the audio format using bash:

    ffmpeg -i input.wav -ac 1 -ar 16000 -acodec pcm_s16le output.wav
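
If you prefer to check the file from Python before uploading it, here is a minimal sketch using only the standard library. It compares a file against the format described above (16-bit PCM at 16 kHz) and also checks for a single channel, which is an assumption based on my reading of the ACS documentation. The function name wav_problems is made up for this example.

```python
import wave

def wav_problems(path, channels=1, sample_width=2, sample_rate=16000):
    """Return a list of mismatches against mono, 16-bit, 16 kHz PCM (empty if none)."""
    try:
        # wave.open only understands uncompressed PCM WAV and raises wave.Error otherwise
        with wave.open(path, "rb") as wav:
            found = (wav.getnchannels(), wav.getsampwidth(), wav.getframerate())
    except (wave.Error, EOFError) as exc:
        return [f"not a readable PCM WAV file: {exc}"]

    problems = []
    if found[0] != channels:
        problems.append(f"channels: expected {channels}, found {found[0]}")
    if found[1] != sample_width:
        problems.append(f"sample width: expected {sample_width} bytes, found {found[1]}")
    if found[2] != sample_rate:
        problems.append(f"sample rate: expected {sample_rate} Hz, found {found[2]}")
    return problems
```

An empty list means the header matches the target format. Anything else is a hint to re-run the FFmpeg conversion before hosting the file.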

    2. Ensure the hosted file is:
    • Publicly accessible with no authentication required.
    • Hosted on a server that serves audio files with correct MIME types (audio/wav or audio/mpeg).

    Test the URL:

    • Open it in a browser.
    • Check server headers using tools like curl: curl -I http://example.com/audio.wav
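
The same header check can be scripted. The sketch below sends a HEAD request with the standard library and reports a non-200 status or an unexpected Content-Type. The set of accepted types and the function names are my own assumptions for illustration, so adjust them to your server.

```python
import urllib.error
import urllib.request

# Assumption: common Content-Type values for WAV and MP3 files
EXPECTED_TYPES = {"audio/wav", "audio/x-wav", "audio/wave", "audio/mpeg"}

def header_problems(status, content_type):
    """Return a list of problems found in an HTTP status and Content-Type header."""
    problems = []
    if status != 200:
        problems.append(f"expected HTTP 200, got {status}")
    # Drop parameters such as "; charset=..." before comparing
    media_type = (content_type or "").split(";")[0].strip().lower()
    if media_type not in EXPECTED_TYPES:
        problems.append(f"unexpected Content-Type: {media_type or 'missing'}")
    return problems

def check_audio_url(url, timeout=10):
    """Send a HEAD request to url and return any header problems."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return header_problems(response.status, response.headers.get("Content-Type"))
    except urllib.error.HTTPError as exc:
        # 4xx and 5xx responses arrive as exceptions but still carry headers
        return header_problems(exc.code, exc.headers.get("Content-Type"))
    except urllib.error.URLError as exc:
        return [f"could not reach {url}: {exc.reason}"]
```

Run check_audio_url from a machine outside your own network. ACS fetches the file over the public internet, so a check that only passes from inside your network does not prove that ACS can reach it.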
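    3. The ACS Python SDK offers both a synchronous client and an asynchronous one (in azure.communication.callautomation.aio). Only use await with the asynchronous client. play_media has no play_to_all parameter, and play_to=True is not a valid target. In recent SDK versions (1.1.0 and later, to my knowledge), pass play_to="all" to play the audio to every participant, or pass a list of participant identifiers:
    from azure.communication.callautomation import FileSource
    # Assumes call_automation_client was created from azure.communication.callautomation.aio
    async def play_audio(call_connection_id, audio_file_path):
        try:
            audio_url = f"http://example.com/{audio_file_path}"  # Public URL
            call_connection = call_automation_client.get_call_connection(call_connection_id)
            file_source = FileSource(url=audio_url)
            # Asynchronously request playback to every participant on the call
            await call_connection.play_media(play_source=file_source, play_to="all")
            # A successful return only means ACS accepted the request
            print(f"Play request accepted: {audio_url}")
        except Exception as e:
            print(f"Error playing audio: {e}")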
    
    4. Add debugging around playback in case the audio does not play:
    • Collect detailed logs from both the bot and ACS.
    • Make sure the CallConnected and PlayCompleted events are handled correctly.
    • Check whether ACS sends a PlayCompleted event. If it sends PlayFailed instead, inspect the resultInformation in that event for the reason.
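
To see what ACS reports after a play request, you can log the play-related events that arrive at your callback endpoint. The sketch below assumes the callback body is a JSON array of CloudEvents whose type starts with Microsoft.Communication. and whose data carries callConnectionId and, on failure, resultInformation. The function name summarize_play_events is made up for this example.

```python
import logging

logger = logging.getLogger("acs.playback")
PREFIX = "Microsoft.Communication."

def summarize_play_events(events):
    """Return one summary line per play-related callback event, logging each one."""
    summaries = []
    for event in events:
        event_type = event.get("type", "")
        # Strip the common prefix so "Microsoft.Communication.PlayFailed" becomes "PlayFailed"
        name = event_type[len(PREFIX):] if event_type.startswith(PREFIX) else event_type
        if not name.startswith("Play"):
            continue
        data = event.get("data") or {}
        line = f"{name} call={data.get('callConnectionId')}"
        if name == "PlayFailed":
            # resultInformation explains why playback failed
            info = data.get("resultInformation") or {}
            line += (f" code={info.get('code')} subCode={info.get('subCode')}"
                     f" message={info.get('message')}")
            logger.error(line)
        else:
            logger.info(line)
        summaries.append(line)
    return summaries
```

In a Flask callback handler you would pass the parsed request body, for example summarize_play_events(request.get_json()). If no Play event shows up at all, the play request probably never reached ACS, or the callback URL is not reachable.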
    5. Before uploading the file to a public server, test it locally:
    • Use tools like VLC or Audacity to confirm the file plays correctly.
    • Make sure the file has no corruption or encoding issues.
    6. If you are serving the files from a local server, ACS may not be able to reach it:
    • Move the files to a reliable and scalable hosting service such as Azure Blob Storage.
    • Give the container a public access level using bash: az storage container set-permission --name <container> --account-name <account> --public-access container
    7. Update the Python SDK and its dependencies using bash: pip install --upgrade azure-communication-callautomation
    8. Use the ACS diagnostic tools to record audio playback and identify issues - https://learn.microsoft.com/en-us/azure/communication-services/resources/troubleshooting/voice-video-calling/references/how-to-collect-diagnostic-audio-recordings

    I hope this is helpful! Do not hesitate to let me know if you have any other questions.


    Please don't forget to close out the thread by upvoting this answer and accepting it if it is helpful.
