Real-time Transcription in ACS Group video calls

Madushika Shiromani
2025-01-10T05:09:32.3066667+00:00

I have been working on implementing the Azure Communication Services (ACS) real-time transcription feature, following the steps outlined in this tutorial: https://learn.microsoft.com/en-us/azure/communication-services/how-tos/call-automation/real-time-transcription-tutorial?pivots=programming-language-csharp

Here is the workflow I am following:

  1. From the frontend, I capture the serverCallId when starting a call using the Call Composite.
  2. I pass the serverCallId to my backend API and invoke the StartTranscription method.
  3. I have set up a WebSocket server to receive the transcribed data in real time.
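
For step 3, here is a minimal sketch of the receiving side. It is not the actual server from my project: `TextMessageAssembler` and `TranscriptionSocket` are hypothetical helper names, and the sketch only reassembles each text message and hands the raw JSON to a callback, without parsing the payload.

```csharp
using System;
using System.IO;
using System.Net.WebSockets;
using System.Text;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper (not part of the ACS SDK): collects WebSocket frames
// until a complete text message has arrived.
public sealed class TextMessageAssembler
{
    private readonly MemoryStream _buffer = new MemoryStream();

    // Appends one frame. Returns true and the decoded UTF-8 text once the
    // frame marked as end-of-message has been appended; otherwise false.
    public bool TryAppend(byte[] frame, int count, bool endOfMessage, out string message)
    {
        _buffer.Write(frame, 0, count);
        if (!endOfMessage)
        {
            message = string.Empty;
            return false;
        }

        // Decode only after all frames are in, so multi-byte characters
        // split across frames are handled correctly.
        message = Encoding.UTF8.GetString(_buffer.ToArray());
        _buffer.SetLength(0);
        return true;
    }
}

public static class TranscriptionSocket
{
    // Reads messages until the peer closes the connection and passes each
    // complete raw payload to onMessage.
    public static async Task ReceiveLoopAsync(WebSocket socket, Action<string> onMessage, CancellationToken ct)
    {
        var assembler = new TextMessageAssembler();
        var chunk = new byte[4096];

        while (socket.State == WebSocketState.Open)
        {
            WebSocketReceiveResult result = await socket.ReceiveAsync(new ArraySegment<byte>(chunk), ct);
            if (result.MessageType == WebSocketMessageType.Close)
            {
                await socket.CloseAsync(WebSocketCloseStatus.NormalClosure, "closing", ct);
                break;
            }

            if (assembler.TryAppend(chunk, result.Count, result.EndOfMessage, out string message))
            {
                onMessage(message);
            }
        }
    }
}
```

In an ASP.NET Core app, the `WebSocket` instance would typically come from `app.UseWebSockets()` plus `await context.WebSockets.AcceptWebSocketAsync()` on whatever path the transport URL points to (assumed here to be `/ws`).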

The StartTranscription method returns a 202 response with a valid CallConnectionId (e.g. "16008280-6e77-4226-ac14-1d1235dffe21"). However, when I try to stop the transcription using that CallConnectionId, I get the error shown at the end of this post. This is the start endpoint:

[HttpPost("start-transcription")]
public async Task<IActionResult> StartTranscription([FromBody] StartRecordingRequest request)
{
    try
    {
        var serverCallId = new ServerCallLocator(request.ServerCallId);
        var websocketUri = callbackUriHost.Replace("https", "wss") + "ws";
        _logger.LogInformation($"Callback url: {callbackUri}, websocket Url: {websocketUri}");

        var callInvite = new GroupCallLocator(request.GroupCallId);
        var connectOptions = new ConnectCallOptions(callInvite, callbackUri)
        {
            CallIntelligenceOptions = new CallIntelligenceOptions()
            {
                CognitiveServicesEndpoint = new Uri(_cognitiveServicesEndpoint)
            },
            TranscriptionOptions = new TranscriptionOptions(
                new Uri(websocketUri),
                "en-US", 
                false, 
                TranscriptionTransport.Websocket 
            ),
        };
        var createCallResultResponse = _callAutomationClient.ConnectCall(connectOptions);
        var createCallResult = createCallResultResponse.Value;
        var callConnection = _callAutomationClient.GetCallConnection(createCallResult.CallConnection.CallConnectionId);
        var callState = callConnection.GetCallConnectionProperties().Value.CallConnectionState;

        var callMedia = callConnection.GetCallMedia();
        StartTranscriptionOptions startTranscriptionOptions = new StartTranscriptionOptions()
        {
            Locale = "en-US",
            OperationContext = "startMediaStreamingContext"
        };

        await callMedia.StartTranscriptionAsync(startTranscriptionOptions);
        _logger.LogInformation("Real-time transcription started...");
        
        // Log any TranscriptionFailed event raised for this call connection.
        _callAutomationClient.GetEventProcessor().AttachOngoingEventProcessor<TranscriptionFailed>(
        createCallResult.CallConnection.CallConnectionId, async (failedEvent) =>
        {
            _logger.LogInformation($"Received transcription event: {failedEvent.GetType()}, CorrelationId: {failedEvent.CorrelationId}, " +
                $"SubCode: {failedEvent?.ResultInformation?.SubCode}, Message: {failedEvent?.ResultInformation?.Message}");
        });
        return Ok(new { CallConnectionId = createCallResult.CallConnection.CallConnectionId });
    }
    catch (Exception ex)
    {
        _logger.LogError(ex, "Error occurred while starting the transcription.");
        return BadRequest(new { error = ex.Message });
    }
}
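
One detail I am unsure about in the method above is how the WebSocket URL is built. `callbackUriHost.Replace("https", "wss") + "ws"` only yields a valid URL if `callbackUriHost` ends with a slash; without it, `https://example.com` would become `wss://example.comws`. A minimal sketch of a safer construction follows, where `TransportUri.ToWebSocketUri` is a hypothetical helper and the `ws` path segment is taken from the code above:

```csharp
using System;

public static class TransportUri
{
    // Hypothetical helper: derives a ws/wss URL from an http/https callback host
    // and appends the given path segment with exactly one separating slash.
    public static Uri ToWebSocketUri(string callbackHost, string path = "ws")
    {
        var http = new Uri(callbackHost, UriKind.Absolute);

        string scheme;
        if (http.Scheme == Uri.UriSchemeHttps)
        {
            scheme = "wss";
        }
        else if (http.Scheme == Uri.UriSchemeHttp)
        {
            scheme = "ws";
        }
        else
        {
            throw new ArgumentException("Expected an http or https URL.", nameof(callbackHost));
        }

        // Keep an explicit port only when it is not the default for the scheme.
        string authority = http.IsDefaultPort ? http.Host : $"{http.Host}:{http.Port}";
        string basePath = http.AbsolutePath.TrimEnd('/');

        return new Uri($"{scheme}://{authority}{basePath}/{path.TrimStart('/')}");
    }
}
```

With this helper, `TransportUri.ToWebSocketUri("https://example.com")` and `TransportUri.ToWebSocketUri("https://example.com/")` both produce `wss://example.com/ws`.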

This returns 202 with a CallConnectionId. But when I then try to stop the transcription with that CallConnectionId, I get the error shown after the following stop endpoint:

[HttpPost("stop-transcription")]
public async Task<IActionResult> StopTranscription([FromBody] StopRecordingRequest request)
{
    try
    {
        var callConnection = _callAutomationClient.GetCallConnection(request.CallConnectionId);
        var callMedia = callConnection.GetCallMedia();
        StopTranscriptionOptions stopOptions = new StopTranscriptionOptions()
        {
            OperationContext = "stopTranscription"
        };
        await callMedia.StopTranscriptionAsync(stopOptions);
        return Ok(new { message = "Transcription stopped" });
    }
    catch (Exception ex)
    {
        _logger.LogError(ex, "Error occurred while stopping the transcription.");
        return BadRequest(new { error = ex.Message });
    }
}

The stop endpoint returns the following response:

{ "error": "Invalid action, Transcription is not active.\r\nStatus: 412 (Precondition Failed)\r\nErrorCode: 8583\r\n\r\nContent:\r\n{\"error\":{\"code\":\"8583\",\"message\":\"Invalid action, Transcription is not active.\"}}\r\n\r\nHeaders:\r\nDate: Thu, 09 Jan 2025 20:47:25 GMT\r\nConnection: keep-alive\r\nX-Microsoft-Skype-Client: REDACTED\r\nx-ms-client-request-id: 0576b522-c569-4a9c-ad8e-db71ff4023fe\r\nX-Microsoft-Skype-Chain-ID: REDACTED\r\nx-azure-ref: REDACTED\r\nStrict-Transport-Security: REDACTED\r\nX-Cache: REDACTED\r\nContent-Type: application/json; charset=utf-8\r\nContent-Length: 82\r\n" }

It seems the transcription is not active when I try to stop it, but I am unsure why. Could you please help me identify the issue? Am I missing any steps or configurations?
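
To narrow this down, one option I am considering is to record the transcription events delivered to my callback endpoint and check them before calling StopTranscriptionAsync. The sketch below is only an illustration under assumptions: `TranscriptionStateTracker` is a hypothetical class (not part of the SDK), and it assumes the callback handler passes in the short event names `TranscriptionStarted`, `TranscriptionStopped`, and `TranscriptionFailed` together with the CallConnectionId.

```csharp
using System.Collections.Concurrent;

public enum TranscriptionState { Unknown, Started, Stopped, Failed }

// Hypothetical helper: remembers the last transcription event seen per call connection.
public sealed class TranscriptionStateTracker
{
    private readonly ConcurrentDictionary<string, TranscriptionState> _states =
        new ConcurrentDictionary<string, TranscriptionState>();

    // Records an event by its short type name; unrelated events are ignored.
    public void Record(string callConnectionId, string eventType)
    {
        switch (eventType)
        {
            case "TranscriptionStarted":
                _states[callConnectionId] = TranscriptionState.Started;
                break;
            case "TranscriptionStopped":
                _states[callConnectionId] = TranscriptionState.Stopped;
                break;
            case "TranscriptionFailed":
                _states[callConnectionId] = TranscriptionState.Failed;
                break;
        }
    }

    // Returns Unknown when no transcription event has been seen for the call.
    public TranscriptionState Get(string callConnectionId) =>
        _states.TryGetValue(callConnectionId, out var state) ? state : TranscriptionState.Unknown;

    // A stop request only makes sense once a start has been confirmed.
    public bool CanStop(string callConnectionId) =>
        Get(callConnectionId) == TranscriptionState.Started;
}
```

If `CanStop` is false when the stop endpoint is hit, the recorded state (Unknown versus Failed) would at least show whether a TranscriptionStarted event ever arrived for that CallConnectionId.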
