Azure speech to text appears very slow
Hi team,
We have observed that Azure speech-to-text is very slow. I am using continuousRecognitionAsync, and Azure takes close to 6 s in total to process just 3 s of audio.
The parameters that I've set are:
EndSilenceTimeoutMs = 750
InitialSilenceTimeoutMs = 180000
The language is set to en-US.
I am using the standard tier, and this latency seems extremely high. Is this expected, or can we adjust the parameters or the model to get acceptable latency?
A few observations that concern me:
- Speech start detection takes close to 1.2 s.
- After speech start is detected, the model takes 3 more seconds to return the last recognizing response.
- After the last recognizing response, it takes almost 2 seconds to return the recognized response.
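For reference, the three intervals above can be added up as in the minimal Python sketch below. The names `EventTimes` and `latency_breakdown` are hypothetical helpers, not part of the Speech SDK, and the timestamps are the approximate, rounded figures quoted above, measured from the moment audio streaming starts.

```python
from dataclasses import dataclass


@dataclass
class EventTimes:
    """Seconds elapsed since audio streaming began, one field per event."""
    speech_start_detected: float
    last_recognizing: float
    recognized: float


def latency_breakdown(t: EventTimes) -> dict[str, float]:
    """Split the end-to-end time into the three stages listed above."""
    return {
        "until_speech_start_detected": t.speech_start_detected,
        "speech_start_to_last_recognizing": t.last_recognizing - t.speech_start_detected,
        "last_recognizing_to_recognized": t.recognized - t.last_recognizing,
        "total": t.recognized,
    }


# Approximate figures from the observations above (assumed, rounded).
observed = EventTimes(speech_start_detected=1.2, last_recognizing=4.2, recognized=6.2)
for stage, seconds in latency_breakdown(observed).items():
    print(f"{stage}: {seconds:.2f} s")
```

With these rounded inputs the stages sum to roughly 6 s, which matches the total I see for a 3 s clip.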
Is continuousRecognitionAsync expected to take this long? Please suggest some best practices for getting faster responses from Azure with real-time streaming.
Regards,
Sai Vishnu Soudri