How to set Speech sensitivity of Speech to Text to ignore all noise.
I need to adjust the speech sensitivity for noisy environments. How can I set the speech sensitivity of Speech to Text so that it ignores all noise?
Issue with Continuous Speech Recognition Returning Omitted Words in Azure Speech Service
Dear Azure Technical Support, I’m using the Azure Speech Service for continuous speech recognition, following the official JavaScript sample from the cognitive-services-speech-sdk repository. I’ve encountered a behavior I’d like to clarify. When using…
Unable to Get Logical Results with Azure Pronunciation Assessment
I'm trying to use the pronunciationAssessment feature in the Azure Speech SDK, but I cannot get reasonable results. Here's the code I ran: using Microsoft.CognitiveServices.Speech; using Microsoft.CognitiveServices.Speech.Audio; using…
IPA phoneme for "Herrera" doesn't sound right
Hi, Here's what I'm using for the IPA phoneme for the Spanish name "Herrera." /eˈreɾa/ However, the first "r" isn't rolled and the second "r" sounds like a T. Is there another phoneme element I can use to get the rolled…
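For reference, a minimal sketch of how an IPA transcription like the one in this question is usually wrapped in an SSML `phoneme` element. The helper name `phoneme_ssml` is hypothetical, and the voice name is borrowed from another question on this page purely as a placeholder; the question does not say which voice was used. This only builds the markup and does not guarantee that a given voice renders the trill as expected.

```python
from xml.sax.saxutils import escape, quoteattr


def phoneme_ssml(text: str, ipa: str, voice: str, lang: str = "en-US") -> str:
    """Build an SSML document that pronounces `text` using the IPA string `ipa`.

    `voice` and `lang` are caller-supplied assumptions; nothing here checks
    that the chosen voice supports every IPA symbol.
    """
    return (
        '<speak version="1.0" '
        'xmlns="http://www.w3.org/2001/10/synthesis" '
        f"xml:lang={quoteattr(lang)}>"
        f"<voice name={quoteattr(voice)}>"
        # quoteattr/escape keep the markup well-formed for arbitrary input
        f'<phoneme alphabet="ipa" ph={quoteattr(ipa)}>{escape(text)}</phoneme>'
        "</voice></speak>"
    )


# IPA string taken from the question (without the surrounding slashes).
ssml = phoneme_ssml("Herrera", "eˈreɾa", "en-US-AndrewMultilingualNeural")
print(ssml)
```

Building the string through a helper makes it easy to try alternative transcriptions for the same name and compare the audio.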
Inconsistencies in IPA Pronunciation in Text to Speech
Hi, I'm using SSML to ensure specific pronunciation; however, I'm experiencing some inconsistencies. For example, here's the word 'would': <speak version='1.0' xmlns='http://www.w3.org/2001/10/synthesis' xml:lang='en-US'> <voice…
Will word boundary event always be triggered before the Synthesizing event?
We are using the Speech SDK for text to speech, and we need to highlight the word being spoken by leveraging the word boundary event. From…
Speech to Text with Twilio: Telugu transcript comes back empty and the system initially does not respond
async def receive_json(self, text_data): try: event = text_data.get('event') if event == 'connected': logger.info("WebSocket connected event received") elif event == 'start': …
Bug Report: Mispronunciation of Isolated Hungarian Words in Azure Neural TTS (hu-HU-NoemiNeural), but not in context
Description: The Azure Neural TTS system is mispronouncing specific Hungarian words when using the hu-HU-NoemiNeural voice. The issue affects more than half of the vocabulary words in a recent production run of words (full SSML shared at bottom of this…
How to disable the default "Disfluency Removal" of filler words after STT transcription in Azure AI Speech?
Azure AI Speech Services defaults to removing many filler words (uh, eh, etc.) via post-transcription "Disfluency Removal". My use case includes presentation analysis for filler words, which requires a verbatim transcript. Is there a…
Azure Speech Service Batch Synthesis
The Azure Speech Service Batch Synthesis API is not creating the file as MP3 even though the output format is set correctly (audio-24khz-160kbitrate-mono-mp3). The speech is created as a WMA file instead.
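One way to narrow this down is to check what the downloaded bytes actually are, independent of the file extension or the MIME type reported by the browser. The sketch below is a generic diagnostic, not part of the Speech SDK: the function name `sniff_audio_format` is hypothetical, and it only recognizes a few well-known file signatures (ASF/WMA header GUID, MP3 `ID3` tag or frame sync, RIFF/WAVE).

```python
# ASF header object GUID; WMA files are stored in an ASF container.
ASF_GUID = bytes.fromhex("3026b2758e66cf11a6d900aa0062ce6c")


def sniff_audio_format(header: bytes) -> str:
    """Guess the container from the first bytes of a file."""
    if header.startswith(ASF_GUID):
        return "asf/wma"
    if header.startswith(b"ID3"):
        # MP3 file that begins with an ID3v2 tag
        return "mp3"
    if len(header) >= 2 and header[0] == 0xFF and (header[1] & 0xE0) == 0xE0:
        # MPEG audio frame sync: 11 set bits at the start of a frame
        return "mp3"
    if header.startswith(b"RIFF") and header[8:12] == b"WAVE":
        return "wav"
    return "unknown"


def sniff_file(path: str) -> str:
    """Read the first 16 bytes of `path` and classify them."""
    with open(path, "rb") as f:
        return sniff_audio_format(f.read(16))
```

If `sniff_file` reports `mp3` for the downloaded result, the audio itself is MP3 and only the file name or association is misleading; if it reports `asf/wma`, the service output really is in a different container.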
Can Pronunciation assessment be used with REST API?
Is it possible to utilize Pronunciation assessment with REST API and if so, what are the necessary steps to make it work?
Speech service SDK usage and issues
I am trying to connect Azure AI Speech with my Azure OpenAI resource so that I can send queries to Azure OpenAI either by text or by voice. Currently, I have issues connecting Azure AI Speech with my backend, which is Node.js. I am…
Azure TTS Error 404
I get error 404 when trying to fetch the mp3 file via fetch. I am using Node.js in the backend. More details: I created a functionality in my app that creates an XML document containing all SSML tags as specified by Microsoft Azure. Is it possible some…
Issue with Continuous Language Identification in Azure Speech SDK for Angular Application
We are currently using the "microsoft-cognitiveservices-speech-sdk" in our Angular application (version 14) for speech transcription and translation. The transcription and translation functionality is working as expected. However, we are…
Azure Speech Studio Andrew Multilingual voice sounds glitchy
I'm having some issues with the Andrew Multilingual (en-US-AndrewMultilingualNeural) voice in the Azure Speech Studio. There are a few instances in which the voice sounds raspy and rather glitchy. It seems to have a lot of trouble with the word…
SpeakSsmlAsync Result always Canceled
Hello, I am building a project using Azure's SpeechSynthesizer (log attached: SpeechLog.txt). I am running into the following problem: when calling SpeakSsmlAsync(ssmlText), the result always has a canceled state, and I am having a hard time understanding why. When I…
Can the "Post-call transcription and analytics" API work with Node.js?
I need to know whether the "Post-call transcription and analytics" API can work with Node.js. If not, where can I get a proper conversation conversion API with multi-user and multi-language detection that returns text with given…
When using batch speech transcription, the ITN feature only applies to the first option of the nBest results.
When using batch transcription, the ITN feature only applies to the first option of the nBest results, which is not necessarily the one with the highest confidence. The batch transcription service returns a JSON result with the following structure…
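To illustrate the mismatch described here, a small sketch that selects the highest-confidence candidate from an `nBest` list instead of assuming it is the first entry. The helper name `best_candidate` and the sample data are hypothetical, and the field names (`confidence`, `display`) are an assumption about the result format, since the structure is cut off in the excerpt above.

```python
from typing import Optional


def best_candidate(n_best: list) -> Optional[dict]:
    """Return the nBest entry with the highest confidence, or None if empty.

    Entries without a "confidence" field are treated as confidence 0.0.
    """
    if not n_best:
        return None
    return max(n_best, key=lambda c: c.get("confidence", 0.0))


# Hypothetical data: the first entry is not the most confident one.
phrase = {
    "nBest": [
        {"confidence": 0.62, "display": "I have 2 apples."},
        {"confidence": 0.81, "display": "i have two apples"},
    ]
}

top = best_candidate(phrase["nBest"])
print(top["confidence"], top["display"])
```

Comparing `phrase["nBest"][0]` with `best_candidate(phrase["nBest"])` for each recognized phrase is a quick way to count how often the ITN-formatted entry differs from the most confident one.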
Getting error code 0x38 (SPXERR_AUDIO_SYS_LIBRARY_NOT_FOUND) when deployed to cloud.
I am working on an interactive real-time communication app that uses both the speech synthesizer and the recognizer. In development it works fine, but when I deployed it through an Azure Web App on a Linux server, it gave this error. I don't want to process any…
Stopping Audio Playback Mid-Stream with Microsoft Neural TTS Service and Speech SDK
I'm working with the Microsoft Neural Text-to-Speech (TTS) service using the Speech SDK. I've successfully implemented audio playback, but I'm facing a challenge with controlling the playback mid-stream. My question is: How can I implement a feature to…