1,833 questions with Azure AI Speech tags

1 answer

Accuracy issues reading numbers with Text to speech voices

I wanted your perspective on an issue I'm having with two new voices reading numbers. Occasionally, Adam Multilingual and Lewis Multilingual will read numbers that start with "four" in a way that sounds like "for five six seven" like…

Azure AI Speech
An Azure service that integrates speech processing into apps and services.
1,833 questions
Azure
A cloud computing platform and infrastructure for building, deploying and managing applications and services through a worldwide network of Microsoft-managed datacenters.
1,029 questions
asked 2024-12-20T19:23:41.6233333+00:00
Andrew Silagy 0 Reputation points
answered 2024-12-22T15:38:00.97+00:00
Sina Salam 14,161 Reputation points
3 answers

How to use the Whisper model to transcribe audio in real time using the Speech SDK?

How do I use the Whisper model to transcribe microphone input in real time using the microsoft-cognitiveservices-speech-sdk npm package? I currently have this working and my region is set to northcentralus. I want to know how to use Whisper to transcribe in…

asked 2024-03-30T07:34:45.81+00:00
Nas 0 Reputation points
answered 2024-12-21T05:49:32.9633333+00:00
Meezaan Ryklief👍🙂 0 Reputation points
0 answers

Azure Speech JS SDK Returns Single Item in NBest Array

When using the Cognitive Services JavaScript Speech SDK with OutputFormat.Detailed and the recognizeOnceAsync approach, the NBest array consistently contains only a single object instead of the expected multiple alternatives. For example, when…

asked 2024-12-18T13:53:27.55+00:00
Tom D 0 Reputation points
commented 2024-12-20T07:22:41.6333333+00:00
navba-MSFT 26,805 Reputation points Microsoft Employee
1 answer One of the answers was accepted by the question author.

Why do I get wrong visemes when using German with English phrases?

I am using "Azure Speech" to synthesize speech from a text input, and also to generate visemes. When using German, if I use an English phrase, it sends back wrong visemes. The ts is not good; the last viseme has ts: 0, which should not happen.…

asked 2024-12-18T11:52:12.4866667+00:00
Veljko Markovic | Babylon Engineer 20 Reputation points
edited a comment 2024-12-19T20:07:13.07+00:00
Sina Salam 14,161 Reputation points
1 answer

How to access Whisper in Azure Speech Batch Transcription

When listing base models from the API, I do not see the Whisper option. How do I enable it? I am following this tutorial: https://learn.microsoft.com/en-us/azure/ai-services/speech-service/batch-transcription-create?pivots=rest-api#use-a-whisper-model This is…

asked 2024-12-17T13:05:04.4933333+00:00
Václav Bílek 0 Reputation points
commented 2024-12-19T07:45:04.39+00:00
romungi-MSFT 48,121 Reputation points Microsoft Employee
1 answer One of the answers was accepted by the question author.

What input formats does Batch Speech to Text support?

Is there any documentation or guidance on the input formats supported by the Batch Speech to Text service? I have two mp4 files with different properties; one can be transcribed (bitrate 62 kbps, mono, 16000 Hz), while the other cannot…

asked 2024-12-19T03:13:30.4533333+00:00
日立s 018 20 Reputation points
commented 2024-12-19T04:57:46.9366667+00:00
navba-MSFT 26,805 Reputation points Microsoft Employee
1 answer One of the answers was accepted by the question author.

"Internal server error" from Azure on Filipino speech-to-text

We have been using Azure speech-to-text / transcription services for generating transcripts. Recently (I first noticed this on Monday 9 December 2024) we have seen a very high chance of "Internal server error" in the transcription result of…

asked 2024-12-11T22:10:53.2533333+00:00
James Hu 20 Reputation points
accepted 2024-12-18T10:08:15.6866667+00:00
James Hu 20 Reputation points
0 answers

SpeechSynthesizer causes JVM crash during finalization (Speech client-sdk for Java)

https://github.com/Azure-Samples/cognitive-services-speech-sdk/issues/2701 As I detailed in the above GitHub issue, there is a reproducible bug in the Java client-sdk for Azure Speech. During finalization of a SpeechSynthesizer instance, it can cause a…

asked 2024-12-17T12:41:19.6966667+00:00
Lucas Mikulla 0 Reputation points
commented 2024-12-18T09:49:15.11+00:00
navba-MSFT 26,805 Reputation points Microsoft Employee
1 answer One of the answers was accepted by the question author.

About speaker separation in "fast-transcription-api"

Dear Azure Support Team, https://learn.microsoft.com/en-us/rest/api/speechtotext/transcriptions/transcribe?view=rest-speechtotext-2024-05-15-preview&tabs=HTTP The details of the TranscribeDefinition class are not described anywhere, so how should I do…

asked 2024-07-29T03:39:03.9533333+00:00
y.ashibe 45 Reputation points
commented 2024-12-18T09:08:29.2066667+00:00
Schuster, Björn 0 Reputation points
2 answers

Inquiry About Azure Speech SDK for Apple Vision OS

Dear Azure Support Team, we have encountered some challenges and would appreciate your assistance. Here are the details of our inquiry: Current Status: We are currently trying to implement the Azure Speech SDK on Apple Vision OS. Unfortunately, we…

asked 2024-12-16T09:34:24.6666667+00:00
XRSPACE-Bowen Huang 0 Reputation points
commented 2024-12-18T04:38:56.0233333+00:00
Saideep Anchuri 585 Reputation points Microsoft Vendor
1 answer One of the answers was accepted by the question author.

Fine-tuning speech-to-text base model for better address recognition

Hello, my team is creating a solution to transcribe addresses with higher accuracy. Our initial benchmarks for using an STT base model for address transcription suggest that it needs to be improved before it can be used in a production environment. I…

asked 2024-12-12T14:03:18.8133333+00:00
Caesar Cavales 50 Reputation points
accepted 2024-12-17T13:21:38.8133333+00:00
Caesar Cavales 50 Reputation points
1 answer One of the answers was accepted by the question author.

Handling Special Characters in Azure TTS Input

Hello, I’ve noticed that when sending text with special characters like \n (newline) to the Azure Text-to-Speech (TTS) engine, the output is synthesized literally as "backslash n." For now, we’re removing \n before sending the text to the TTS…

asked 2024-12-16T15:13:10.15+00:00
Ananth Hegde (anahegde) 20 Reputation points
accepted 2024-12-17T12:20:29.9566667+00:00
Ananth Hegde (anahegde) 20 Reputation points
1 answer

Data upload from Azure Blob Storage for the usage in Speech Studio

Hello everyone, in Speech Studio we can browse for files locally. Is there a way to upload files from Blob Storage? Might this functionality be present in Azure AI Foundry in the future? Many thanks in advance for your assistance!

asked 2024-12-11T11:08:25.4666667+00:00
Mariana Logvinenko 20 Reputation points
commented 2024-12-16T09:14:12.2+00:00
santoshkc 11,370 Reputation points Microsoft Vendor
0 answers

Azure Speech - Custom Keyword

I'm trying to generate a .table file, but I'm not able to get a successful training. The training model gives a "fail" state after more than 48 hours of training. The process is: => Speech Studio => Custom Keyword => New Project => Train…

asked 2024-12-14T01:27:43.49+00:00
shahsen 0 Reputation points
commented 2024-12-16T08:11:55.0333333+00:00
romungi-MSFT 48,121 Reputation points Microsoft Employee
0 answers

Speech SDK speech to text segmentation silence timeout is not working as intended

I am using the Azure Translation recognizer to recognize and translate text. I am also using the segmentation silence timeout to detect shorter pauses and break sentences as short as possible. According to the documentation, the supported values are between 100…

asked 2024-12-10T11:27:34.7566667+00:00
H M Moniruzzaman 0 Reputation points
commented 2024-12-16T02:00:48.71+00:00
Avinash Devarakonda 445 Reputation points Microsoft Vendor
0 answers

TranslationRecognizer has stopped sending Synthesizing event in the past few days

My existing code (C#) has been using the Microsoft.CognitiveServices.Speech SDK (1.41.1) to perform speech translation with voice synthesis successfully for the past 3 months. In the past few days, the Synthesizing event has stopped firing. (Other events…

asked 2024-11-24T07:56:39.8066667+00:00
Billy Lo 0 Reputation points
commented 2024-12-13T13:37:17.1733333+00:00
Sean Kershaw 5 Reputation points
0 answers

How to Deploy Custom Neural Voice Models Locally (On-Premises) and Obtain Model Image Files

Hello Microsoft Support Team, I have successfully created and trained two Custom Neural Voice (CNV) models in Azure Cognitive Services. Currently, I can leverage these models via the Azure-hosted endpoints, but my goal is to deploy them locally in an…

asked 2024-12-11T10:49:34.1666667+00:00
Mohammed Salama 20 Reputation points
commented 2024-12-13T11:34:37.09+00:00
santoshkc 11,370 Reputation points Microsoft Vendor
1 answer

How much time did it take for a custom STT model to train?

Hey! 🙂 I've just trained a custom STT model using Azure Speech Services. However, I don't know how long the training took, as I can only see the creation date. Is there any way to check how long a particular training run took? Thanks a lot!

asked 2024-06-21T11:32:28.2566667+00:00
Bruno Goncalves Vaz (P) 20 Reputation points
commented 2024-12-13T08:29:21.07+00:00
Scott Clare 0 Reputation points
1 answer

Is it possible to get subtitles or a timed script with batch synthesis text to speech avatar?

Using the batch text-to-speech or batch avatar API, is it possible to get subtitles on the generated video? Or even better, to get a script of the text with timestamps? I was hoping to do some front end shenanigans by creating text highlights, as the…

asked 2024-09-13T15:07:19.0233333+00:00
d m 5 Reputation points
edited a comment 2024-12-13T07:20:04.3233333+00:00
Pritam Suwal Shrestha 0 Reputation points
1 answer

Audio48Khz192KBitRateMonoMp3 doesn't work: audio is always produced at 16 kHz

Why does my code always download the file at 16 kHz? const generateSsml = (text: string, voice: string): string => { return ` <speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis"…

asked 2024-12-11T20:27:18.8133333+00:00
Sebastian medina 0 Reputation points
answered 2024-12-12T17:00:16.19+00:00
Saideep Anchuri 585 Reputation points Microsoft Vendor