Audio48Khz192KBitRateMonoMp3 doesn't work: the audio always plays at 16 kHz

Sebastian medina 0 Reputation points
2024-12-11T20:27:18.8133333+00:00

Why does my code always download the file at 16 kHz?

const generateSsml = (text: string, voice: string): string => {
  return `
    <speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xmlns:mstts="http://www.w3.org/2001/mstts" xml:lang="en-US">
      <voice name="${voice}">
        <mstts:express-as style="cheerful">
          ${text}
        </mstts:express-as>
      </voice>
    </speak>
  `;
};

const textToSpeech = async (
  text: string,
  config: TextToSpeechConfig
): Promise<ArrayBuffer> => {
  const speechConfig = sdk.SpeechConfig.fromSubscription(
    config.key,
    config.region
  );

  if (text.length > 100) {
    // Reject input longer than 100 characters
    return Promise.reject(new Error("Text too long"));
  }

  speechConfig.setProperty(
    sdk.PropertyId.SpeechServiceConnection_SynthOutputFormat,
    sdk.SpeechSynthesisOutputFormat.Audio48Khz192KBitRateMonoMp3.toString()
  );

  const synthesizer = new sdk.SpeechSynthesizer(
    speechConfig,
    sdk.AudioConfig.fromDefaultSpeakerOutput()
  );
  const ssml = generateSsml(
    text,
    config?.voice || "en-US-AvaMultilingualNeural"
  );

  return new Promise((resolve, reject) => {
    synthesizer.speakSsmlAsync(
      ssml,
      (result) => {
        // Release the synthesizer before settling the promise
        synthesizer.close();
        if (result.errorDetails) {
          reject(new Error(result.errorDetails));
          return; // do not fall through to resolve()
        }
        resolve(result.audioData);
      },
      (error) => {
        synthesizer.close();
        reject(error);
      }
    );
  });
};
Azure AI Speech
An Azure service that integrates speech processing into apps and services.

1 answer

  1. Saideep Anchuri 585 Reputation points Microsoft Vendor
    2024-12-12T17:00:16.19+00:00

    Hi Sebastian medina

    Welcome to Microsoft Q&A Forum, thank you for posting your query here!

    It seems that you are trying to generate mono MP3 audio with a sample rate of 48 kHz and a bit rate of 192 kbps. The Speech service lists this format among its supported MP3 output formats:

    • audio-16khz-32kbitrate-mono-mp3
    • audio-24khz-48kbitrate-mono-mp3
    • audio-24khz-96kbitrate-mono-mp3
    • audio-24khz-160kbitrate-mono-mp3
    • audio-48khz-96kbitrate-mono-mp3
    • audio-48khz-192kbitrate-mono-mp3

    It is possible that a bug or limitation in the SDK causes it to fall back to 16 kHz. Not all voices support all audio formats, so make sure that the voice you are using (en-US-AvaMultilingualNeural) supports the Audio48Khz192KBitRateMonoMp3 format. If the selected voice does not support the requested format, the service may fall back to a lower-quality format such as 16 kHz.
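
    One thing worth checking, assuming the SDK's SpeechSynthesisOutputFormat is a numeric TypeScript enum: calling .toString() on a member of a numeric enum yields its number as text rather than the format name, so the value passed to setProperty may not be recognized. The sketch below shows that behavior with a stand-in enum. OutputFormat and its member values are hypothetical and are not the SDK's actual definitions.

```typescript
// Hypothetical stand-in for a numeric enum such as the SDK's
// SpeechSynthesisOutputFormat. Members and values are illustrative only.
enum OutputFormat {
  Riff16Khz16BitMonoPcm,        // 0
  Audio48Khz96KBitRateMonoMp3,  // 1
  Audio48Khz192KBitRateMonoMp3, // 2
}

// What the question's code passes to setProperty: the numeric value as text.
const viaToString: string =
  OutputFormat.Audio48Khz192KBitRateMonoMp3.toString();

// The reverse mapping of a numeric enum yields the member name instead.
const viaName: string =
  OutputFormat[OutputFormat.Audio48Khz192KBitRateMonoMp3];

console.log(viaToString); // "2"
console.log(viaName);     // "Audio48Khz192KBitRateMonoMp3"
```

    If this is what is happening here, assigning the enum member directly with speechConfig.speechSynthesisOutputFormat = sdk.SpeechSynthesisOutputFormat.Audio48Khz192KBitRateMonoMp3, instead of calling setProperty with a string, may be enough. This has not been verified against the setup in the question.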

    For more information, refer to the audio-content-creation documentation.

    Thank You.
