Audio48Khz192KBitRateMonoMp3 doesn't work: the output is always 16 kHz

Question

Why does my code always produce a 16 kHz file?

const generateSsml = (text: string, voice: string): string => {
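  // Minimal SSML wrapper (assumed; the original markup is not shown).
  return `
    <speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xml:lang="en-US">
      <voice name="${voice}">
        ${text}
      </voice>
    </speak>
  `;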
};

const textToSpeech = async (
  text: string,
  config: TextToSpeechConfig
): Promise<ArrayBuffer> => {
  const speechConfig = sdk.SpeechConfig.fromSubscription(
    config.key,
    config.region
  );

  if (text.length > 100) {
    // Reject inputs longer than 100 characters.
    return Promise.reject(new Error("Text too long"));
  }

  speechConfig.setProperty(
    sdk.PropertyId.SpeechServiceConnection_SynthOutputFormat,
    sdk.SpeechSynthesisOutputFormat.Audio48Khz192KBitRateMonoMp3.toString()
  );

  const synthesizer = new sdk.SpeechSynthesizer(
    speechConfig,
    sdk.AudioConfig.fromDefaultSpeakerOutput()
  );
  const ssml = generateSsml(
    text,
    config?.voice || "en-US-AvaMultilingualNeural"
  );

  return new Promise((resolve, reject) => {
    synthesizer.speakSsmlAsync(
      ssml,
      (result) => {
        if (result.errorDetails) {
          reject(new Error(result.errorDetails));
        }
        const { audioData } = result;
        synthesizer.close();
        resolve(audioData);
      },
      (error) => {
        synthesizer.close();
        reject(error);
      }
    );
  });
};

Answer

Hi Sebastian Medina,

Welcome to Microsoft Q&A Forum, thank you for posting your query here!

You are trying to generate mono MP3 audio with a 48 kHz sample rate and a 192 kbps bit rate. The MP3 output formats supported by the Speech service include the following, and the format you requested is on the list:

audio-16khz-32kbitrate-mono-mp3
audio-24khz-48kbitrate-mono-mp3
audio-24khz-96kbitrate-mono-mp3
audio-24khz-160kbitrate-mono-mp3
audio-48khz-96kbitrate-mono-mp3
audio-48khz-192kbitrate-mono-mp3
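
To confirm which of these formats a response actually used, one option is to read the sample rate from the first MP3 frame header in the returned bytes. The sketch below is an illustration, not part of the Speech SDK: `mp3SampleRate` is a hypothetical helper, and its naive scan for a frame sync could be misled by 0xFF bytes inside a leading ID3 tag.

```typescript
// Sample rates in Hz, indexed by MPEG version bits and then by sample rate index.
// Version bits: 0b11 = MPEG-1, 0b10 = MPEG-2, 0b00 = MPEG-2.5 (0b01 is reserved).
const SAMPLE_RATES: Record<number, number[]> = {
  0b11: [44100, 48000, 32000],
  0b10: [22050, 24000, 16000],
  0b00: [11025, 12000, 8000],
};

// Hypothetical helper: returns the sample rate of the first plausible MP3 frame,
// or undefined if no frame header is found.
function mp3SampleRate(data: Uint8Array): number | undefined {
  for (let i = 0; i + 2 < data.length; i++) {
    // A frame header starts with 11 set bits (0xFF, then the top 3 bits of the next byte).
    if (data[i] !== 0xff || (data[i + 1] & 0xe0) !== 0xe0) continue;
    const version = (data[i + 1] >> 3) & 0b11;
    const layer = (data[i + 1] >> 1) & 0b11;
    const rateIndex = (data[i + 2] >> 2) & 0b11;
    // Skip reserved values, which indicate a false sync.
    if (version === 0b01 || layer === 0b00 || rateIndex === 0b11) continue;
    return SAMPLE_RATES[version][rateIndex];
  }
  return undefined;
}
```

With the code from the question, this could be called as `mp3SampleRate(new Uint8Array(audioData))` on the resolved buffer.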

It is possible that a bug or limitation in the SDK causes it to fall back to 16 kHz. Not all voices support all audio formats, so confirm that the voice you are using (en-US-AvaMultilingualNeural) supports the Audio48Khz192KBitRateMonoMp3 format. If the selected voice does not support the requested format, the service may fall back to a lower-quality format such as 16 kHz.
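
Another thing worth checking, offered as an assumption rather than a confirmed diagnosis, is how the format value reaches the SDK. If `SpeechSynthesisOutputFormat` is a numeric TypeScript enum, calling `.toString()` on a member yields its index as text rather than the format name, and an unrecognized value could plausibly lead to a default format. The sketch below uses a hypothetical stand-in enum (`DemoOutputFormat`, with illustrative members and ordering) to show the difference.

```typescript
// Hypothetical stand-in for a numeric enum such as the SDK's
// SpeechSynthesisOutputFormat; members and their order are illustrative only.
enum DemoOutputFormat {
  Audio16Khz32KBitRateMonoMp3,
  Audio24Khz48KBitRateMonoMp3,
  Audio48Khz192KBitRateMonoMp3,
}

const chosen = DemoOutputFormat.Audio48Khz192KBitRateMonoMp3;

// toString() on a numeric enum member gives the index as text, not the name.
const viaToString: string = chosen.toString(); // "2"

// The reverse mapping gives the member name instead.
const viaReverseMapping: string = DemoOutputFormat[chosen]; // "Audio48Khz192KBitRateMonoMp3"

// With the real SDK, the typed setter avoids the string conversion entirely:
// speechConfig.speechSynthesisOutputFormat =
//   sdk.SpeechSynthesisOutputFormat.Audio48Khz192KBitRateMonoMp3;
```

Assigning the enum member through the `speechSynthesisOutputFormat` property on `SpeechConfig`, instead of `setProperty` with `.toString()`, is therefore a reasonable first change to try.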

For more information, refer to the audio-content-creation documentation.

Thank You.
