Azure OpenAI Whisper hallucinates source audio language

Shamus Sim 0 Reputation points
2025-02-26T06:38:39+00:00

Hi,

I have been experimenting with the Whisper service via Azure OpenAI services.

There are a lot of cases where the audio sent to the API is clearly English, but the output comes back as Malay (Bahasa Malaysia) text, even though I am using the transcription API and not the translation API.

Sample prompt and API call

```python
transcription = client_tts.audio.transcriptions.create(
    model="whisper",
    file=audio_file,
    prompt="""Transcribe the following audio file, it is taken in medical clinic setting, 
    return the original language and the transcription. Only return English, Bahasa Malaysia, and Mandarin language"""
)
```
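
For comparison, here is a minimal sketch of a variant that might be worth trying. It assumes the `language` parameter of the transcription endpoint (an ISO-639-1 code such as `en`, `ms`, or `zh`) and `response_format="verbose_json"`, which returns the detected language alongside the text. As far as I understand, Whisper treats `prompt` as context or a style hint, not as instructions to follow, so the language restriction in the prompt above may simply be ignored. The `transcribe` helper, the `ALLOWED_LANGUAGES` set, and the `deployment` argument are hypothetical names used only for illustration.

```python
# Hypothetical helper: pins the source language instead of relying on the prompt.
# ISO-639-1 codes: "en" = English, "ms" = Malay, "zh" = Chinese (Mandarin).
ALLOWED_LANGUAGES = {"en", "ms", "zh"}


def transcribe(client, audio_file, language=None, deployment="whisper"):
    """Call the transcription endpoint, optionally forcing the source language.

    `client` is assumed to be an already configured Azure OpenAI client
    (the `client_tts` object from the call above).
    """
    if language is not None and language not in ALLOWED_LANGUAGES:
        raise ValueError(f"unsupported language code: {language!r}")

    kwargs = {
        "model": deployment,                # Azure deployment name (assumed "whisper")
        "file": audio_file,
        "response_format": "verbose_json",  # response includes the detected language
    }
    if language is not None:
        kwargs["language"] = language       # skip auto-detection for this request

    return client.audio.transcriptions.create(**kwargs)
```

Calling `transcribe(client_tts, audio_file, language="en")` would force English for a known-English recording, while leaving `language` unset and reading the `language` field of the response shows what the model detected on its own.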
Azure OpenAI Service
