Why does ASR performance differ between Whisper (Hugging Face) and OpenAI Whisper?

Question

Hi,

I have been using Whisper via Azure OpenAI services and comparing its ASR performance (on the same dataset) with Whisper large-v3 via Hugging Face.

I have seen that Azure OpenAI Whisper performs consistently better across different metrics, such as WER, and even on semantic similarity scores such as BERTScore.

Is there any reason for this? Perhaps Azure hosts a newer version of Whisper that performs better than its open-source counterpart, Whisper large-v3.

Answer

Hi Shamus Sim,

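The Azure OpenAI model card shows that version 1 of Whisper is the Whisper large-v2 model.

So the model behind Azure's version 1 differs in architecture from large-v3, as described in the Hugging Face model card: large-v3 uses 128 Mel frequency bins instead of 80 and adds a language token for Cantonese.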

Reference: model card details from Azure AI Foundry (screenshot taken 2025-02-27).

The Azure product team also does additional fine-tuning before hosting the models in its datacenters, which might be the reason behind the better results.

https://github.com/openai/whisper/blob/main/model-card.md
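When comparing the two systems, it may also be worth scoring both sets of transcripts with the same code and the same text normalization, so that casing or punctuation differences do not show up as WER differences. Below is a minimal sketch of corpus-level WER in plain Python. The function names (normalize, edit_distance, corpus_wer) and the normalization rules are assumptions made for illustration; they are not part of Whisper, Azure OpenAI, or Hugging Face.

```python
import re
from typing import List, Sequence


def normalize(text: str) -> List[str]:
    """Lowercase, replace punctuation (except apostrophes) with spaces, split into words.

    This is an illustrative normalizer, not the one shipped with Whisper.
    """
    text = text.lower()
    text = re.sub(r"[^\w\s']", " ", text)
    return text.split()


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Word-level Levenshtein distance (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def corpus_wer(references: Sequence[str], hypotheses: Sequence[str]) -> float:
    """Total word errors divided by total reference words over the whole dataset."""
    if len(references) != len(hypotheses):
        raise ValueError("references and hypotheses must have the same length")
    errors = 0
    words = 0
    for ref, hyp in zip(references, hypotheses):
        ref_words = normalize(ref)
        errors += edit_distance(ref_words, normalize(hyp))
        words += len(ref_words)
    if words == 0:
        raise ValueError("references contain no words")
    return errors / words
```

Calling corpus_wer(references, azure_outputs) and corpus_wer(references, hf_outputs) on the same references (both output lists are hypothetical names here) gives numbers that are at least scored identically, which makes it easier to tell a real model difference from a scoring difference.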

Hope that helps.

Thank you.
