Hi Waleed,
Welcome to Microsoft Q&A forum. Thank you for posting your query.
I understand that you are seeing inconsistent pronunciation assessment scores when using the Azure Speech Pronunciation Assessment API with different SDKs:
Python SDK (azure.cognitiveservices.speech)
TypeScript SDK (microsoft-cognitiveservices-speech-sdk)
For the same audio file and the same reference text, you are getting significantly different results.
The FluencyScore is similar, but the other metrics (Accuracy, Completeness, Pronunciation) show large discrepancies.
Possible Causes & Solutions:
Check Pronunciation Assessment Configuration
Ensure that the parameters used in both SDKs are identical.
Key parameters to verify:
Grading System (100-point scale vs. 5-point scale)
Phoneme-level vs. Word-level assessment
Enable miscue analysis (missing words detection)
Granularity (Phoneme, Word, FullText)
Solution: Print/log the exact JSON payload sent in both Python and TypeScript to compare configurations.
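As a minimal sketch of that comparison, assuming the four documented configuration options (ReferenceText, GradingSystem, Granularity, EnableMiscue): serialize each side's settings to a canonical JSON string with sorted keys, so a plain string comparison reveals any mismatch between the Python and TypeScript setups. The helper name below is illustrative, not part of either SDK.

```python
import json

# Hedged sketch: both SDKs ultimately serialize the assessment settings to a
# JSON payload. Building that payload explicitly on each side makes the two
# configurations directly comparable. Field names follow the documented
# pronunciation assessment options; the helper itself is hypothetical.

def build_assessment_config(reference_text: str,
                            grading_system: str = "HundredMark",
                            granularity: str = "Phoneme",
                            enable_miscue: bool = True) -> str:
    """Return canonical JSON used to compare Python vs. TypeScript setups."""
    config = {
        "ReferenceText": reference_text,
        "GradingSystem": grading_system,   # "HundredMark" or "FivePoint"
        "Granularity": granularity,        # "Phoneme", "Word", or "FullText"
        "EnableMiscue": enable_miscue,     # missing/extra word detection
    }
    # sort_keys makes the output order-independent, so a plain string
    # comparison is enough to spot a mismatch.
    return json.dumps(config, sort_keys=True)

python_side = build_assessment_config("Hello world")
typescript_side = build_assessment_config("Hello world", granularity="Word")
print(python_side == typescript_side)  # False: granularity differs
```

Logging the same canonical string from the TypeScript side and diffing the two outputs pinpoints exactly which parameter diverges.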
Audio File Encoding Issues
Ensure that the audio file is properly formatted before being sent to the API.
The API expects a specific format (e.g., 16-bit PCM, 16kHz, mono).
TypeScript might be processing or encoding the audio differently.
Solution: Convert the audio to a standard format before sending it. Compare the byte size of the file when loaded in both languages.
SDK Version Differences
Different SDK versions may implement different scoring models.
Check that your Python and TypeScript SDKs are both updated to the same, latest version.
Solution: Log and compare the SDK version and locale configuration in both implementations.
Make sure both use the same pronunciation model locale (e.g., en-US vs. en-GB).
Check whether the TypeScript side is adding extra silence or noise to the audio.
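To check the extra-silence point concretely, here is a stdlib-only sketch that measures how much leading silence a WAV file contains; comparing this value for the bytes each SDK actually sends is a quick sanity check. The amplitude threshold is an assumption; tune it for your recordings.

```python
import io
import struct
import wave

# Hedged sketch: measure leading silence in a 16-bit mono WAV file. Extra
# silence or noise at the start can shift word timings and scores, so compare
# this value between the Python and TypeScript inputs. The 500-amplitude
# threshold is an assumption, not a service requirement.

def leading_silence_ms(data: bytes, threshold: int = 500) -> float:
    """Milliseconds of leading samples whose 16-bit amplitude is below threshold."""
    with wave.open(io.BytesIO(data), "rb") as wav:
        rate = wav.getframerate()
        frames = wav.readframes(wav.getnframes())
    samples = struct.unpack("<%dh" % (len(frames) // 2), frames)
    for i, sample in enumerate(samples):
        if abs(sample) >= threshold:
            return i * 1000.0 / rate
    return len(samples) * 1000.0 / rate  # the whole clip is silent

# Demo: 160 silent samples (10 ms at 16 kHz) followed by one loud sample.
buf = io.BytesIO()
with wave.open(buf, "wb") as wav:
    wav.setnchannels(1)
    wav.setsampwidth(2)
    wav.setframerate(16000)
    wav.writeframes(struct.pack("<161h", *([0] * 160 + [20000])))
print(leading_silence_ms(buf.getvalue()))  # 10.0
```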
Hope this helps. Do let us know if you have any further queries.
-------------
If this answers your query, do click "Accept Answer" and "Yes" for "Was this answer helpful".
Thank you.