Hi Thierry Tropée,
Greetings, and welcome to the Microsoft Q&A forum! Thanks for posting your query!
To prevent unwanted Kanji transcription of names when using Azure Speech-to-Text for Japanese batch transcription, you can leverage a combination of custom pronunciation dictionaries, custom speech models, and post-processing techniques.
First, creating a custom pronunciation dictionary lets you specify how names should be transcribed, so that a name like "ドイ" stays in Katakana rather than being converted into Kanji such as "土井" or "土居". You can do this in Azure Speech Studio by adding pronunciation rules that explicitly map each name to its desired written form.
If pronunciation dictionaries alone do not resolve the issue, training a custom speech model using labeled audio data where names are consistently written in Katakana can help. This involves collecting training data, uploading correctly transcribed text, and training a model to reinforce the desired transcription format.
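For the post-processing step, a simple option is to replace known Kanji renderings of a name with the Katakana form after the transcription completes. Below is a minimal Python sketch, not an official Azure feature. The mapping table and function names are hypothetical, and it assumes you have already extracted the transcript text from the batch transcription result as a plain string.

```python
import re

# Hypothetical mapping (an assumption for illustration):
# Kanji renderings the recognizer may produce -> desired Katakana form.
NAME_REPLACEMENTS = {
    "土井": "ドイ",
    "土居": "ドイ",
}


def build_pattern(replacements):
    """Compile one regex that matches any key in the mapping."""
    # Sort longest keys first so a longer name wins over a shorter prefix.
    keys = sorted(replacements, key=len, reverse=True)
    return re.compile("|".join(re.escape(k) for k in keys))


def restore_katakana_names(text, replacements=NAME_REPLACEMENTS):
    """Replace every mapped Kanji name in `text` with its Katakana form."""
    pattern = build_pattern(replacements)
    return pattern.sub(lambda m: replacements[m.group(0)], text)


if __name__ == "__main__":
    sample = "土井さんと土居さんが来ました"
    print(restore_katakana_names(sample))
```

Note that a plain string replacement also changes the same Kanji when they appear in other words, such as place names, so keep the mapping limited to the names you expect in your audio and review the output.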
I hope this information helps.