Literal Text Translation and Error Translation

Aashna Joshi 0 Reputation points
2024-10-22T05:57:06.76+00:00

Context: I am working on a Python-based text analyzer and translator app that accepts input in three formats (text, image, and audio) and produces output using four Azure services: Azure AI services, Speech service, Text Analytics, and Translator. The code is posted on GitHub. Almost everything works, except that some sentences are transliterated rather than translated. For example, take "Hey, my name is Vikas", where Vikas is a Hindi name that also means "development" but here is the name of a person. The language is detected correctly, but when I ask for a translation into hi (Hindi), the app simply prints "हे, माई नेम इस विकास": it treats the text as Hinglish and writes the English words in Devanagari script instead of translating them. Many Hindi names have another meaning (Pragati, Khushi, etc.), just as Vikas can mean development, and this seems to confuse the model so much that it returns what looks like gibberish rather than a proper translation.

Steps to Reproduce:

  • Clone and run the code after updating the values in the .env file (the main file is the console-based version, and the app file is the Streamlit-based version). The attached screenshots are from the Streamlit-based version.
  • Input the text "Hey, my name is Vikas." The analysis is displayed, and the app asks whether you want a translation; type the language code 'hi' for Hindi.
  • Observe the returned output: "हे, माई नेम इस विकास" instead of "अरे, मेरा नाम विकास है." Other names are translated correctly; for example, "Hey, my name is Manasvi" returns "अरे, मेरा नाम मनस्वी है".

Azure Translator
An Azure service to easily conduct machine translation with a simple REST API call.
Azure AI Language
An Azure service that provides natural language capabilities including sentiment analysis, entity extraction, and automated question answering.

1 answer

  1. romungi-MSFT 47,026 Reputation points Microsoft Employee
    2024-10-22T08:13:53.1+00:00

    @Aashna Joshi I think this behavior occurs because you are not passing the from language in your translate request. In that case, the API detects the source language as hi-latn (Hindi written in Latin script) and returns the Hindi response हे, माई नेम इस विकास.

    I see the same behavior when I use the REST API without the from language param.

    If I use the from param the response is as expected.

    I think you need to change your application so that it passes the from param in the call to translate text. Since you already have a method to detect the language, you can pass its result to the translate API as the source language code to avoid this discrepancy.
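As a rough illustration of that change, here is a minimal sketch that builds a Translator v3 REST request with an explicit from language, using only the Python standard library. The helper names (build_translate_request, translate), the global endpoint, and the key and region arguments are assumptions made for this example; they are not taken from the code in the question, so adapt them to however your app already calls the service.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the global Translator endpoint; a custom or regional endpoint may differ.
TRANSLATOR_ENDPOINT = "https://api.cognitive.microsofttranslator.com"


def build_translate_request(text, to_lang, from_lang=None, *, key, region):
    """Build (but do not send) a POST request for the v3 /translate operation.

    When from_lang is None, the 'from' query parameter is omitted and the
    service auto-detects the source language, which is the case that
    produced the transliterated output in this thread.
    """
    params = {"api-version": "3.0", "to": to_lang}
    if from_lang:
        params["from"] = from_lang  # explicit source language, e.g. "en"
    url = f"{TRANSLATOR_ENDPOINT}/translate?{urllib.parse.urlencode(params)}"
    headers = {
        "Ocp-Apim-Subscription-Key": key,
        "Ocp-Apim-Subscription-Region": region,
        "Content-Type": "application/json; charset=UTF-8",
    }
    body = json.dumps([{"Text": text}]).encode("utf-8")
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def translate(text, to_lang, from_lang, *, key, region):
    """Send the request and return the first translated string."""
    request = build_translate_request(
        text, to_lang, from_lang, key=key, region=region
    )
    with urllib.request.urlopen(request) as response:
        payload = json.load(response)
    return payload[0]["translations"][0]["text"]
```

With this in place, the app would call something like translate("Hey, my name is Vikas", "hi", detected_language, key=..., region=...), where detected_language is whatever code your existing detection step returns (expected to be "en" for this sentence).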

    I hope this helps!!

    If this answers your query, please click Accept Answer and select Yes for "Was this answer helpful". If you have any further questions, do let us know.
