How to Ensure GPT-4 (Azure OpenAI) Includes Original Terms in Parentheses During Translation

Mohamed Darwish 0 Reputation points
2025-01-26T13:11:45.1133333+00:00

I'm using the Azure OpenAI Service with the GPT-4 model (GPT-4o mini) to create a bot for translating psychoanalysis articles. The translations need to preserve specific technical terms from the source text by including them in parentheses alongside the translation. For instance:

Input:

Es ist bekanntlich die Absicht der analytischen Arbeit, den Patienten dahin zu bringen, daß er die Verdrängungen – im weitesten Sinne verstanden – seiner Frühentwicklung wieder aufhebe...

Desired output:

It is well known that the aim of analytical work is to bring the patient to a state in which they can lift repression (Verdrängungen) – understood in the broadest sense – of their early development...

Context

I’ve deployed the model as part of a Contoso web app and uploaded a glossary of technical terms (e.g., Verdrängungen, Affektregungen) along with sample translations to guide the model. The goal is to ensure consistent and accurate use of these terms in parentheses within the translations.

Challenges

When running the deployed model, the translations often:

Omit the original technical terms in parentheses.

Misplace or incorrectly format the terms, despite providing explicit instructions.

I tried creating a bot with the same prompts in the ChatGPT chat interface (outside Azure) and found that the translations were better, with terms consistently appearing in parentheses. However, in the Azure-deployed version, the output is less consistent, even with the same prompts and glossary.

What I’ve Tried

  1. Parameter Adjustments: Set temperature to 0.2 and top_p to 0.6 for deterministic results.
  2. Explicit Prompting: Included clear instructions like: "Translate the text while retaining technical terms in parentheses alongside the translated text."
  3. Few-Shot Examples: Added sample translations to the prompt:

    Example:
    German: Der Begriff "Verdrängung" ist zentral in der Psychoanalyse.
    English: The term "repression" (Verdrängung) is central in psychoanalysis.

  4. Uploading Sample Translations: Provided a list of example translations alongside the glossary for additional context.

Questions

Why is there a difference in translation quality between the Azure OpenAI deployment and the ChatGPT chat interface using the same prompts?

Are there specific adjustments I can make (e.g., prompt design, fine-tuning, or parameter tweaks) to ensure the model reliably includes original terms in parentheses?

Would creating a custom fine-tuned model for this use case (e.g., integrating the glossary) improve performance?

Any insights or suggestions on how to optimize the model for this translation workflow would be greatly appreciated!

Azure OpenAI Service

2 answers

  1. Manas Mohanty (Quadrant Resource LLC) 215 Reputation points Microsoft Vendor
    2025-01-28T06:52:22.2166667+00:00

    Hi Mohamed Darwish!

    Welcome to Microsoft Q&A Forum, thank you for posting your query here.

    It seems you are having difficulty prompting the model to produce the exact output you want.

    I tried the prompt below with sample inputs and got the desired output.

    """""

    From User:

    Please help me translate few two-line Germany statements in a manner such a way that all words within open bracket shall stay along translated word.

    for e.g

    (Ich) have heute viel Arbeit. Wir gehen am (Wochenende) ins Kino.

    (I) have today much work. We go on the (weekend) to the cinema.

    Assistant:

    Sure, I'd be happy to help. Could you please provide the German statements you'd like me to translate?

    From User:

    Deutschland ist bekannt für seine [reichhaltige] Geschichte und sein [vielfältiges] kulturelles Erbe.

    Die [malerischen] Landschaften und [innovative] Städte machen Deutschland zu einem beliebten Reiseziel.

    Assistant:

    Germany is known for its [rich] history and its [diverse] cultural heritage.

    The [picturesque] landscapes and [innovative] cities make Germany a popular travel destination.

    In the SDK

    You can read a file, split it into statements, and request the translation of each one with the chat_prompt below.

     chat_prompt = [
         {
             "role": "system",
             "content": "You are an AI assistant that helps people find information."
         },
         {
             "role": "user",
             "content": "Please help me translate few two line Germany statements in a manner such a way that all words within openbracket shall stay along translated word. for e.g (Ich) have heute viel Arbeit. Wir gehen am (Wochenende) ins Kino.\n(I) have today much work. We go on the (weekend) to the cinema."
         }
     ]
    

    SDK Reference guide

    """"

    Feel free to rephrase the prompt above and try it with your own sample inputs.

    Hope this helps. Do let us know if you have any further queries.


    If this answers your query, please click "Accept Answer" and "Yes" for "Was this answer helpful".

    Thank you.


  2. Sina Salam 17,016 Reputation points
    2025-01-29T03:51:51.4266667+00:00

    Hello Mohamed Darwish,

    Welcome to Microsoft Q&A, and thank you for posting your questions here.

    I understand that you would like to know how to ensure that GPT-4 (Azure OpenAI) includes original terms in parentheses during translation.

    Regarding your scenario and questions, there are several likely reasons why the ChatGPT web interface and the Azure OpenAI Service give different results. The ChatGPT web interface likely uses a more sophisticated system prompt that enforces structured responses, whereas the deployed Azure version may not have the same system-level reinforcement. The ChatGPT web version also retains more conversational context than the Azure OpenAI API, which operates statelessly unless you manage the conversation history yourself. Finally, Azure OpenAI deployments may have different model versions, batch processing constraints, or token limitations that affect the output.
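
    Because each API request is stateless, every call has to carry the system message, the glossary, and any few-shot examples itself. The following is a minimal sketch of assembling such a message list. `SYSTEM_PROMPT`, `build_messages`, the sample input sentence, and the glossary line format are all hypothetical choices made for illustration, not a required layout.

```python
from typing import Dict, List, Tuple

# Hypothetical system prompt; adapt the wording to your own deployment.
SYSTEM_PROMPT = (
    "You are a professional translator specializing in psychoanalysis. "
    "Translate German into English and keep each original German technical "
    "term in parentheses directly after its English translation."
)


def build_messages(
    german_text: str,
    glossary: Dict[str, str],
    examples: List[Tuple[str, str]],
) -> List[Dict[str, str]]:
    """Assemble a complete, self-contained message list for one request."""
    # Render each glossary entry in the exact output format we expect.
    glossary_lines = "\n".join(
        f"- {de} -> {en} ({de})" for de, en in glossary.items()
    )
    messages = [
        {"role": "system", "content": f"{SYSTEM_PROMPT}\n\nGlossary:\n{glossary_lines}"}
    ]
    # Few-shot examples are replayed as earlier user/assistant turns.
    for german, english in examples:
        messages.append({"role": "user", "content": german})
        messages.append({"role": "assistant", "content": english})
    # The text to translate always comes last.
    messages.append({"role": "user", "content": german_text})
    return messages


messages = build_messages(
    "Die Übertragung ist ein wichtiges Phänomen.",  # illustrative input
    {"Verdrängung": "repression", "Übertragung": "transference"},
    [(
        "Die Verdrängung ist ein zentraler Begriff in der Psychoanalyse.",
        "Repression (Verdrängung) is a central concept in psychoanalysis.",
    )],
)
print([m["role"] for m in messages])
```

    The resulting list can then be passed as the `messages` argument of a chat completions request to your deployment. The network call is omitted here.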

    The following methods can improve term retention in parentheses:

    1. Use a structured system message to enforce terminology retention. For this case:

      System: You are a professional translator specializing in psychoanalysis. Your task is to translate German psychoanalytic texts into English while ensuring that technical terms remain in parentheses next to their translations.

      For example:
      German: Die Verdrängung ist ein zentraler Begriff in der Psychoanalyse.
      English: Repression (Verdrängung) is a central concept in psychoanalysis.

      Instructions:
      1. Translate the text while retaining the original German psychoanalytic terms in parentheses next to the translated term.
      2. Ensure proper sentence structure and linguistic accuracy.
      3. Do not alter the format of parentheses placement.

      Input: {German_Text}

      This encourages the model to apply the format consistently across translations.
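    2. Instead of relying on the model to always include terms in parentheses, post-process the translation by inserting the terms from an external glossary. This is an example in Python:
         import re

         # German source term -> expected English rendering
         glossary = {
             "Verdrängung": "repression",
             "Affektregungen": "affective reactions",
             "Übertragung": "transference"
         }

         def enforce_parentheses_translation(text, glossary):
             for term, translation in glossary.items():
                 # Match the English term only where "(term)" does not already follow it.
                 pattern = rf'\b{re.escape(translation)}\b(?!\s*\({re.escape(term)}\))'
                 # \g<0> re-inserts the matched text, so its original capitalization is kept.
                 text = re.sub(pattern, rf'\g<0> ({term})', text, flags=re.IGNORECASE)
             return text

         translated_text = "Repression is a core concept in psychoanalysis."
         final_output = enforce_parentheses_translation(translated_text, glossary)
         print(final_output)  # Repression (Verdrängung) is a core concept in psychoanalysis.

      This makes the glossary terms appear in parentheses even when the model omits them, provided the model used the English rendering listed in the glossary. Note that it annotates every occurrence of the English term, whether or not the German source actually used the glossary term in that place.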
    3. Azure OpenAI supports function calling. You can supply a structured glossary through an external function. For example, a function definition whose parameters field is a JSON Schema object:
         {
             "name": "translate_psychoanalysis",
             "description": "Translates German psychoanalytic texts into English while preserving technical terms in parentheses.",
             "parameters": {
                 "type": "object",
                 "properties": {
                     "text": {"type": "string"},
                     "glossary": {"type": "object", "additionalProperties": {"type": "string"}}
                 },
                 "required": ["text", "glossary"]
             }
         }

      This method helps translations adhere to predefined glossary terms instead of relying solely on the model.
    4. Now, if prompt engineering and programmatic term insertion do not yield sufficient accuracy, fine-tuning should be considered.
      1. Create a dataset with aligned source and target texts that demonstrate correct formatting.
      2. Train a fine-tuned model on Azure OpenAI with a specific focus on formatting and terminology preservation.
      3. Evaluate performance and adjust the glossary to reinforce consistency.

      Fine-tuning requires more resources but can provide higher consistency for niche use cases.
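
    Whichever of the methods above you use, the evaluation can be partly automated. The following is a rough sketch of a check that reports glossary terms present in the German source but missing as a parenthetical in the English output. The function name `find_missing_terms` is hypothetical, and matching by simple substring (so that inflected forms such as "Verdrängungen" also count) is an assumption that may need refining for your glossary.

```python
import re
from typing import Dict, List


def find_missing_terms(source: str, translation: str, glossary: Dict[str, str]) -> List[str]:
    """Return glossary terms found in the source that lack a '(term)' in the translation."""
    missing = []
    for term in glossary:
        # Substring match, so inflected forms such as plurals are also detected.
        in_source = re.search(re.escape(term), source, flags=re.IGNORECASE)
        # Accept "(term)" with an optional inflection suffix, e.g. "(Verdrängungen)".
        in_output = re.search(
            r"\(\s*" + re.escape(term) + r"\w*\s*\)", translation, flags=re.IGNORECASE
        )
        if in_source and not in_output:
            missing.append(term)
    return missing


glossary = {"Verdrängung": "repression", "Übertragung": "transference"}
source = "Die Verdrängung ist ein zentraler Begriff in der Psychoanalyse."
good = "Repression (Verdrängung) is a central concept in psychoanalysis."
bad = "Repression is a central concept in psychoanalysis."

print(find_missing_terms(source, good, glossary))  # []
print(find_missing_terms(source, bad, glossary))   # ['Verdrängung']
```

    Running this over a batch of translations gives a quick count of how often terms are dropped, which can be compared before and after a prompt change or a fine-tuning run.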

    I hope this is helpful! Do not hesitate to let me know if you have any other questions.


    If this is helpful, please close the thread by upvoting this answer and accepting it.

