Including conversation-specific details in the fine-tuning data by varying the system prompts could indeed lead to inconsistencies in model behavior. OpenAI's guidance notes that changing the system message can produce different results; in your use case, where the system prompt varies between training examples, this can introduce unintended biases or unpredictable responses.
In particular, the model learns patterns from the training data. If the system prompt varies significantly across examples, the model may become overly sensitive to minor prompt changes, leading to unexpected shifts in output quality. A fixed, well-crafted system prompt promotes generalization across different conversations. If the model sees many different system prompts during training, it may never develop a strong anchoring effect, making the system prompt a less reliable steering mechanism at inference time.
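To make this concrete, here is a minimal sketch of what a training set with a *fixed* system prompt looks like in the chat fine-tuning JSONL format. The prompt text and the conversations are hypothetical placeholders; the point is that every example shares the same system message while the conversation-specific content lives in the user and assistant turns.

```python
import json

# Hypothetical fixed system prompt, identical across ALL training examples.
SYSTEM_PROMPT = "You are a helpful assistant for Contoso customer support."

# Conversation-specific details go into the user/assistant turns instead.
conversations = [
    ("My order #1234 hasn't arrived.",
     "I'm sorry to hear that. Let me look into order #1234 for you."),
    ("How do I reset my password?",
     "You can reset it from the account settings page."),
]

def build_examples(pairs):
    """Build fine-tuning examples that all share the same system prompt."""
    examples = []
    for user_msg, assistant_msg in pairs:
        examples.append({
            "messages": [
                {"role": "system", "content": SYSTEM_PROMPT},
                {"role": "user", "content": user_msg},
                {"role": "assistant", "content": assistant_msg},
            ]
        })
    return examples

def to_jsonl(examples):
    """Serialize examples as JSONL, one training example per line."""
    return "\n".join(json.dumps(e) for e in examples)

if __name__ == "__main__":
    print(to_jsonl(build_examples(conversations)))
```

Because the system prompt never changes across examples, the model can anchor on it, and all the variation it learns comes from the conversation turns.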
If certain patterns in system prompts appear frequently in training, the model might bias responses toward those prompts, limiting flexibility.
Instead of embedding details in the fine-tuning dataset, consider:

- Structuring your API requests with a few-shot approach, where recent relevant details are part of the input messages.
- If the system prompt is getting too long, moving some conversation-specific details into an initial user or assistant message. This keeps the system prompt concise while ensuring key details remain in focus.
- During fine-tuning, training the model to repeat or summarize the important details when responding, which reinforces memory without overloading the system prompt.
- If some details are structured data (e.g., names, phone numbers), using a retrieval approach where the model calls a function to fetch relevant context dynamically rather than embedding everything in the prompt.
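A rough sketch of how those pieces fit together at request time: the system prompt stays fixed, conversation-specific details are injected as an early user message, and structured lookups are exposed as a tool the model can call. The prompt text, model name, and `lookup_customer` function are hypothetical; this only builds the request payload, it does not call the API.

```python
import json

# Hypothetical fixed system prompt, stable across every request.
SYSTEM_PROMPT = "You are a helpful assistant for Contoso customer support."

def build_request(conversation_details: dict, history: list, user_turn: str) -> dict:
    """Assemble a chat request: stable system prompt, details in an early user turn."""
    # Conversation-specific context goes into a user message, not the system prompt.
    context_turn = {
        "role": "user",
        "content": "Context for this conversation:\n" + json.dumps(conversation_details),
    }
    # A tool the model can call to fetch structured data on demand,
    # instead of packing everything into the prompt up front.
    lookup_tool = {
        "type": "function",
        "function": {
            "name": "lookup_customer",  # hypothetical function name
            "description": "Fetch structured customer details by ID.",
            "parameters": {
                "type": "object",
                "properties": {"customer_id": {"type": "string"}},
                "required": ["customer_id"],
            },
        },
    }
    messages = [{"role": "system", "content": SYSTEM_PROMPT}, context_turn]
    messages += history
    messages.append({"role": "user", "content": user_turn})
    return {"model": "gpt-4o-mini", "messages": messages, "tools": [lookup_tool]}

if __name__ == "__main__":
    req = build_request(
        {"customer_name": "Alice", "plan": "premium"},
        history=[],
        user_turn="When does my plan renew?",
    )
    print(json.dumps(req, indent=2))
```

The key property is that only the context turn and the history vary between conversations; the system prompt the fine-tuned model was anchored on never changes.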
From what I've seen in practical tests, varying system prompts during fine-tuning generally introduces inconsistencies. Models tend to perform better when the system prompt remains stable and additional details are provided dynamically within the conversation turns.
If the above response helps answer your question, remember to "Accept Answer" so that others in the community facing similar issues can easily find the solution. Your contribution is highly appreciated.
hth
Marcin