Azure OpenAI - Unable to Answer from Retrieved Documents
Hi
I have developed an integrated pipeline leveraging Azure AI Search and Azure OpenAI. Utilizing the REST API, I have been initiating requests to Azure OpenAI, specifying the Index within the data sources section. Upon submitting a straightforward query, Azure OpenAI responded with:
"The requested information is not found in the retrieved data. Please try another query or topic."
I began troubleshooting the query by using the dedicated REST API endpoint of Azure AI Search to cross-verify the retriever. Using a hybrid search approach and including both the query and vector in the request body, I successfully retrieved the exact documents the query targeted.
So, the retriever is successfully providing the context/documents to Azure OpenAI, but it fails to extract the required information from them. How can I resolve the issue where the necessary information is present in the document and Azure AI Search is successfully returning those documents, yet Azure OpenAI is unable to find the answer?
Azure AI Search
Azure OpenAI Service
-
Charlie Wei • 3,335 Reputation points
2024-03-18T12:51:36.3866667+00:00 @Hammad Hassan, could you please provide the REST API endpoint and body for both a successful and a failed request?
-
Hammad Hassan • 40 Reputation points
2024-03-18T17:06:30.18+00:00 Sure.
Azure OpenAI endpoint
{ "messages": [ { "role": "system", "content": "You are an answer generation assistant. Use the given context to answer the question. ...... Context: {context} Question: {question} Answer (formatted in markdown):" }, { "role": "user", "content": "MY_QUERY" } ], "stream": false, "temperature": 0.0, "top_p": 1.0, "max_tokens": 1000, "dataSources": [ { "type": "AzureCognitiveSearch", "parameters": { "endpoint": "https://MY_AI_SEARCH.search.windows.net", "key": "123", "indexName": "test-123", "topNDocuments": 2, "embeddingDeploymentName": "embedding-ada-v2", "queryType": "vectorSimpleHybrid", "fieldsMapping": { "titleField": "source", "filepathField": "source_id" }, "roleInformation": "You are an answer generation assistant. Use the given context to answer the question. ...... Context: {context} Question: {question} Answer (formatted in markdown):", "outputFieldMappings": [ { "sourceFieldName": "/document/source_id", "targetFieldName": "source_id" }, { "sourceFieldName": "/document/source", "targetFieldName": "source" } ] } } ] }
After that, to debug, I first generated the embedding of the query and then sent the query together with its embedding to Azure AI Search to check whether the document is being retrieved.
EMBEDDING
https://MY_OPEN_AI_STUDIO.openai.azure.com/openai/deployments/embedding-ada-v2/embeddings?api-version=2023-12-01-preview
{ "input": "MY_QUERY" }
Azure AI Search
http://MY_AI_SEARCH.search.windows.net/indexes/test-123/docs/search?api-version=2023-11-01
{ "count": true, "search": "MY_QUERY", "select": "content, source", "top": 2, "vectorQueries": [ { "vector": [ .., .. ], "k": 7, "fields": "vector", "kind": "vector", "exhaustive": false } ] }
This last endpoint returns two documents, and the first one is precisely the document from which I made the query. So, Azure AI Search is retrieving the documents, but somehow Azure OpenAI is unable to derive answers from those documents.
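The two debugging calls described above can be sketched in Python as follows. This is a minimal sketch, not the poster's actual script: the helper names, the MY_* placeholder values, and the use of the `api-key` header for both services are assumptions, and the body fields simply mirror the ones posted in this comment. The functions only build the requests; sending them is left to `urllib.request.urlopen`.

```python
import json
import urllib.request


def build_embedding_request(resource, deployment, api_key, query,
                            api_version="2023-12-01-preview"):
    """Build the POST request that turns the query text into an embedding."""
    url = (f"https://{resource}.openai.azure.com/openai/deployments/"
           f"{deployment}/embeddings?api-version={api_version}")
    body = {"input": query}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "api-key": api_key},
        method="POST",
    )


def build_hybrid_search_body(query, vector, top=2, k=7):
    """Hybrid search body: keyword text plus one vector query, as posted above."""
    return {
        "count": True,
        "search": query,
        "select": "content, source",
        "top": top,
        "vectorQueries": [
            {
                "vector": vector,
                "k": k,
                "fields": "vector",  # name of the vector field in the index
                "kind": "vector",
                "exhaustive": False,
            }
        ],
    }


def build_search_request(service, index, api_key, body,
                         api_version="2023-11-01"):
    """Build the POST request for the Azure AI Search docs/search endpoint."""
    url = (f"https://{service}.search.windows.net/indexes/{index}"
           f"/docs/search?api-version={api_version}")
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "api-key": api_key},
        method="POST",
    )
```

To run it end to end, send the first request with `urllib.request.urlopen`, read the embedding from the JSON response, and pass it as `vector` to `build_hybrid_search_body`.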
-
Hammad Hassan • 40 Reputation points
2024-03-18T17:30:24.6266667+00:00 Sure.
Azure Open AI
{ "messages": [ { "role": "system", "content": "You are an answer generation assistant. Use the given context to answer the question. ...... Context: {context} Question: {question} Answer (formatted in markdown):" }, { "role": "user", "content": "MY_QUERY" } ], "stream": false, "temperature": 0.0, "top_p": 1.0, "max_tokens": 1000, "dataSources": [ { "type": "AzureCognitiveSearch", "parameters": { "endpoint": "https://MY_AI_SEARCH.search.windows.net", "key": "123", "indexName": "test-123", "topNDocuments": 2, "embeddingDeploymentName": "embedding-ada-v2", "queryType": "vectorSimpleHybrid", "fieldsMapping": { "titleField": "source", "filepathField": "source_id" }, "roleInformation": "You are an answer generation assistant. Use the given context to answer the question. ...... Context: {context} Question: {question} Answer (formatted in markdown):", "outputFieldMappings": [ { "sourceFieldName": "/document/source_id", "targetFieldName": "source_id" }, { "sourceFieldName": "/document/source", "targetFieldName": "source" } ] } } ] }
This endpoint returns that the requested information is not found. After that, I started debugging and used the following two endpoints to confirm the retrieved documents.
{ "input": "MY_QUERY" }
Azure AI Search
http://MY_AI_SEARCH.search.windows.net/indexes/test-123/docs/search?api-version=2023-11-01
{ "count": true, "search": "MY_QUERY", "select": "content, source", "top": 2, "vectorQueries": [ { "vector": [ .., .. ], "k": 7, "fields": "vector", "kind": "vector", "exhaustive": false } ] }
This last endpoint returns two documents, and the first one is precisely the document from which I made the query. So, Azure AI Search is retrieving the documents, but somehow Azure OpenAI is unable to derive answers from those documents.
-
Hammad Hassan • 40 Reputation points
2024-03-25T11:56:03.21+00:00 Is there any update on the above query?
-
Hammad Hassan • 40 Reputation points
2024-03-25T11:57:36.31+00:00 @VasaviLankipalle-MSFT Can you also look into this? Thanks
-
AshokPeddakotla-MSFT • 35,931 Reputation points
2024-04-04T11:55:28.2833333+00:00 @Hammad Hassan, greetings!
Did you try setting the inScope parameter to true? Try setting this parameter inside dataSources and see if that solves your issue.
This flag configures the chatbot's approach to handling queries unrelated to the data source or when search documents are insufficient for a complete answer. When this setting is disabled, the model supplements its responses with its own knowledge in addition to your documents. When this setting is enabled, the model attempts to only rely on your documents for responses. This is the inScope parameter in the API, and it is set to true by default.
Please refer to the documentation - Using your data (Preview) and Azure OpenAI Service REST API reference - for more details.
I hope this solves your issue.
Do let me know if you have any further queries.
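A minimal sketch of setting the flag explicitly, assuming the request body shape posted earlier in this thread (a dataSources list whose entries carry a parameters object). The helper name `with_in_scope` is hypothetical, and the sketch only edits the body; it does not send the request.

```python
import copy


def with_in_scope(request_body, in_scope):
    """Return a copy of the request body with inScope set on every data source.

    Assumes the body layout posted in this thread:
    {"messages": [...], "dataSources": [{"type": ..., "parameters": {...}}]}
    """
    body = copy.deepcopy(request_body)  # leave the caller's dict untouched
    for source in body.get("dataSources", []):
        source.setdefault("parameters", {})["inScope"] = in_scope
    return body
```

Sending the same question once with `in_scope=True` and once with `in_scope=False` is one way to see whether the refusal comes from this restriction or from somewhere else in the pipeline.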
-
Hammad Hassan • 40 Reputation points
2024-04-05T06:27:08.41+00:00 Hi
Thank you for your reply. Please note that my query directly relates to the data source, and the documents I cross-checked using the Azure AI search endpoint are entirely sufficient to answer the query. I prefer not to have the LLM use its own knowledge base for answering this query. Therefore, the default value of inScope seems most appropriate for my use case.
My main question concerns how Azure OpenAI retrieves documents: Does its retrieval system operate identically to Azure AI search when processing the same query with the same configuration?
-
Kai Meng • 0 Reputation points
2025-02-18T11:40:05.9566667+00:00 I'm running into the same problem as Hammad. I've done a lot of testing, and for me the problem occurs with one question in particular: one specific phrasing of the question makes the API unable to return an answer.
My specific example: the question "Can I add additional info to my tracks?" consistently gets no answer even though the documents I provided have that exact wording, while the question "How can I add additional info to my tracks?" consistently gives me the correct answer, including the phrase "...you can add additional info to ..."
I also set up the semantic configuration incorrectly during one test and received an error for the second question, but for the first question it gave the response that the "information is not in the retrieved documents". This suggests to me that it's not even reaching the step of querying the index for some questions. Is there any way to debug or get some more info as to why the API returns an "information is not in the retrieved documents" answer? Is there a threshold to some score before it decides a question is out of scope?
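One way to get more information is to look at the retrieval context that comes back alongside the answer. The sketch below is an assumption-heavy illustration, not a documented recipe: it assumes the chat completions response either carries citations and intent directly under the message's context object, or carries a tool-role message whose content is a JSON string with citations and intent keys. The exact shape depends on the API version, so check the key names against a real response. The function name is hypothetical.

```python
import json


def extract_retrieval_context(response):
    """Collect the search intent and citations from a response dict.

    Assumed shapes (verify against your own API version):
      A) choices[i].message.context has "citations" / "intent" keys
      B) choices[i].message.context.messages (or choices[i].messages)
         contains a role "tool" message whose content is a JSON string
         with "citations" / "intent" keys
    """
    result = {"intent": None, "citations": []}

    def absorb(payload):
        # Merge one candidate payload into the result, ignoring other shapes.
        if not isinstance(payload, dict):
            return
        if result["intent"] is None and payload.get("intent") is not None:
            result["intent"] = payload["intent"]
        result["citations"].extend(payload.get("citations") or [])

    for choice in response.get("choices", []):
        message = choice.get("message") or {}
        context = message.get("context") or {}
        absorb(context)  # shape A
        tool_messages = (context.get("messages") or []) + (choice.get("messages") or [])
        for msg in tool_messages:  # shape B
            if msg.get("role") != "tool":
                continue
            try:
                absorb(json.loads(msg.get("content") or "{}"))
            except json.JSONDecodeError:
                continue
    return result
```

If the citations list is empty for the failing phrasing, that would suggest the problem is on the retrieval side (or that the search was never run); if the expected document is in the citations and the answer is still a refusal, the generation step is the more likely place to look.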