Copilot indexes some PDFs from SharePoint poorly

JJSC 20 Reputation points
2025-02-17T13:01:37.2033333+00:00

Hello everyone, I hope this finds you well.

I have been wondering about the seemingly random indexing failures with certain PDFs when SharePoint documents are used as a knowledge source.

Some PDFs are read perfectly and their information is delivered in answers to chat questions, whether the knowledge source points to the containing folder or directly to the document; it makes no difference. Other documents, however, are simply never recognised by the agent. These documents can sit at the same folder level, have the same permissions, and look similar to the eye in terms of images, tables, and data structure. Their contents obviously differ, but the bot simply doesn't deliver responses about them.

I read about a possible maximum of 4 URLs per generative answer on a topic node in classic mode (which is my case), but testing the agent with 4 URLs doesn't show any improvement.

SharePoint
A group of Microsoft Products and technologies used for sharing and managing content, knowledge, and applications.
Microsoft Copilot
Microsoft terminology for a universal copilot interface.

Accepted answer
  1. Yanli Jiang - MSFT 29,286 Reputation points Microsoft Vendor
    2025-02-19T06:41:50.63+00:00

    Hi @JJSC ,

    Good day.

    As a SharePoint engineer, I'm not an expert in Microsoft Copilot. From a SharePoint perspective, let’s do some troubleshooting:

    1. File Size and Content: If the PDFs are large or contain complex formatting (like tables or images), this could hinder the agent's ability to index them properly. It's recommended to keep files to a maximum of 36,000 characters (approximately 15-20 pages) to improve indexing reliability.
    2. Permissions: Even if the documents appear to have the same permissions, ensure that the user accessing the documents has the necessary read permissions on all relevant sites and files. Lack of permissions can lead to no results being returned.
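
To compare the PDFs that work with the ones that don't against the character limit mentioned in item 1, a rough check along these lines may help. This is only a sketch: the 36,000 figure is taken from the answer above and is not verified here, `check_text_length` and `extract_pdf_text` are hypothetical helper names, and the extraction step assumes the third-party `pypdf` package is installed.

```python
import math

# Limit quoted in the answer above; treated here as an assumption, not a verified value.
MAX_CHARS = 36_000


def check_text_length(text: str, max_chars: int = MAX_CHARS) -> dict:
    """Report whether extracted text fits the limit and how many parts a split would need."""
    n = len(text)
    return {
        "chars": n,
        "within_limit": n <= max_chars,
        # At least one part, even for an empty document.
        "parts_needed": max(1, math.ceil(n / max_chars)),
    }


def extract_pdf_text(path: str) -> str:
    """Extract text from a PDF using the third-party pypdf package (assumed installed)."""
    from pypdf import PdfReader  # imported lazily so the length check works without pypdf

    reader = PdfReader(path)
    # extract_text() can return an empty string for pages with no text layer.
    return "\n".join(page.extract_text() or "" for page in reader.pages)
```

Running `check_text_length(extract_pdf_text("some-file.pdf"))` on a local copy of each document (the file name is a placeholder) gives a character count that can be compared between the PDFs the agent answers from and the ones it ignores.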

    If you continue to face issues, consider reviewing the configuration settings or consulting the documentation for further troubleshooting steps.

    I hope this information helps.

    Kindly consider upvoting the comment if the information provided is helpful. This can assist other community members in resolving similar issues.


0 additional answers
