Search Results and Duplicate Detection in SharePoint Online
Assume a document library contains three different folders:
Folder A
Folder B
Folder C
When searching for the term “Folder”, the expected results would be
Folder A, Folder B, Folder C
But the actual results contain only Folder A. This means that even though Folder B and Folder C are indexed, they are not included in the search results.
Assessment
To find the root cause:
First, check whether all of these items are indexed as expected. Perform a search against the full name or full URL of Folder A, Folder B and Folder C. If each of the three items appears in its search results, that shows Folder A, Folder B and Folder C are already indexed as expected.
To examine the indexed details more closely, contact Microsoft Customer Support.
Use the Search Query Tool to see exactly which results are returned. The tool makes it easy to enable and disable query-time settings, which helps to identify the root cause.
Get the Search Query Tool from https://sp2013searchtool.codeplex.com/releases/view/119335
Run the query with the Trim Duplicates option enabled and check the search results. Only Folder A will appear.
Then run the same query with the Trim Duplicates option disabled.
Folder A, Folder B and Folder C will all appear in the search results.
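The same comparison can be sketched against the Search REST endpoint, which accepts a `trimduplicates` query parameter. The code below is an illustrative sketch, not the Search Query Tool's own implementation: it builds the two request URLs and compares two result lists that are assumed to have been fetched already. The helper names and the contoso URL are hypothetical.

```python
from urllib.parse import quote


def search_url(site_url: str, query_text: str, trim_duplicates: bool) -> str:
    """Build a Search REST URL with duplicate trimming switched on or off."""
    # Assumption: single quotes inside the quoted querytext value are doubled.
    text = quote(query_text.replace("'", "''"), safe="")
    flag = "true" if trim_duplicates else "false"
    return (
        f"{site_url.rstrip('/')}/_api/search/query"
        f"?querytext='{text}'&trimduplicates={flag}"
    )


def trimmed_items(with_trim: list[str], without_trim: list[str]) -> list[str]:
    """Return results that show up only when duplicate trimming is disabled."""
    kept = set(with_trim)
    return [item for item in without_trim if item not in kept]


# Hypothetical tenant URL; the result lists mirror the scenario described above.
print(search_url("https://contoso.sharepoint.com", "Folder", True))
print(search_url("https://contoso.sharepoint.com", "Folder", False))
print(trimmed_items(["Folder A"], ["Folder A", "Folder B", "Folder C"]))
```

Any item returned by `trimmed_items` was indexed but removed at query time by duplicate trimming.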
Conclusion
Even though the URLs and folder names differ, SharePoint considers these three entries to be duplicates, because the document signatures calculated for the items fall into the same range.
Workaround
As a workaround, the expected search results can be achieved while keeping Trim Duplicates enabled by making the configuration changes below:
1) Go to the Search Results Web Part or Content Search Web Part
2) Select Edit Web Part
3) Select Change Query Builder --> Refiners
4) Under “Group Results”, go to “Group By”, select “Show all properties” and then select DocumentSignature
5) Select a number for “Show these results” (choose the highest value, which is 10)
6) After making these changes, run the same query for “Folder”. All the expected results (Folder A, Folder B and Folder C) will now appear.
The steps above act as a workaround for the scenario discussed here.