Analyze data in a review set in eDiscovery
When the number of collected documents is large, reviewing all of them can be difficult. eDiscovery provides tools that analyze the documents to reduce the volume you need to review without any loss of information, and to help you organize the documents in a coherent manner.
Run analytics for a review set
To analyze data in a review set, complete the following steps:
- Go to the Microsoft Purview portal and sign in using the credentials for a user account assigned eDiscovery permissions.
- Select the eDiscovery solution card and then select Cases in the left nav.
- Select a case.
- (Optional) Select Case settings.
- On the Case settings page, select Search & analytics.
- On the Search & analytics page, configure analytics settings for your case.
- After selecting the applicable search and analytics options, select Save. Close the Case settings page.
- Select the Review sets tab.
- Select the review set that you want to analyze and select Open review set.
- Select Analytics > Run document & email analytics.
- A notification dialog is displayed about the time it takes to complete the analytics process. Select Yes to continue with running the analytics process or select Cancel.
- Select OK when notified that the analysis process is created.
You can check the progress of the analysis process by selecting Process manager in the review set. The process type is Run analytics.
After the analysis process is completed, you can view the analytics report, run queries within your review set on outputs of the analysis, and see related documents of a given document.
Using the For Review (preview) filter query
After running analytics for the review set, you can use an automatically generated filter query (called For Review) that filters your review set to exclude immaterial, duplicate, or noninclusive items. This leaves you with only the items that are representative, unique, and inclusive in the review set.
To apply the For Review - Unique items only (preview) filter query to a review set, select the Saved filter queries dropdown list, and then select [AutoGen] For Review - Unique items only (preview). Minimize the query details to view the results.
Here's the syntax for the For Review - Unique items only (preview) filter query:
(((FileClass="Email") AND (IsInclusive="True") AND (MarkAsRepresentative="Unique")) OR ((FileClass="Attachment") AND (MarkAsRepresentative="Unique")) OR ((FileClass="Document") AND (MarkAsRepresentative="Unique")) OR ((FileClass="Conversation") AND (MarkAsRepresentative="Unique")) OR ((FileClass="Attachment" OR FileClass="Conversation" OR FileClass="Document" OR FileClass="Email") AND NOT (Exists:MarkAsRepresentative)))
The following list describes the result of the filter query in terms of what content is displayed after you apply it to the review set.
- Email: Displays items that are marked as both inclusive and representative. An email has IsInclusive set to True when it contains all the unique content from a thread, including all previous replies. This ensures that only the most comprehensive email in a thread is reviewed, so you get the full context of the conversation without reviewing each individual reply. An email has MarkAsRepresentative set to Unique when it's the single representative copy among duplicates.
- Attachments: Filters out duplicate attachments in the review set. Only attachments that are unique in the review set are displayed.
- Documents: Filters out duplicate documents. Only documents that are unique in the review set are displayed.
- Teams conversations: Filters out duplicate Teams (and Viva Engage) conversations. Only conversations that are unique in the review set are displayed.
- Other: Any emails, attachments, documents, and conversations in the review set where the MarkAsRepresentative field doesn't contain a value. The empty value means that analytics wasn't able to generate a value, so you should review and assess these items.
For more information about inclusive types and document uniqueness, see Email threading in eDiscovery.
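To make the logic of the filter query easier to follow, here is a minimal Python sketch that applies the same conditions to an in-memory item. It is an illustration only, not how eDiscovery evaluates the query: the dictionary representation and the `is_for_review` function are hypothetical, and only the field names and values (FileClass, IsInclusive, MarkAsRepresentative) come from the query itself.

```python
# Hypothetical in-memory model of a review set item: a dict keyed by the
# field names that appear in the For Review filter query.
FILE_CLASSES = {"Attachment", "Conversation", "Document", "Email"}


def is_for_review(item: dict) -> bool:
    """Return True if the item would pass the For Review conditions."""
    file_class = item.get("FileClass")
    representative = item.get("MarkAsRepresentative")  # None or "" models a missing value

    if file_class not in FILE_CLASSES:
        return False

    # Last clause of the query: NOT (Exists:MarkAsRepresentative).
    # Items without a value are kept so that a reviewer can assess them.
    if not representative:
        return True

    if representative != "Unique":
        return False

    # Emails must also be inclusive; other file classes only need to be unique.
    if file_class == "Email":
        return item.get("IsInclusive") == "True"
    return True


sample_items = [
    {"FileClass": "Email", "IsInclusive": "True", "MarkAsRepresentative": "Unique"},
    {"FileClass": "Email", "IsInclusive": "False", "MarkAsRepresentative": "Unique"},
    {"FileClass": "Document"},
]
kept = [item for item in sample_items if is_for_review(item)]
```

In this sample, the inclusive unique email and the document without a MarkAsRepresentative value are kept, and the noninclusive email is filtered out.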
Using the Potentially Immaterial Items filter query
The Potentially Immaterial Items query filters out specific types of files that could be immaterial in a review set. It targets two main categories:
- Compressed files: Filters out files with extensions commonly associated with compressed or archived formats, such as .zip.
- Small image files: Filters out image files that are smaller than 3,072 bytes (3 KB). These small image files are typically thumbnails, icons, or low-resolution images.
To apply the Potentially Immaterial Items filter query to a review set, select the Saved filter queries dropdown list, and then select [AutoGen] Potentially Immaterial Items. Minimize the query details to view the results.
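The two categories can be sketched as a simple predicate. This is an assumption-laden illustration, not the actual query definition: the function name and both extension sets are hypothetical (the text above names only .zip and doesn't list image extensions), while the 3,072-byte threshold comes from the description.

```python
import os

# Assumption: the source names only .zip as an example of a compressed format.
COMPRESSED_EXTENSIONS = {".zip"}
# Assumption: a few common image extensions chosen for illustration.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".gif", ".bmp"}
# Threshold from the description: images smaller than 3,072 bytes (3 KB).
SMALL_IMAGE_LIMIT_BYTES = 3072


def is_potentially_immaterial(file_name: str, size_bytes: int) -> bool:
    """Return True for compressed files and for images under the size limit."""
    extension = os.path.splitext(file_name)[1].lower()
    if extension in COMPRESSED_EXTENSIONS:
        return True
    return extension in IMAGE_EXTENSIONS and size_bytes < SMALL_IMAGE_LIMIT_BYTES
```

For example, under these assumptions `is_potentially_immaterial("logo.png", 1500)` is true, while a 3,072-byte image is not flagged because the comparison is strictly "smaller than".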
Analytics report
To view the analytics report for a review set, complete the following steps:
- Go to the Microsoft Purview portal and sign in using the credentials for a user account assigned eDiscovery permissions.
- Select the eDiscovery solution card and then select Cases in the left nav.
- Select a case, then select the Review sets tab.
- Select a review set, and then select Open review set.
- Select Analytics > Show reports.
The Analytics report has seven components from the analysis. Hover over the results to view counts for each component:
- Target population: The number of email messages, attachments, and loose documents found in the review set.
- Documents (excluding attachments): The number of loose documents that are pivots, unique near duplicates of a pivot, or an exact duplicate of another document.
- Emails: The number of email messages that are marked as inclusive.
- Attachments: The number of email attachments that are unique or duplicates of another email attachment in the review set.
- Number documents by file type: The number of files, identified by file extension.
- Documents by source: A summary of content by its original data source.
- Documents aggregated by process: A summary of content by review set processes.
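As an illustration of what the by-file-type component summarizes, the following sketch counts files by extension. It assumes a hypothetical list of file names (for example, taken from an exported load file); the function name and the sample data are not part of eDiscovery.

```python
import os
from collections import Counter


def count_by_file_type(file_names: list[str]) -> Counter:
    """Count files by lowercase extension; files without one are grouped under '(none)'."""
    counts = Counter()
    for name in file_names:
        extension = os.path.splitext(name)[1].lower() or "(none)"
        counts[extension] += 1
    return counts


# Hypothetical sample data for illustration only.
sample_names = ["contract.docx", "budget.xlsx", "notes.DOCX", "thread.msg", "README"]
file_type_counts = count_by_file_type(sample_names)
```

With this sample, the two Word documents are counted together under `.docx` because extensions are compared case-insensitively.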