Unable to create custom trainable classifier "Failed due to data collecting error"

Josh Wilson 0 Reputation points
2024-11-19T22:05:22.69+00:00

After creating a custom trainable classifier in Microsoft Purview, I get the following error: "Failed due to data collecting error". I provided the positive and negative samples (different folders within the same SharePoint site). Can someone please explain what I should do to resolve this issue? There is not much documentation on this error as it relates to the creation of trainable classifiers. Also, I am unable to query certain SharePoint sites within Purview, even though the sites in question are searchable within the SharePoint site settings. Please advise.

Microsoft Purview
A Microsoft data governance service that helps manage and govern on-premises, multicloud, and software-as-a-service data. Previously known as Azure Purview.

1 answer

  1. Chandra Boorla 3,460 Reputation points Microsoft Vendor
    2024-11-20T07:50:35.8866667+00:00

    Hi @Josh Wilson

    Greetings & Welcome to Microsoft Q&A forum! Thanks for posting your query!

    The error "Failed due to data collecting error" when creating a Custom Trainable Classifier in Microsoft Purview could be caused by several factors. Here are some troubleshooting steps that might help you in resolving the issue:

    Check Seed Data:

    • Ensure that the positive and negative seed content items are correctly placed in separate folders and that each folder contains only the respective seed content. The folders should be dedicated to holding only the seed data.

    Indexing Time:

    • If you create a new SharePoint site and folder for your seed data, allow at least an hour for that location to be indexed before creating the trainable classifier that will use that seed data.

    Permissions Check:

    • Ensure that the account you are using to create the classifier has the necessary permissions to access the SharePoint sites and folders where your positive and negative samples are stored. This includes having read access to the content within these folders.

    Data Source Configuration:

    • Verify that the SharePoint site URLs provided are correct and accessible. Sometimes, a typo or incorrect URL can lead to data collection errors. Make sure that the SharePoint sites are properly indexed and that the content within them is searchable.

    Network and Connectivity:

    • Check for any network issues that might be affecting connectivity to the SharePoint sites. Ensure that there are no firewall or proxy settings blocking access.

    Retry the Process:

    • If the issue persists, recreate the classifier: re-collect the seed data and follow the creation steps again. If the failure was intermittent, a retry often resolves it.
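
Some of the checks above (separate seed folders, well-formed URLs, the one-hour indexing wait) can be done before submitting the classifier. The following is a minimal sketch only, not part of Purview or any Microsoft tooling: the function names, the `contoso` URLs, and the rule that one seed folder must not sit inside the other are illustrative assumptions, and the script inspects URL strings only, so it cannot confirm permissions or actual indexing status.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import unquote, urlparse


def normalize_folder(url):
    """Return (host, path), lower-cased and without a trailing slash."""
    parts = urlparse(url.strip())
    if parts.scheme != "https" or not parts.netloc:
        raise ValueError(f"not an absolute https URL: {url!r}")
    return parts.netloc.lower(), unquote(parts.path).rstrip("/").lower()


def check_seed_folders(positive_url, negative_url, created_at, now=None):
    """Return a list of problems found; an empty list means none were found."""
    now = now or datetime.now(timezone.utc)
    pos = normalize_folder(positive_url)
    neg = normalize_folder(negative_url)
    problems = []
    if pos == neg:
        problems.append("positive and negative samples share one folder")
    elif pos[0] == neg[0] and (
        pos[1].startswith(neg[1] + "/") or neg[1].startswith(pos[1] + "/")
    ):
        # Assumption: a nested folder means the outer one holds mixed content.
        problems.append("one seed folder is nested inside the other")
    if now - created_at < timedelta(hours=1):
        # The one-hour figure comes from the indexing guidance above.
        problems.append("location is under an hour old; indexing may be incomplete")
    return problems


if __name__ == "__main__":
    # Hypothetical site and folder names, for illustration only.
    base = "https://contoso.sharepoint.com/sites/Seed/Shared%20Documents"
    created = datetime.now(timezone.utc) - timedelta(minutes=20)
    for problem in check_seed_folders(base + "/Positive", base + "/Negative", created):
        print("WARNING:", problem)
```

An empty result means only that these string-level checks passed; access rights and search indexing still need to be verified in SharePoint itself.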

    For details, please refer to the documentation: How to create a trainable classifier

    For a similar issue, this thread may offer additional insights: https://learn.microsoft.com/en-us/answers/questions/2111187/unable-to-create-custom-trainable-classifier-due-t

    I hope this information helps. Please do let us know if you have any further queries.

    Thank you.

