(new) AML Continuous Monitoring | Mapping Ground Truth Data for Model Performance Monitoring in Azure Machine Learning

BOON HAWAREE 20 Reputation points
2025-01-07T20:45:32.03+00:00

This is a repost because my first post could not be viewed (it showed a 404 error when I clicked it).

Greetings,

I am reading the MLOps documentation and would like to continuously monitor an ML model deployed in Azure Machine Learning. The goal of monitoring performance metrics is to send a signal to Azure Event Grid and an Azure Function that triggers the training pipeline to retrain the model when performance drift is detected.

Here’s what I’ve accomplished so far:

  1. Completed data ingestion, ETL, model training, and registration using MLflow.
  2. Successfully deployed the model and tested the endpoint by executing the POST method via Postman.
  3. Enabled the data collector, and I can view the input and output production data in Blob Storage.
  4. Configured a training pipeline that can be triggered by an Azure Function executing the POST method.

My problem is the monitoring part and sending the model's performance drift signal to Azure Event Grid. After reading and following the steps in the documentation ( https://learn.microsoft.com/en-us/azure/machine-learning/how-to-monitor-model-performance?view=azureml-api-2&tabs=azure-cli ), I can see the correlationID in each row of the output, and I understand that I need to map it to ground truth data to calculate performance (if I understand correctly).

I’m struggling with the following:

  1. How do I retrieve the ground truth data and ensure it maps correctly to the corresponding correlationID in the model output?
  2. As this is a proof-of-concept (POC) project using a mock dataset, what are the best practices or recommendations for setting up the monitoring and retraining scenario effectively?

I’d greatly appreciate any advice, recommendations, or insights on how to address this issue.

Thank you for taking the time to read my question and for your help in advance!

Best regards,
Boon Hawaree

Azure Machine Learning
An Azure machine learning service for building and deploying models.
Azure Functions
An Azure service that provides an event-driven serverless compute platform.
Azure Event Grid
An Azure event routing service designed for high availability, consistent performance, and dynamic scale.

1 answer

  1. Vikram Singh 415 Reputation points Microsoft Employee
    2025-01-08T07:32:12.85+00:00

    Hello @BOON HAWAREE ,

    Welcome to the Microsoft Q&A forum, and thank you for posting your question here!

    To effectively monitor ML models in Azure Machine Learning and retrieve ground truth data, here are the best practices and steps you should follow:

    To retrieve ground truth data and ensure it maps correctly to the corresponding correlationID in the model output, follow these steps:

    1. Setting Up Ground Truth Data Collection
       • Define Ground Truth: Clearly define what constitutes the ground truth for your model.
       • Data Logging: Implement a logging mechanism to capture ground truth data along with the correlationID.
    2. Storing Ground Truth Data
       • Choose a Storage Solution: Store ground truth data in a structured format, such as:
         ○ Azure Blob Storage: For unstructured data.
         ○ Azure SQL Database: For structured data.
         ○ Azure Data Lake: For large-scale data storage.
       • Data Format: Ensure ground truth data includes the correlationID for easy mapping.
    3. Retrieving Ground Truth Data: Use Azure SDKs or REST APIs to retrieve ground truth data. For example, using Python with the Azure SDK to read data from Azure Blob Storage.
    4. Mapping Ground Truth to Model Output: Join the model output and ground truth data using the correlationID.
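
As an illustration of step 4, here is a minimal sketch of the join for a POC with mock data. It assumes the model outputs and the ground truth have already been loaded into lists of dictionaries (for example, from the files the data collector writes to Blob Storage). The column names `correlationid`, `prediction`, and `label` and the helper `join_on_correlation_id` are assumptions made for this example, not names confirmed by the thread or by the service.

```python
def join_on_correlation_id(outputs, ground_truth, key="correlationid"):
    """Pair each model output row with its ground truth row via a shared ID.

    Returns (joined_rows, unmatched_ids). Output rows without a matching
    ground truth row are reported rather than silently dropped.
    """
    # Index ground truth by correlation ID for constant-time lookups.
    truth_by_id = {row[key]: row for row in ground_truth}

    joined, unmatched = [], []
    for row in outputs:
        truth = truth_by_id.get(row[key])
        if truth is None:
            unmatched.append(row[key])
        else:
            # Merge the two rows; the shared key appears once.
            joined.append({**row, **truth})
    return joined, unmatched


# Mock data standing in for collected model outputs and delayed labels.
outputs = [
    {"correlationid": "a1", "prediction": 1},
    {"correlationid": "a2", "prediction": 0},
    {"correlationid": "a3", "prediction": 1},
    {"correlationid": "a4", "prediction": 1},
]
ground_truth = [
    {"correlationid": "a1", "label": 1},
    {"correlationid": "a2", "label": 1},
    {"correlationid": "a4", "label": 1},
    {"correlationid": "a9", "label": 0},  # no matching prediction
]

joined, unmatched = join_on_correlation_id(outputs, ground_truth)
```

Reporting the unmatched IDs is a deliberate choice here: ground truth often arrives later than predictions, so a growing unmatched list may simply mean labels are still pending rather than that the mapping is wrong.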

    Best Practices for Monitoring ML Models

    1. Data Collection and Logging
       • Enable Data Collection: Ensure data collection is enabled for both input and output data to track model performance over time.
       • Log Correlation IDs: Use unique identifiers (like correlationID) for each prediction to facilitate easy mapping between model outputs and ground truth data.
    2. Performance Metrics
       • Define Key Metrics: Identify key performance metrics relevant to your model.
       • Monitor Drift: Set up monitoring for data drift and prediction drift to identify when model performance degrades due to changes in input data distribution.
    3. Automated Monitoring
       • Use Azure Monitor: Leverage Azure Monitor to track performance metrics and set up alerts for when metrics fall below acceptable thresholds.
       • Integrate with Azure Event Grid: Configure Azure Event Grid to listen for events related to model performance, triggering automated workflows like retraining.
    4. Ground Truth Data Management
       • Collect Ground Truth Data: Ensure a reliable method for collecting ground truth data, storing it in a structured format linked to model outputs using correlation IDs.
       • Regular Updates: Regularly update ground truth data to reflect current information for accurate performance evaluation.
    5. Visualization and Reporting
       • Dashboards: Create dashboards using Azure Machine Learning Studio or Power BI to visualize model performance metrics over time.
       • Regular Reporting: Set up regular reporting mechanisms to review model performance with stakeholders.
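
To make the "metrics fall below acceptable thresholds" idea concrete, the sketch below computes accuracy over rows that already pair a prediction with its ground truth label, and builds an event payload only when accuracy drops below a threshold. Everything specific in it is an assumption for illustration: the column names, the 0.8 threshold, the event type `Model.PerformanceDegraded`, and the subject string are hypothetical, and the payload only loosely follows the general shape of an Event Grid event. Actually publishing the event (or calling the Azure Function) is left out.

```python
import json
import uuid
from datetime import datetime, timezone


def accuracy(rows, pred_col="prediction", label_col="label"):
    """Fraction of rows where the prediction equals the ground truth label."""
    if not rows:
        raise ValueError("no joined rows to evaluate")
    correct = sum(1 for row in rows if row[pred_col] == row[label_col])
    return correct / len(rows)


def build_retrain_event(rows, threshold=0.8):
    """Return an event dict if accuracy is below threshold, else None."""
    value = accuracy(rows)
    if value >= threshold:
        return None  # performance is acceptable; nothing to send
    return {
        "id": str(uuid.uuid4()),
        "eventType": "Model.PerformanceDegraded",  # hypothetical event type
        "subject": "models/mock-model",            # hypothetical subject
        "eventTime": datetime.now(timezone.utc).isoformat(),
        "dataVersion": "1.0",
        "data": {"metric": "accuracy", "value": value, "threshold": threshold},
    }


# Mock joined rows: predictions already matched to ground truth labels.
joined_rows = [
    {"correlationid": "a1", "prediction": 1, "label": 1},
    {"correlationid": "a2", "prediction": 0, "label": 1},
    {"correlationid": "a4", "prediction": 1, "label": 1},
]

event = build_retrain_event(joined_rows, threshold=0.8)
payload = json.dumps(event) if event is not None else None
```

For a POC, the serialized payload could be sent in the POST body to the Azure Function that already triggers the training pipeline; keeping the threshold as a parameter makes it easy to force a retrain during testing.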

    Please do not forget to "Accept the answer" and "up-vote" wherever the information provided helps you; this can be beneficial to other community members.

    1 person found this answer helpful.