AzureML batch data monitoring

Question

I'm working on a batch use case and attempting to set up model monitoring. I've had issues getting it set up and have a few questions.

The documentation describes the data collector as strictly for online endpoints. Is this true?
Is model monitoring implemented with SDK v2 or SDK v1?
Which version of the CLI should model monitoring schedules be created with? When trying to create a custom signal following the GitHub examples (azureml-examples), I get "type: custom not in set".
The documentation is lacking on batch monitoring. Are there any better references?
Out of interest, why does model monitoring train LightGBM models under the hood, and how can I disable this feature?

Answer

Hi Fergus Currie

Welcome to Azure AI Q and A. Thank you for posting your query here.

Yes, you are correct that the Collector library from azureml.ai.monitoring is only applicable to real-time (online) endpoints. However, data collection, data drift detection, and model monitoring are supported for batch endpoints too. For batch endpoints you only have to handle the data collection part yourself; the rest of the steps (using monitoring signals, using reference data, configuring windowing operations, etc.) are the same.

Reference

Reference on possibility for batch endpoints

Data collection (for both inputs and outputs) for a batch endpoint can be done through Data Factory (limited to V1 datasets only), Fabric, or the batch endpoint SDK itself.
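
As an illustration of handling collection yourself, here is a minimal sketch that writes the inputs and outputs of one batch run to a date-partitioned folder, with a timestamp column so the data can later be sliced into windows. The function name write_batch_records, the yyyy/MM/dd folder layout, and the "timestamp" column name are assumptions for illustration only, not a format required by Azure ML; adapt them to whatever your monitoring setup expects.

```python
import csv
from datetime import datetime, timezone
from pathlib import Path


def write_batch_records(root, rows, fieldnames, run_time=None):
    """Write scored rows to <root>/<yyyy>/<MM>/<dd>/batch_<HHMMSS>.csv.

    Hypothetical helper: the layout and column names are assumptions.
    Each row is a dict holding the model inputs and the prediction.
    """
    run_time = run_time or datetime.now(timezone.utc)
    folder = Path(root) / f"{run_time:%Y/%m/%d}"
    folder.mkdir(parents=True, exist_ok=True)
    path = folder / f"batch_{run_time:%H%M%S}.csv"

    with path.open("w", newline="") as fh:
        writer = csv.DictWriter(fh, fieldnames=["timestamp", *fieldnames])
        writer.writeheader()
        for row in rows:
            # Stamp every record with the run time of the batch job.
            writer.writerow({"timestamp": run_time.isoformat(), **row})
    return path
```

The resulting folder could then be uploaded to a datastore and registered as the production data asset for the monitor.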

Once data collection is taken care of:

You can check data drift, data quality, and feature drift by comparing the production dataset against a reference dataset. Please get familiar with the data windowing operations mentioned in this doc.
You can set up model monitoring through the SDK/CLI. (The UI is optional, though; a batch endpoint might not show up there as a deployment with collection enabled.)
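
To make "comparing the production dataset against a reference dataset" concrete, here is a minimal sketch of one commonly used drift metric, the population stability index (PSI), for a single numeric feature. This is not the Azure ML implementation; the function name, the equal-width binning on the reference range, and the eps floor for empty bins are all assumptions made for illustration.

```python
import math


def population_stability_index(reference, production, bins=10, eps=1e-6):
    """PSI between a reference window and a production window.

    Bins are equal-width over the reference range; production values
    outside that range are clamped into the first or last bin.
    """
    lo, hi = min(reference), max(reference)
    if hi == lo:
        raise ValueError("reference data has no spread; cannot bin it")
    width = (hi - lo) / bins

    def fractions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        # Floor at eps so empty bins do not cause log(0) or division by zero.
        return [max(c / len(values), eps) for c in counts]

    ref = fractions(reference)
    prod = fractions(production)
    return sum((p - r) * math.log(p / r) for r, p in zip(ref, prod))


if __name__ == "__main__":
    reference = [i / 100 for i in range(100)]          # training-time feature
    production = [0.5 + i / 200 for i in range(100)]   # shifted batch inputs
    print(population_stability_index(reference, production))
```

A PSI near 0 means the two windows look alike; values above roughly 0.2 are often treated as a rule-of-thumb sign of meaningful drift, though the right threshold depends on your data.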

Reference

Reference on Portal implementation of continuous monitoring

I also found documentation from Azure ML observability that covers both real-time and batch endpoints and gives step-by-step implementation details.

Azure ML observability

Regarding other queries

Documentation notes data collector as strictly for online endpoints, is this true? - The Collector from azureml.ai.monitoring supports only online endpoints. Reference
Is model monitoring implemented with SDK v2 or SDK v1? - SDK v2 (model monitoring was introduced in SDK v2).
Which version of CLI should model monitoring schedules be created with, in trying to create a custom signal following GitHub (azureml-examples) I get type: custom not in set - CLI 2.0 - Reference
The documentation is lacking on batch monitoring, are there any better references? - Azure ML observability (you will be able to map the concepts to the existing documentation, but I am not sure about the implementation; you would have to consult the GitHub team, as it is a 3-year-old doc).
Out of interest, why does model monitoring train LightGBM models under the hood, and how can I disable this feature? - This is part of the built-in monitoring signals. You can use the CLI/SDK to supply your own custom signal instead, but you have to register the custom signal as a component first.

You can provide feedback on the model monitoring page itself, and you can also post your idea on the feedback forum to request model monitoring documentation for batch endpoints.

If the pointers in this answer were helpful to you, please don't forget to upvote this answer.

Thank you.
