How to fix an endpoint that is perpetually stuck updating

CB 0 Reputation points
2024-11-22T17:10:11.66+00:00

I have a batch endpoint that I was adding a new deployment to, and the Provisioning state of the endpoint is stuck on updating.

This means that I cannot take any other action concerning this endpoint, be it another deployment or deleting the endpoint, as it clashes with the updating process it is stuck on.

There were 3 successful deployments prior to this broken state.

In the same workspace, another endpoint is also broken: the endpoint's Provisioning state is Succeeded, but its deployment's Provisioning state is stuck on Creating.

I'd like to end these operations so that I can update the endpoints and continue to use them or, in the worst case, delete them so there aren't broken endpoints hanging around.

Azure Machine Learning
An Azure machine learning service for building and deploying models.

3 answers

  1. Azar 26,180 Reputation points MVP
    2024-11-22T20:13:56.6166667+00:00

    Hi there CB,

    Thanks for using the Q&A platform.

    You can try a few steps. First, check for any active operations that might be causing the endpoint to be stuck. If nothing is actively running, you can attempt to cancel the stuck operation from the Azure CLI with the az ml endpoint operation cancel command, if your version of the ml extension provides it. Also verify that no resource quota limit is preventing the update.

    If the endpoint remains unresponsive, you can try deleting it with az ml batch-endpoint delete (the CLI v2 command for batch endpoints).
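
    To see which resource is actually stuck, the two provisioning states mentioned in the question can be read from the CLI. The following is a minimal sketch, assuming the CLI v2 ml extension is installed and that the show output exposes a provisioning_state field. The function names endpoint_state and deployment_state are made up for illustration.

```shell
# Sketch only: helper names are hypothetical, and the provisioning_state
# field name is an assumption about the CLI v2 "show" output.

endpoint_state() {
  # $1 = endpoint name, $2 = resource group, $3 = workspace name
  az ml batch-endpoint show \
    --name "$1" --resource-group "$2" --workspace-name "$3" \
    --query provisioning_state --output tsv
}

deployment_state() {
  # $1 = deployment name, $2 = endpoint name, $3 = resource group, $4 = workspace name
  az ml batch-deployment show \
    --name "$1" --endpoint-name "$2" --resource-group "$3" --workspace-name "$4" \
    --query provisioning_state --output tsv
}

# Example calls (replace the placeholders with your own values):
#   endpoint_state "<endpoint-name>" "<rg-name>" "<workspace-name>"
#   deployment_state "<deployment-name>" "<endpoint-name>" "<rg-name>" "<workspace-name>"
```

    Running both helpers shows whether the endpoint itself or one of its deployments is the resource left in a transitional state.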

    If this helps, kindly accept the response. Thanks very much.


  2. CB 0 Reputation points
    2025-02-10T20:31:24.6766667+00:00

    The issue was in the AzureML backend. After I had tried many front-end fixes, support liaised with the developers to manually force an update to the state of the endpoints.

    I didn't get emailed about these comments, so did not know they had been made.

  3. Manas Mohanty (Quadrant Resource LLC) 295 Reputation points Microsoft Vendor
    2025-02-11T08:37:55.3066667+00:00

    Hi Gary Cowan!

    We have noticed that you rated an answer as not helpful. We appreciate your feedback and are committed to improving your experience with the Q&A.

    Sorry for the inconvenience.

    Here again are the commands to delete or force-update a batch endpoint:

    # Delete the batch endpoint (--no-wait is a switch and takes no value)
    az ml batch-endpoint delete --name <endpoint-name> --resource-group <rg-name> --workspace-name <workspace-name> --no-wait

    # Force an update of the batch endpoint
    az ml batch-endpoint update --name <endpoint-name> --resource-group <rg-name> --workspace-name <workspace-name> --no-wait
    
    

    Please note, however, that endpoint provisioning can also fail if the compute does not have sufficient memory, if the scoring script has syntax errors, or if the environment is corrupt.

    Check the endpoint logs for the actual issue and act accordingly.
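
    Since other actions clash with an in-progress update (as described in the question), one option is to poll the endpoint until it leaves its transitional state and only then delete it. This is a rough sketch, assuming the show output exposes a provisioning_state field and treating Creating, Updating and Deleting as the transitional values. The names wait_until_settled, delete_when_settled, POLL_SECONDS and MAX_POLLS are illustrative, not Azure settings. If the state is stuck in the backend, as it was in this thread, the endpoint never settles and the sketch only reports a timeout.

```shell
# Sketch only: function and variable names are hypothetical, and the
# provisioning_state field name is an assumption about the CLI v2 output.
POLL_SECONDS="${POLL_SECONDS:-30}"   # pause between polls
MAX_POLLS="${MAX_POLLS:-20}"         # give up after this many polls

wait_until_settled() {
  # $1 = endpoint name, $2 = resource group, $3 = workspace name
  # Prints the final state and returns 0 once the endpoint has settled,
  # returns 1 on timeout and 2 if the show command itself fails.
  local state="" polls=0
  while [ "$polls" -lt "$MAX_POLLS" ]; do
    state="$(az ml batch-endpoint show --name "$1" --resource-group "$2" \
      --workspace-name "$3" --query provisioning_state --output tsv)" || return 2
    case "$state" in
      Creating|Updating|Deleting) sleep "$POLL_SECONDS" ;;
      *) echo "$state"; return 0 ;;
    esac
    polls=$((polls + 1))
  done
  echo "still $state after $MAX_POLLS polls" >&2
  return 1
}

delete_when_settled() {
  # Deletes the endpoint only after it has settled; --yes skips the prompt.
  if wait_until_settled "$1" "$2" "$3" >/dev/null; then
    az ml batch-endpoint delete --name "$1" --resource-group "$2" \
      --workspace-name "$3" --yes --no-wait
  else
    echo "Endpoint $1 did not settle; consider opening a support request." >&2
    return 1
  fi
}
```

    A timeout from wait_until_settled is a reasonable signal to stop retrying from the client side and raise the issue with support.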

    Reference

    Troubleshoot Batch endpoint

    If you find this answer useful, please upvote it.

    Thank you.
