The error message indicates that the compute cluster cannot pull the Docker image from the Azure Container Registry (ACR) because authentication is failing.
First, verify that the user-assigned managed identity (MSI) has the AcrPull role assigned to it on the ACR.
If you are not using a managed identity, check that the Admin user is enabled on the ACR.
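If you prefer to check both conditions from a script rather than the portal, here is a minimal sketch that wraps the Azure CLI. It assumes `az` is installed, on your PATH, and already logged in; `principal_id` and `acr_name` are placeholders for your identity's principal (object) ID and your registry name. It only looks for a role literally named AcrPull, so broader roles that also allow pulling are not detected.

```python
import json
import subprocess


def az_json(args):
    """Run an az CLI command and return its parsed JSON output.

    Assumes `az` is on PATH and the session is already logged in.
    """
    result = subprocess.run(
        ["az", *args, "--output", "json"],
        capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)


def has_acr_pull(assignments):
    """True if any role assignment in the list is named AcrPull."""
    return any(a.get("roleDefinitionName") == "AcrPull" for a in assignments)


def admin_user_enabled(registry):
    """True if the registry JSON reports the Admin user as enabled."""
    return bool(registry.get("adminUserEnabled"))


def check_acr_access(principal_id, acr_name):
    """Query Azure and report both checks for one identity and registry."""
    registry = az_json(["acr", "show", "--name", acr_name])
    assignments = az_json([
        "role", "assignment", "list",
        "--assignee", principal_id,
        "--scope", registry["id"],
        "--role", "AcrPull",
        "--include-inherited",  # also count assignments made at a parent scope
    ])
    return {
        "acr_pull": has_acr_pull(assignments),
        "admin_user": admin_user_enabled(registry),
    }
```

Calling `check_acr_access("<principal-id>", "<acr-name>")` returns a small dictionary with the two results, which shows which of the two paths (managed identity or Admin user) is actually available to the cluster.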
If the ACR Admin user password was changed recently, synchronize the workspace keys:
- Navigate to your Azure Machine Learning workspace in the Azure portal.
- Go to Settings > Keys.
- Click on Sync keys.
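The same key sync can also be triggered from the command line. The sketch below assumes the Azure CLI with the v2 `ml` extension is installed and logged in; the workspace and resource group names are placeholders you would replace with your own.

```python
import subprocess


def sync_keys_command(workspace, resource_group):
    """Build the az CLI invocation that re-syncs the workspace keys."""
    return [
        "az", "ml", "workspace", "sync-keys",
        "--name", workspace,
        "--resource-group", resource_group,
    ]


def sync_keys(workspace, resource_group):
    """Run the command; raises CalledProcessError if az exits non-zero."""
    subprocess.run(sync_keys_command(workspace, resource_group), check=True)
```

For example, `sync_keys("my-workspace", "my-resource-group")` (both names hypothetical) runs the sync for that workspace.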
Also verify that the VNet and custom DNS settings are correctly configured to allow the compute cluster to access the ACR.
The compute cluster must be able to reach the ACR endpoint, either over the internet or through a private link. Check that no NSGs or firewalls are blocking access to the ACR.
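To narrow down whether the problem is DNS or a blocked port, a quick test like the following can help. This is only a rough sketch using the Python standard library: it resolves the hostname and attempts a TCP connection on port 443. It should be run from a machine in the same VNet as the compute cluster, otherwise the result says nothing about the cluster's own network path. The function name `can_reach` is just an illustrative helper.

```python
import socket


def can_reach(host, port=443, timeout=5.0):
    """Resolve host, then try a TCP connection; return (ok, detail)."""
    try:
        infos = socket.getaddrinfo(host, port, proto=socket.IPPROTO_TCP)
    except socket.gaierror as exc:
        return False, f"DNS resolution failed: {exc}"

    # Collect the distinct IP addresses the name resolved to.
    addresses = ", ".join(sorted({info[4][0] for info in infos}))
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, f"connected; resolved to {addresses}"
    except OSError as exc:
        return False, f"resolved to {addresses} but TCP connect failed: {exc}"
```

Call it with your registry's login server, for example `can_reach("<acr-name>.azurecr.io")`. If you use a private link, the resolved address in the detail string should be a private IP from your VNet; a public IP there usually points to a DNS configuration issue, while a failed connect after a successful resolution points to an NSG or firewall rule.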