ResourceNotReady error when deploying an imported Hugging Face model for inference
Hello, I'm a student trying to create an endpoint and deploy a model from Hugging Face for inference. This is the model I have imported: https://huggingface.co/defog/sqlcoder-7b

I am on an Azure for Students account and have a compute instance running on STANDARD_E4DS_V4. I followed this notebook within Azure ML Studio to import the model: https://github.com/Azure/azureml-examples/blob/main/sdk/python/foundation-models/system/import/import_model_into_registry.ipynb

The import succeeds, but I run into problems when creating an endpoint and deploying. For deployment I am following this notebook: https://github.com/Azure/azureml-examples/blob/main/sdk/python/using-mlflow/deploy/mlflow_sdk_online_endpoints.ipynb
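For reference, `deployment_config_path` in the failing cell below points at a small JSON deployment config that the notebook writes out in an earlier cell. The values here are the notebook's defaults, which I assume I kept (the instance type in particular is an assumption on my part, and a 7B model may well need a larger SKU):

```python
import json

# Deployment config as written earlier in the notebook. These are the
# notebook's default values (an assumption), not a configuration verified
# to work for sqlcoder-7b.
deployment_config = {"instance_type": "Standard_DS3_v2", "instance_count": 1}

deployment_config_path = "deployment_config.json"
with open(deployment_config_path, "w") as f:
    json.dump(deployment_config, f, indent=2)

# Sanity-check that the file round-trips as valid JSON.
with open(deployment_config_path) as f:
    assert json.load(f) == deployment_config
```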
However, I am encountering this error when running the following cell:
deployment = deployment_client.create_deployment(
    name=deployment_name,
    endpoint=endpoint_name,
    model_uri=f"models:/{model_name}/{version}",
    config={"deploy-config-file": deployment_config_path},
)
---------------------------------------------------------------------------
OperationFailed                           Traceback (most recent call last)
File /anaconda/envs/azureml_py310_sdkv2/lib/python3.10/site-packages/azure/core/polling/base_polling.py:466, in LROBasePolling.run(self)
    465 try:
--> 466     self._poll()
    468 except BadStatus as err:

File /anaconda/envs/azureml_py310_sdkv2/lib/python3.10/site-packages/azure/core/polling/base_polling.py:500, in LROBasePolling._poll(self)
    499 if _failed(self.status()):
--> 500     raise OperationFailed("Operation failed or canceled")
    502 final_get_url = self._operation.get_final_get_url(self._pipeline_response)

OperationFailed: Operation failed or canceled

During handling of the above exception, another exception occurred:

HttpResponseError                         Traceback (most recent call last)
Cell In[23], line 1
----> 1 deployment = deployment_client.create_deployment(
      2     name=deployment_name,
      3     endpoint=endpoint_name,
      4     model_uri=f"models:/{model_name}/{version}",
      5     config={"deploy-config-file": deployment_config_path},
      6 )

File /anaconda/envs/azureml_py310_sdkv2/lib/python3.10/site-packages/azureml/mlflow/deploy/deployment_client.py:137, in AzureMLDeploymentClient.create_deployment(self, name, model_uri, flavor, config, endpoint)
    134     deployment = self._v1_create_deployment(name, model_name, model_version, config,
    135                                             v1_deploy_config, no_wait)
    136 else:
--> 137     deployment = self._v2_create_deployment_new(name, model_name, model_version, v2_deploy_config, endpoint)
    139 if 'flavor' not in deployment:
    140     deployment['flavor'] = flavor if flavor else 'python_function'

File /anaconda/envs/azureml_py310_sdkv2/lib/python3.10/site-packages/azureml/mlflow/deploy/deployment_client.py:522, in AzureMLDeploymentClient._v2_create_deployment_new(self, name, model_name, model_version, v2_deploy_config, endpoint)
    520 # Create Deployment using v2_deploy_config
    521 endpoint_name = endpoint if endpoint else name
--> 522 self._mir_client.create_online_deployment(deployment_config=v2_deploy_config,
    523                                           deployment_name=name,
    524                                           endpoint_name=endpoint_name, model_name=model_name,
    525                                           model_version=model_version)
    527 if not endpoint:
    528     _logger.info('Updating endpoint to serve 100 percent traffic to deployment {}'.format(name))

File /anaconda/envs/azureml_py310_sdkv2/lib/python3.10/site-packages/azureml/mlflow/deploy/_mir/mir_deployment_client.py:132, in MirDeploymentClient.create_online_deployment(self, deployment_config, deployment_name, endpoint_name, model_name, model_version, **kwargs)
    130 if no_wait is False:
    131     _logger.info("Creating deployment {}".format(deployment_name))
--> 132     poller.result(timeout=3600)

File /anaconda/envs/azureml_py310_sdkv2/lib/python3.10/site-packages/azure/core/polling/_poller.py:230, in LROPoller.result(self, timeout)
    222 def result(self, timeout: Optional[float] = None) -> PollingReturnType:
    223     """Return the result of the long running operation, or
    224     the result available after the specified timeout.
    225
   (...)
    228     :raises ~azure.core.exceptions.HttpResponseError: Server problem with the query.
    229     """
--> 230     self.wait(timeout)
    231     return self._polling_method.resource()
File /anaconda/envs/azureml_py310_sdkv2/lib/python3.10/site-packages/azure/core/tracing/decorator.py:76, in distributed_trace.