Azure SDK v2 - unable to download job output using mlclient

Question

Kuwar,Rakesh 5

I have been using this functionality for the last few months, but I recently started facing an issue with the code below. I'm running it in JupyterLab and cannot download the prediction results. However, the batch endpoint is invoked and I am able to stream the job. I can also see the prediction output saved in Azure Blob Storage.

I'm also not getting any error message.


# invoke the endpoint for batch scoring job
print("Invoking the Batch Endpoint...")
job = ml_client.batch_endpoints.invoke(
    endpoint_name=batch_endpoint_name,
    input=score_dataset_input,
    deployment_name=batch_deployment_name,
    params_override=[
        {"mini_batch_size": str(mini_batch_size)},
        {"compute.instance_count": str(compute_instance_count)},
        {"output_file_name": f"{prediction_output_file_name}.csv"},
    ],
)
print("Batch Endpoint invoked with the provided payload....")

print("Streaming the Job...")
job_name = job.name
batch_job_stream = ml_client.jobs.stream(name=job_name)

# download the job logs and output
ml_client.jobs.download(
    batch_job.name,
    download_path=f"{batch_prediction_dir}/csv/",
    output_name="predictions",
)

2 answers

Answer 1

Amira Bedhiafi 31,296

Make sure that the identity running the code has the necessary permissions to read from the Azure Blob Storage where the outputs are stored.

Also, you can check the job's status to verify that it has indeed finished.

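   # SDK v2 job objects expose a status attribute; re-fetch the job to get the latest value
   job_status = ml_client.jobs.get(job.name).status
   print(f"Job status: {job_status}")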

Only proceed to download if the status indicates completion.
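
The "check the status, then download" step can be wrapped in a small polling helper. This is a minimal sketch, not the SDK's own mechanism: it assumes `ml_client.jobs.get(name)` returns an object with a `status` string, and that `"Completed"`, `"Failed"`, and `"Canceled"` are the terminal values. The helper name `wait_for_job` and its parameters are hypothetical.

```python
import time

# Assumed terminal status strings; adjust if your SDK version reports others.
TERMINAL_STATUSES = {"Completed", "Failed", "Canceled"}


def wait_for_job(ml_client, job_name, poll_seconds=30.0, timeout_seconds=3600.0):
    """Poll a job until it reaches a terminal status and return that status."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        # Re-fetch the job each time so the status is current.
        status = ml_client.jobs.get(job_name).status
        if status in TERMINAL_STATUSES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"Job {job_name} still in status {status!r} after timeout"
            )
        time.sleep(poll_seconds)
```

A caller would then run the download only when `wait_for_job(ml_client, job.name)` returns `"Completed"`, and inspect the job in the studio otherwise.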

You mentioned that there aren't any error messages. Try catching any potential exceptions that might be thrown silently.

   try:
       ml_client.jobs.download(batch_job.name, 
                               download_path=f"{batch_prediction_dir}/csv/", 
                               output_name="predictions")
   except Exception as e:
       print(f"Error encountered: {e}")

Ensure that the download_path you provided exists and is accessible. You can use the os module to check or create the directory:

   import os
   download_dir = f"{batch_prediction_dir}/csv/"
   if not os.path.exists(download_dir):
       os.makedirs(download_dir)

Check if batch_job.name actually corresponds to a job that exists. You can list the jobs and see if your job is there.

   jobs = ml_client.jobs.list()
   for j in jobs:
       print(j.name)

Share the output for each step so we can help you :)

Tadikonda Tarun HYD DIWID23 20 Reputation points

2024-01-24T08:16:52.4133333+00:00

I am facing a similar issue. The download function downloads the whole artifact if output_name is not provided, but when a specific file or folder name is assigned to output_name, it does not work.
Amira Bedhiafi 31,296 Reputation points

2024-01-24T08:44:38.3366667+00:00

@Tadikonda Tarun HYD DIWID23 you can share your issue and get back to us :)
Tadikonda Tarun HYD DIWID23 20 Reputation points

2024-01-24T08:50:35.1266667+00:00

https://learn.microsoft.com/en-us/answers/questions/1510213/unable-to-download-output-of-job-using-mlclient-az

Answer 2

Hi, I ran into the same issue. I realized that ml_client.jobs.download is for downloading named outputs, not arbitrary files produced during the run. When you submit the job, you need to declare your outputs.

    from azure.ai.ml import Input, Output, command
    from azure.ai.ml.constants import AssetTypes, InputOutputModes

    job = command(
        inputs={
            "data": Input(
                path=data_asset.id,
                type=AssetTypes.URI_FOLDER,
                mode=InputOutputModes.RO_MOUNT,
            )
        },
        code=".",  # location of source code
        # --weights-id job_id:models/step_{step_number}/lora_adapter
        # command="pip install -r requirements.txt && python train.py --data-path ${{inputs.data}} --weights-id None"
        command="pip install -r requirements.txt && python train.py --data-path ${{inputs.data}}",
        # named output: "model" is the name to pass as output_name when downloading
        outputs={
            "model": Output(
                type="uri_folder",
                path="models",
            )
        },
        compute=compute_target,
        environment=env,
        display_name=display_name,
    )
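
With an output declared under the name `model`, the download call can refer to it by that name. A minimal sketch, assuming the job has already completed; the helper `download_named_output` is hypothetical, while `download_path` and `output_name` are the keyword arguments already used in the question.

```python
from pathlib import Path


def download_named_output(ml_client, job_name, output_name, download_dir):
    """Create the target folder, then download one named output of a job into it."""
    target = Path(download_dir)
    target.mkdir(parents=True, exist_ok=True)
    ml_client.jobs.download(
        name=job_name,
        download_path=str(target),
        output_name=output_name,  # must match a key declared in outputs={...}
    )
    return target
```

For the job above this would be called as `download_named_output(ml_client, job.name, "model", "./downloads")`.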

An alternative is to download the outputs straight from Azure Blob Storage without going through ml_client. You can use the Azure Blob Storage APIs for this.
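
A rough sketch of the Blob Storage route, assuming the `azure-storage-blob` package and a container client that offers `list_blobs(name_starts_with=...)` and `download_blob(...).readall()`. The helper name `download_prefix` and the prefix you pass in are hypothetical; look up the real container and folder of your job output in the studio.

```python
from pathlib import Path


def download_prefix(container_client, prefix, download_dir):
    """Download every blob whose name starts with `prefix` into `download_dir`.

    `container_client` is expected to behave like
    azure.storage.blob.ContainerClient (list_blobs / download_blob).
    """
    saved = []
    for blob in container_client.list_blobs(name_starts_with=prefix):
        # Mirror the blob's path under the local download directory.
        target = Path(download_dir) / blob.name
        target.parent.mkdir(parents=True, exist_ok=True)
        # readall() loads the whole blob into memory; fine for small prediction files.
        target.write_bytes(container_client.download_blob(blob.name).readall())
        saved.append(target)
    return saved
```

A real client could be created with `ContainerClient.from_container_url(container_url, credential=DefaultAzureCredential())`, where `container_url` is a placeholder for your storage container URL and the identity needs read access to it.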

In my case, I wanted to download model weights from job outputs. A better way of doing this is to register the model and use

ml_client.models.download(name="my_model_name", version="1", download_path=".")
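
The registration step before that download might look like the sketch below. It assumes the documented `azureml://jobs/<job-name>/outputs/<output-name>/paths/<relative-path>` URI form for job outputs and the `Model` entity from `azure.ai.ml.entities`; the helper names `job_output_uri` and `register_model_from_job` are hypothetical.

```python
def job_output_uri(job_name, output_name, relative_path=""):
    """Build an azureml:// URI pointing into a named output of a job."""
    return f"azureml://jobs/{job_name}/outputs/{output_name}/paths/{relative_path}"


def register_model_from_job(ml_client, job_name, output_name, model_name):
    """Register a job's named output folder as a custom model."""
    # Imported here so the URI helper above can be used without the SDK installed.
    from azure.ai.ml.constants import AssetTypes
    from azure.ai.ml.entities import Model

    model = Model(
        name=model_name,
        path=job_output_uri(job_name, output_name),
        type=AssetTypes.CUSTOM_MODEL,
    )
    return ml_client.models.create_or_update(model)
```

After `register_model_from_job(ml_client, job.name, "model", "my_model_name")`, the returned model's version can be passed to `ml_client.models.download`.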
