Azure SDK v2 - unable to download job output using mlclient

Kuwar,Rakesh 5 Reputation points
2023-09-18T07:07:06.7433333+00:00

I have been using this functionality for the last few months, but recently started facing an issue with the code below. I'm running the code in JupyterLab and am unable to download the prediction results. However, the batch endpoint is invoked and I am able to stream the job. Additionally, I can see the prediction output saved in Azure Blob Storage.

I'm also not getting any error message.


# invoke the endpoint for batch scoring job
print("Invoking the Batch Endpoint...")
job = ml_client.batch_endpoints.invoke(
    endpoint_name=batch_endpoint_name,
    input=score_dataset_input,
    deployment_name=batch_deployment_name,
    params_override=[
        {"mini_batch_size": str(mini_batch_size)},
        {"compute.instance_count": str(compute_instance_count)},
        {"output_file_name": f"{prediction_output_file_name}.csv"},
    ],
)
print("Batch Endpoint invoked with the provided payload....")

print("Streaming the Job...")
job_name = job.name
batch_job_stream = ml_client.jobs.stream(name=job_name)

# download the job logs and output
ml_client.jobs.download(batch_job.name, 
                    download_path= f"{batch_prediction_dir}/csv/", 
                    output_name="predictions")

Azure Machine Learning

2 answers

  1. Amira Bedhiafi 31,296 Reputation points
    2023-09-18T12:41:03.6633333+00:00

    Make sure that the identity running the code has the necessary permissions to read from the Azure Blob Storage where the outputs are stored.

    Also, you can check the job's status to verify that it has indeed finished.

       # Re-fetch the job to read its current status
       job_status = ml_client.jobs.get(job.name).status
       print(f"Job status: {job_status}")
    

    Only proceed to download if the status indicates completion.
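
The status check above can be turned into a small polling helper. This is a minimal sketch, not part of the original answer: `wait_for_job` and `TERMINAL_STATES` are hypothetical names, and the set of terminal status strings is an assumption that should be confirmed against your SDK version. The helper relies only on `ml_client.jobs.get(name).status`.

```python
import time

# Assumed terminal status strings; confirm them for your SDK version.
TERMINAL_STATES = {"Completed", "Failed", "Canceled"}


def wait_for_job(ml_client, job_name, poll_seconds=30, timeout_seconds=3600):
    """Poll ml_client.jobs.get(job_name).status until the job reaches a terminal state."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = ml_client.jobs.get(job_name).status
        if status in TERMINAL_STATES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Job {job_name} is still in state {status!r}")
        time.sleep(poll_seconds)
```

For example, call `wait_for_job(ml_client, job.name)` and only run `ml_client.jobs.download(...)` when it returns `"Completed"`.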

    You mentioned that there aren't any error messages. Wrap the download call in a try/except block so that any exception it raises is printed instead of going unnoticed.

       try:
           ml_client.jobs.download(batch_job.name, 
                                   download_path=f"{batch_prediction_dir}/csv/", 
                                   output_name="predictions")
       except Exception as e:
           print(f"Error encountered: {e}")
    

    Ensure that the download_path you provided exists and is accessible. You can use the os module to check or create the directory:

       import os
       download_dir = f"{batch_prediction_dir}/csv/"
       os.makedirs(download_dir, exist_ok=True)  # no error if it already exists
    

    Check if batch_job.name actually corresponds to a job that exists. You can list the jobs and see if your job is there.

       jobs = ml_client.jobs.list()
       for j in jobs:
           print(j.name)
    

    Share the output for each step so we can help you :)


  2. Rishi Malhotra 0 Reputation points Microsoft Employee
    2024-12-02T04:57:17.08+00:00

    Hi, I was running into the same issue. I realized that ml_client.jobs.download downloads named outputs, not arbitrary files produced during the run. When you submit the job, you need to declare its outputs.

        from azure.ai.ml import command, Input, Output
        from azure.ai.ml.constants import AssetTypes, InputOutputModes

        job = command(
            inputs={
                "data": Input(
                    path=data_asset.id,
                    type=AssetTypes.URI_FOLDER,
                    mode=InputOutputModes.RO_MOUNT,
                )
            },
            code=".",  # location of source code
            # --weights-id job_id:models/step_{step_number}/lora_adapter
            # command="pip install -r requirements.txt && python train.py --data-path ${{inputs.data}} --weights-id None"
            command='pip install -r requirements.txt && python train.py --data-path ${{inputs.data}}',
            # named output; this key is what jobs.download expects as output_name
            outputs={
                "model": Output(
                    type="uri_folder",
                    path="models",
                )
            },
            compute=compute_target,
            environment=env,
            display_name=display_name,
        )
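
Once the job declares a named output, that name is what you pass as `output_name`. Below is a rough sketch under that assumption. `download_named_output` is a hypothetical helper, not an SDK function; it calls `ml_client.jobs.download(name=..., download_path=..., output_name=...)` and then lists whatever files ended up under the target directory, so a download that silently produced nothing becomes visible.

```python
from pathlib import Path


def download_named_output(ml_client, job_name, output_name, download_dir):
    """Download one named output of a job and return the files that were written."""
    target = Path(download_dir)
    target.mkdir(parents=True, exist_ok=True)  # make sure the destination exists

    ml_client.jobs.download(
        name=job_name,
        download_path=str(target),
        output_name=output_name,  # must match a key in the job's outputs
    )

    # Collect every file under the target so an empty download is easy to spot.
    return sorted(p for p in target.rglob("*") if p.is_file())
```

For the job above, the output name would be `"model"`. An empty list as the return value suggests that the name did not match a declared output or that nothing was written to it.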
    
    
    
    

    An alternative is to download the outputs straight from Azure Blob Storage without going through ml_client.jobs.download. Use the Azure Blob Storage APIs for this.
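
As a sketch of the Blob Storage route: the helper below takes an `azure.storage.blob.ContainerClient` and copies every blob under a prefix into a local folder. `download_blob_prefix`, the prefix, and the local directory are hypothetical; the container and path where your job wrote its output have to be looked up in the job details or the storage account. It also assumes a flat namespace, where every listed blob is a file.

```python
from pathlib import Path

# Building the client is left to the caller, for example (placeholders, not real values):
#   from azure.identity import DefaultAzureCredential
#   from azure.storage.blob import ContainerClient
#   client = ContainerClient(account_url="https://<account>.blob.core.windows.net",
#                            container_name="<container>",
#                            credential=DefaultAzureCredential())


def download_blob_prefix(container_client, prefix, local_dir):
    """Copy every blob whose name starts with `prefix` into `local_dir`."""
    local_root = Path(local_dir)
    saved = []
    for blob in container_client.list_blobs(name_starts_with=prefix):
        data = container_client.download_blob(blob.name).readall()
        destination = local_root / blob.name  # keep the blob's folder structure
        destination.parent.mkdir(parents=True, exist_ok=True)
        destination.write_bytes(data)
        saved.append(destination)
    return saved
```

The function returns the local paths it wrote, which makes it easy to confirm that the prediction file was actually found under the prefix.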

    In my case, I wanted to download model weights from job outputs. A better way of doing this is to register the model and use

    ml_client.models.download(name="my_model_name", version="1", download_path=".")
    