ml Package
Packages
automl |
Contains automated machine learning classes for Azure Machine Learning SDKv2. Main areas include managing AutoML tasks. |
constants |
This package defines constants used in Azure Machine Learning SDKv2. |
data_transfer | |
dsl | |
entities |
Contains entities and SDK objects for Azure Machine Learning SDKv2. Main areas include managing compute targets, creating and managing workspaces and jobs, and submitting and accessing models, runs, and run output and logging. |
finetuning |
Contains custom model finetuning classes for AzureML SDK V2. |
identity |
Contains Identity Configuration for Azure Machine Learning SDKv2. |
model_customization | |
operations |
Contains supported operations for Azure Machine Learning SDKv2. Operations are classes that contain logic to interact with backend services, usually by calling auto-generated operations. |
parallel | |
sweep |
Modules
exceptions |
Contains the exception module for Azure Machine Learning SDKv2. This includes enums and classes for exceptions. |
Classes
Input |
Initialize an Input object. |
MLClient |
A client class to interact with Azure ML services. Use this client to manage Azure ML resources such as workspaces, jobs, models, and so on. |
MpiDistribution |
MPI distribution configuration. |
Output |
Define an output. |
PyTorchDistribution |
PyTorch distribution configuration. |
RayDistribution |
Ray distribution configuration. Note: This is an experimental class, and may change at any time. Please see https://aka.ms/azuremlexperimental for more information. |
TensorFlowDistribution |
TensorFlow distribution configuration. |
Functions
command
Creates a Command object which can be used inside a dsl.pipeline function or used as a standalone Command job.
command(*, name: str | None = None, description: str | None = None, tags: Dict | None = None, properties: Dict | None = None, display_name: str | None = None, command: str | None = None, experiment_name: str | None = None, environment: str | Environment | None = None, environment_variables: Dict | None = None, distribution: Dict | MpiDistribution | TensorFlowDistribution | PyTorchDistribution | RayDistribution | DistributionConfiguration | None = None, compute: str | None = None, inputs: Dict | None = None, outputs: Dict | None = None, instance_count: int | None = None, instance_type: str | None = None, locations: List[str] | None = None, docker_args: str | None = None, shm_size: str | None = None, timeout: int | None = None, code: PathLike | str | None = None, identity: ManagedIdentityConfiguration | AmlTokenConfiguration | UserIdentityConfiguration | None = None, is_deterministic: bool = True, services: Dict[str, JobService | JupyterLabJobService | SshJobService | TensorBoardJobService | VsCodeJobService] | None = None, job_tier: str | None = None, priority: str | None = None, **kwargs: Any) -> Command
Keyword-Only Parameters
Name | Description |
---|---|
name
|
The name of the Command job or component. |
description
|
The description of the Command. Defaults to None. |
tags
|
Tag dictionary. Tags can be added, removed, and updated. Defaults to None. |
properties
|
The job property dictionary. Defaults to None. |
display_name
|
The display name of the job. Defaults to a randomly generated name. |
command
|
The command to be executed. Defaults to None. |
experiment_name
|
The name of the experiment that the job will be created under. Defaults to current directory name. |
environment
|
The environment that the job will run in. |
environment_variables
|
A dictionary of environment variable names and values. These environment variables are set on the process where user script is being executed. Defaults to None. |
distribution
|
Optional[Union[dict, PyTorchDistribution, MpiDistribution, TensorFlowDistribution, RayDistribution]]
The configuration for distributed jobs. Defaults to None. |
compute
|
The compute target the job will run on. Defaults to default compute. |
inputs
|
A mapping of input names to input data sources used in the job. Defaults to None. |
outputs
|
A mapping of output names to output data sources used in the job. Defaults to None. |
instance_count
|
The number of instances or nodes to be used by the compute target. Defaults to 1. |
instance_type
|
The type of VM to be used by the compute target. |
locations
|
The list of locations where the job will run. |
docker_args
|
Extra arguments to pass to the Docker run command. This would override any parameters that have already been set by the system, or in this section. This parameter is only supported for Azure ML compute types. Defaults to None. |
shm_size
|
The size of the Docker container's shared memory block. This should be in the format of (number)(unit) where the number has to be greater than 0 and the unit can be one of b(bytes), k(kilobytes), m(megabytes), or g(gigabytes). |
timeout
|
The time, in seconds, after which the job will be cancelled. |
code
|
The source code to run the job. Can be a local path or an "http:", "https:", or "azureml:" URL pointing to a remote location. |
identity
|
The identity that the command job will use while running on compute. |
is_deterministic
|
Specifies whether the Command will return the same output given the same input. When True, if the Command component has been run before in the current workspace with the same input and settings, it will reuse results from the previously submitted job when used as a node or step in a pipeline. In that scenario, no compute resources will be used. Default value: True
|
services
|
Optional[dict[str, Union[JobService, JupyterLabJobService, SshJobService, TensorBoardJobService, VsCodeJobService]]]
The interactive services for the node. Defaults to None. This is an experimental parameter, and may change at any time. Please see https://aka.ms/azuremlexperimental for more information. |
job_tier
|
The job tier. Accepted values are "Spot", "Basic", "Standard", or "Premium". |
priority
|
The priority of the job on the compute. Accepted values are "low", "medium", and "high". Defaults to "medium". |
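The `shm_size` format described in the table above can be checked before a job is built. The following is a minimal sketch and not part of the SDK: `parse_shm_size` is a hypothetical helper, and it assumes the number is a positive integer and the unit is a single lowercase letter, which the reference does not state explicitly.

```python
import re
from typing import Tuple

# Assumption: "(number)(unit)" means a positive integer followed by one of b, k, m, g.
_SHM_SIZE_PATTERN = re.compile(r"(\d+)([bkmg])")


def parse_shm_size(value: str) -> Tuple[int, str]:
    """Split a shm_size string such as "2g" into (number, unit), or raise ValueError."""
    match = _SHM_SIZE_PATTERN.fullmatch(value)
    if match is None:
        raise ValueError(f"shm_size must be <number><unit> with unit b, k, m, or g: {value!r}")
    number = int(match.group(1))
    if number <= 0:
        raise ValueError(f"shm_size number must be greater than 0: {value!r}")
    return number, match.group(2)


print(parse_shm_size("2g"))    # (2, 'g')
print(parse_shm_size("512m"))  # (512, 'm')
```

A string that passes this check could then be supplied as the `shm_size` keyword argument of `command()`.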
Returns
Type | Description |
---|---|
A Command object. |
Examples
Creating a Command Job using the command() builder method.
from azure.ai.ml import Input, Output, command
train_func = command(
    environment="AzureML-sklearn-1.0-ubuntu20.04-py38-cpu:33",
    command='echo "hello world"',
    distribution={"type": "Pytorch", "process_count_per_instance": 2},
    inputs={
        "training_data": Input(type="uri_folder"),
        "max_epochs": 20,
        "learning_rate": 1.8,
        "learning_rate_schedule": "time-based",
    },
    outputs={"model_output": Output(type="uri_folder")},
)
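The reuse behavior described for `is_deterministic` can be pictured as memoization keyed on the command, its inputs, and its settings. The sketch below is a conceptual illustration only and does not reflect how the Azure ML service implements reuse; `fingerprint`, `run_step`, and the in-memory cache are hypothetical names introduced here.

```python
import hashlib
import json
from typing import Any, Callable, Dict

# Hypothetical stand-in for results of previously submitted jobs in a workspace.
_results_cache: Dict[str, Any] = {}


def fingerprint(command: str, inputs: Dict[str, Any], settings: Dict[str, Any]) -> str:
    """Build a stable key from the command, its inputs, and its settings."""
    payload = json.dumps(
        {"command": command, "inputs": inputs, "settings": settings}, sort_keys=True
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def run_step(
    command: str,
    inputs: Dict[str, Any],
    settings: Dict[str, Any],
    execute: Callable[[], Any],
    is_deterministic: bool = True,
) -> Any:
    """Run a step, reusing an earlier result when the step is marked deterministic."""
    key = fingerprint(command, inputs, settings)
    if is_deterministic and key in _results_cache:
        return _results_cache[key]  # reuse: no work is performed
    result = execute()
    if is_deterministic:
        _results_cache[key] = result
    return result


calls = []


def fake_execute() -> str:
    calls.append(1)  # count how often the step really runs
    return "model-v1"


run_step("python train.py", {"max_epochs": 20}, {"instance_count": 1}, fake_execute)
run_step("python train.py", {"max_epochs": 20}, {"instance_count": 1}, fake_execute)
print(len(calls))  # 1

run_step("python train.py", {"max_epochs": 20}, {"instance_count": 1}, fake_execute,
         is_deterministic=False)
print(len(calls))  # 2
```

Setting `is_deterministic=False` in this sketch always executes the step, which mirrors the documented intent of opting out of reuse.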
load_batch_deployment
Construct a batch deployment object from a YAML file.
load_batch_deployment(source: str | PathLike | IO, *, relative_origin: str | None = None, params_override: List[Dict] | None = None, **kwargs: Any) -> BatchDeployment
Parameters
Name | Description |
---|---|
source
Required
|
The local yaml source of a batch deployment object. Must be either a path to a local file, or an already-open file. If the source is a path, it will be opened and read. An exception is raised if the file does not exist. If the source is an open file, the file will be read directly, and an exception is raised if the file is not readable. |
Keyword-Only Parameters
Name | Description |
---|---|
relative_origin
|
The origin to be used when deducing the relative locations of files referenced in the parsed yaml. Defaults to the inputted source's directory if it is a file or file path input. Defaults to "./" if the source is a stream input with no name value. |
params_override
|
Fields to overwrite on top of the yaml file. Format is [{"field1": "value1"}, {"field2": "value2"}] |
Returns
Type | Description |
---|---|
Constructed batch deployment object. |
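The defaulting rules for `relative_origin` apply to every loader on this page. As a rough sketch of those rules using only the standard library (this is not the SDK's code, and `resolve_relative_origin` is a hypothetical helper):

```python
import io
import os
from pathlib import Path
from typing import IO, Optional, Union


def resolve_relative_origin(
    source: Union[str, os.PathLike, IO],
    relative_origin: Optional[str] = None,
) -> str:
    """Pick the directory used to resolve relative paths referenced in the YAML."""
    if relative_origin is not None:
        return relative_origin  # an explicit value always wins
    if isinstance(source, (str, os.PathLike)):
        return str(Path(source).parent)  # file path input: use its directory
    name = getattr(source, "name", None)
    if isinstance(name, str):
        return str(Path(name).parent)  # open file with a name: use its directory
    return "./"  # stream input with no name value


# A path input resolves to the directory that contains the file.
print(resolve_relative_origin("configs/batch/deployment.yml"))
# An in-memory stream has no name, so the current directory is used.
print(resolve_relative_origin(io.StringIO("name: demo")))  # ./
```

When a YAML document is passed as an in-memory stream, supplying `relative_origin` explicitly keeps relative file references predictable.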
load_batch_endpoint
Construct a batch endpoint object from a YAML file.
load_batch_endpoint(source: str | PathLike | IO, relative_origin: str | None = None, *, params_override: List[Dict] | None = None, **kwargs: Any) -> BatchEndpoint
Parameters
Name | Description |
---|---|
source
Required
|
The local yaml source of a batch endpoint object. Must be either a path to a local file, or an already-open file. If the source is a path, it will be opened and read. An exception is raised if the file does not exist. If the source is an open file, the file will be read directly, and an exception is raised if the file is not readable. |
relative_origin
|
The origin to be used when deducing the relative locations of files referenced in the parsed yaml. Defaults to the inputted source's directory if it is a file or file path input. Defaults to "./" if the source is a stream input with no name value. Default value: None
|
Keyword-Only Parameters
Name | Description |
---|---|
params_override
|
Fields to overwrite on top of the yaml file. Format is [{"field1": "value1"}, {"field2": "value2"}] |
Returns
Type | Description |
---|---|
Constructed batch endpoint object. |
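The `params_override` format, a list of single-field dictionaries, can be read as a sequence of overrides applied on top of the fields parsed from the YAML file. The following is a minimal sketch of that idea, assuming top-level fields only; `apply_params_override` and the sample field values are hypothetical, and the SDK's own merge logic may differ.

```python
import copy
from typing import Any, Dict, List, Optional


def apply_params_override(
    loaded: Dict[str, Any],
    params_override: Optional[List[Dict[str, Any]]] = None,
) -> Dict[str, Any]:
    """Return a copy of `loaded` with each override applied in list order."""
    result = copy.deepcopy(loaded)
    for override in params_override or []:
        result.update(override)  # assumption: overrides target top-level fields
    return result


# Hypothetical fields as they might be parsed from a YAML file.
yaml_fields = {"name": "my-endpoint", "description": "from file"}
merged = apply_params_override(
    yaml_fields,
    [{"description": "overridden"}, {"tags": {"stage": "test"}}],
)
print(merged["description"])       # overridden
print(merged["tags"])              # {'stage': 'test'}
print(yaml_fields["description"])  # from file
```

Because later entries are applied after earlier ones, a field that appears twice in the list ends up with the last value in this sketch.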
load_capability_host
Constructs a CapabilityHost object from a YAML file.
load_capability_host(source: str | PathLike | IO, *, relative_origin: str | None = None, params_override: List[Dict] | None = None, **kwargs: Any) -> CapabilityHost
Parameters
Name | Description |
---|---|
source
Required
|
A path to a local YAML file or an already-open file object containing a capabilityhost configuration. If the source is a path, it will be opened and read. If the source is an open file, the file will be read directly. |
Keyword-Only Parameters
Name | Description |
---|---|
relative_origin
|
The root directory for the YAML. This directory will be used as the origin for deducing the relative locations of files referenced in the parsed YAML. Defaults to the same directory as source if source is a file or file path input. Defaults to "./" if the source is a stream input with no name value. |
params_override
|
Fields to overwrite on top of the yaml file. Format is [{"field1": "value1"}, {"field2": "value2"}] |
Returns
Type | Description |
---|---|
Loaded CapabilityHost object. |
Exceptions
Type | Description |
---|---|
Raised if CapabilityHost cannot be successfully validated. Details will be provided in the error message. |
Examples
Loading a capabilityhost from a YAML config file.
from azure.ai.ml import load_capability_host
capability_host = load_capability_host(
    source="./sdk/ml/azure-ai-ml/tests/test_configs/workspace/ai_workspaces/test_capability_host_hub.yml"
)
load_component
Loads a component from a local or remote source into a component function.
load_component(source: PathLike | str | IO | None = None, *, relative_origin: str | None = None, params_override: List[Dict] | None = None, **kwargs: Any) -> CommandComponent | ParallelComponent | PipelineComponent
Parameters
Name | Description |
---|---|
source
|
The local yaml source of a component. Must be either a path to a local file, or an already-open file. If the source is a path, it will be opened and read. An exception is raised if the file does not exist. If the source is an open file, the file will be read directly, and an exception is raised if the file is not readable. Default value: None
|
Keyword-Only Parameters
Name | Description |
---|---|
relative_origin
|
The origin to be used when deducing the relative locations of files referenced in the parsed yaml. Defaults to the inputted source's directory if it is a file or file path input. Defaults to "./" if the source is a stream input with no name value. |
params_override
|
Fields to overwrite on top of the yaml file. Format is [{"field1": "value1"}, {"field2": "value2"}] |
Returns
Type | Description |
---|---|
A Component object |
Examples
Loading a Component object from a YAML file, overriding its version to "1.0.2", and registering it remotely.
from azure.ai.ml import load_component
component = load_component(
    source="./sdk/ml/azure-ai-ml/tests/test_configs/components/helloworld_component.yml",
    params_override=[{"version": "1.0.2"}],
)
# Assumes ml_client is an existing, authenticated MLClient instance.
registered_component = ml_client.components.create_or_update(component)
load_compute
Construct a compute object from a yaml file.
load_compute(source: str | PathLike | IO, *, relative_origin: str | None = None, params_override: List[Dict[str, str]] | None = None, **kwargs: Any) -> Compute
Parameters
Name | Description |
---|---|
source
Required
|
The local yaml source of a compute. Must be either a path to a local file, or an already-open file. If the source is a path, it will be opened and read. An exception is raised if the file does not exist. If the source is an open file, the file will be read directly, and an exception is raised if the file is not readable. |
Keyword-Only Parameters
Name | Description |
---|---|
relative_origin
|
The origin to be used when deducing the relative locations of files referenced in the parsed yaml. Defaults to the inputted source's directory if it is a file or file path input. Defaults to "./" if the source is a stream input with no name value. |
params_override
|
Optional parameters to override in the loaded yaml. |
Returns
Type | Description |
---|---|
Loaded compute object. |
Examples
Loading a Compute object from a YAML file and overriding its description.
from azure.ai.ml import load_compute
compute = load_compute(
    "../tests/test_configs/compute/compute-vm.yaml",
    params_override=[{"description": "loaded from compute-vm.yaml"}],
)
load_connection
Construct a connection object from a YAML file.
load_connection(source: str | PathLike | IO, *, relative_origin: str | None = None, params_override: List[Dict] | None = None, **kwargs: Any) -> WorkspaceConnection
Parameters
Name | Description |
---|---|
source
Required
|
The local yaml source of a connection object. Must be either a path to a local file, or an already-open file. If the source is a path, it will be opened and read. An exception is raised if the file does not exist. If the source is an open file, the file will be read directly, and an exception is raised if the file is not readable. |
Keyword-Only Parameters
Name | Description |
---|---|
relative_origin
|
The origin to be used when deducing the relative locations of files referenced in the parsed yaml. Defaults to the inputted source's directory if it is a file or file path input. Defaults to "./" if the source is a stream input with no name value. |
params_override
|
Fields to overwrite on top of the yaml file. Format is [{"field1": "value1"}, {"field2": "value2"}] |
Returns
Type | Description |
---|---|
Connection
|
Constructed connection object. |
load_data
Construct a data object from a YAML file.
load_data(source: str | PathLike | IO, *, relative_origin: str | None = None, params_override: List[Dict] | None = None, **kwargs: Any) -> Data
Parameters
Name | Description |
---|---|
source
Required
|
The local yaml source of a data object. Must be either a path to a local file, or an already-open file. If the source is a path, it will be opened and read. An exception is raised if the file does not exist. If the source is an open file, the file will be read directly, and an exception is raised if the file is not readable. |
Keyword-Only Parameters
Name | Description |
---|---|
relative_origin
|
The origin to be used when deducing the relative locations of files referenced in the parsed yaml. Defaults to the inputted source's directory if it is a file or file path input. Defaults to "./" if the source is a stream input with no name value. |
params_override
|
Fields to overwrite on top of the yaml file. Format is [{"field1": "value1"}, {"field2": "value2"}] |
Returns
Type | Description |
---|---|
Constructed Data or DataImport object. |
Exceptions
Type | Description |
---|---|
Raised if Data cannot be successfully validated. Details will be provided in the error message. |
load_datastore
Construct a datastore object from a yaml file.
load_datastore(source: str | PathLike | IO, *, relative_origin: str | None = None, params_override: List[Dict] | None = None, **kwargs: Any) -> Datastore
Parameters
Name | Description |
---|---|
source
Required
|
The local yaml source of a datastore. Must be either a path to a local file, or an already-open file. If the source is a path, it will be opened and read. An exception is raised if the file does not exist. If the source is an open file, the file will be read directly, and an exception is raised if the file is not readable. |
Keyword-Only Parameters
Name | Description |
---|---|
relative_origin
|
The origin to be used when deducing the relative locations of files referenced in the parsed yaml. Defaults to the inputted source's directory if it is a file or file path input. Defaults to "./" if the source is a stream input with no name value. |
params_override
|
Fields to overwrite on top of the yaml file. Format is [{"field1": "value1"}, {"field2": "value2"}] |
Returns
Type | Description |
---|---|
Loaded datastore object. |
Exceptions
Type | Description |
---|---|
Raised if Datastore cannot be successfully validated. Details will be provided in the error message. |
load_environment
Construct an environment object from a YAML file.
load_environment(source: str | PathLike | IO, *, relative_origin: str | None = None, params_override: List[Dict] | None = None, **kwargs: Any) -> Environment
Parameters
Name | Description |
---|---|
source
Required
|
The local yaml source of an environment. Must be either a path to a local file, or an already-open file. If the source is a path, it will be opened and read. An exception is raised if the file does not exist. If the source is an open file, the file will be read directly, and an exception is raised if the file is not readable. |
Keyword-Only Parameters
Name | Description |
---|---|
relative_origin
|
The origin to be used when deducing the relative locations of files referenced in the parsed yaml. Defaults to the inputted source's directory if it is a file or file path input. Defaults to "./" if the source is a stream input with no name value. |
params_override
|
Fields to overwrite on top of the yaml file. Format is [{"field1": "value1"}, {"field2": "value2"}] |
Returns
Type | Description |
---|---|
Constructed environment object. |
Exceptions
Type | Description |
---|---|
Raised if Environment cannot be successfully validated. Details will be provided in the error message. |
load_feature_set
Construct a FeatureSet object from a YAML file.
load_feature_set(source: str | PathLike | IO, *, relative_origin: str | None = None, params_override: List[Dict] | None = None, **kwargs: Any) -> FeatureSet
Parameters
Name | Description |
---|---|
source
Required
|
The local yaml source of a FeatureSet object. Must be either a path to a local file, or an already-open file. If the source is a path, it will be opened and read. An exception is raised if the file does not exist. If the source is an open file, the file will be read directly, and an exception is raised if the file is not readable. |
Keyword-Only Parameters
Name | Description |
---|---|
relative_origin
|
The origin to be used when deducing the relative locations of files referenced in the parsed yaml. Defaults to the inputted source's directory if it is a file or file path input. Defaults to "./" if the source is a stream input with no name value. |
params_override
|
Fields to overwrite on top of the yaml file. Format is [{"field1": "value1"}, {"field2": "value2"}] |
Returns
Type | Description |
---|---|
Constructed FeatureSet object. |
Exceptions
Type | Description |
---|---|
Raised if FeatureSet cannot be successfully validated. Details will be provided in the error message. |
load_feature_store
Load a feature store object from a yaml file.
load_feature_store(source: str | PathLike | IO, *, relative_origin: str | None = None, params_override: List[Dict] | None = None, **kwargs: Any) -> FeatureStore
Parameters
Name | Description |
---|---|
source
Required
|
The local yaml source of a feature store. Must be either a path to a local file, or an already-open file. If the source is a path, it will be opened and read. An exception is raised if the file does not exist. If the source is an open file, the file will be read directly, and an exception is raised if the file is not readable. |
Keyword-Only Parameters
Name | Description |
---|---|
relative_origin
|
The origin to be used when deducing the relative locations of files referenced in the parsed yaml. Defaults to the inputted source's directory if it is a file or file path input. Defaults to "./" if the source is a stream input with no name value. |
params_override
|
Fields to overwrite on top of the yaml file. Format is [{"field1": "value1"}, {"field2": "value2"}] |
Returns
Type | Description |
---|---|
Loaded feature store object. |
load_feature_store_entity
Construct a FeatureStoreEntity object from a YAML file.
load_feature_store_entity(source: str | PathLike | IO, *, relative_origin: str | None = None, params_override: List[Dict] | None = None, **kwargs: Any) -> FeatureStoreEntity
Parameters
Name | Description |
---|---|
source
Required
|
The local yaml source of a FeatureStoreEntity object. Must be either a path to a local file, or an already-open file. If the source is a path, it will be opened and read. An exception is raised if the file does not exist. If the source is an open file, the file will be read directly, and an exception is raised if the file is not readable. |
Keyword-Only Parameters
Name | Description |
---|---|
relative_origin
|
The origin to be used when deducing the relative locations of files referenced in the parsed yaml. Defaults to the inputted source's directory if it is a file or file path input. Defaults to "./" if the source is a stream input with no name value. |
params_override
|
Fields to overwrite on top of the yaml file. Format is [{"field1": "value1"}, {"field2": "value2"}] |
Returns
Type | Description |
---|---|
Constructed FeatureStoreEntity object. |
Exceptions
Type | Description |
---|---|
Raised if FeatureStoreEntity cannot be successfully validated. Details will be provided in the error message. |
load_index
Note
This is an experimental method, and may change at any time. Please see https://aka.ms/azuremlexperimental for more information.
Constructs an Index object from a YAML file.
load_index(source: str | PathLike | IO, *, relative_origin: str | None = None, params_override: List[Dict] | None = None, **kwargs: Any) -> Index
Parameters
Name | Description |
---|---|
source
Required
|
A path to a local YAML file or an already-open file object containing an index configuration. If the source is a path, it will be opened and read. If the source is an open file, the file will be read directly. |
Keyword-Only Parameters
Name | Description |
---|---|
relative_origin
|
The root directory for the YAML. This directory will be used as the origin for deducing the relative locations of files referenced in the parsed YAML. Defaults to the same directory as source if source is a file or file path input. Defaults to "./" if the source is a stream input with no name value. |
params_override
|
Fields to overwrite on top of the yaml file. Format is [{"field1": "value1"}, {"field2": "value2"}] |
Returns
Type | Description |
---|---|
A loaded Index object. |
Exceptions
Type | Description |
---|---|
Raised if Index cannot be successfully validated. Details will be provided in the error message. |
load_job
Constructs a Job object from a YAML file.
load_job(source: str | PathLike | IO, *, relative_origin: str | None = None, params_override: List[Dict] | None = None, **kwargs: Any) -> Job
Parameters
Name | Description |
---|---|
source
Required
|
A path to a local YAML file or an already-open file object containing a job configuration. If the source is a path, it will be opened and read. If the source is an open file, the file will be read directly. |
Keyword-Only Parameters
Name | Description |
---|---|
relative_origin
|
The root directory for the YAML. This directory will be used as the origin for deducing the relative locations of files referenced in the parsed YAML. Defaults to the same directory as source if source is a file or file path input. Defaults to "./" if the source is a stream input with no name value. |
params_override
|
Fields to overwrite on top of the yaml file. Format is [{"field1": "value1"}, {"field2": "value2"}] |
Returns
Type | Description |
---|---|
A loaded Job object. |
Exceptions
Type | Description |
---|---|
Raised if Job cannot be successfully validated. Details will be provided in the error message. |
Examples
Loading a Job from a YAML config file.
from azure.ai.ml import load_job
job = load_job(source="./sdk/ml/azure-ai-ml/tests/test_configs/command_job/command_job_test_local_env.yml")
load_marketplace_subscription
Note
This is an experimental method, and may change at any time. Please see https://aka.ms/azuremlexperimental for more information.
load_marketplace_subscription(source: str | PathLike | IO, *, relative_origin: str | None = None, **kwargs: Any) -> MarketplaceSubscription
Parameters
Name | Description |
---|---|
source
Required
|
A path to a local YAML file or an already-open file object containing a marketplace subscription configuration. If the source is a path, it will be opened and read. If the source is an open file, the file will be read directly. |
Keyword-Only Parameters
Name | Description |
---|---|
relative_origin
|
The root directory for the YAML. This directory will be used as the origin for deducing the relative locations of files referenced in the parsed YAML. Defaults to the same directory as source if source is a file or file path input. Defaults to "./" if the source is a stream input with no name value. |
load_model
Constructs a Model object from a YAML file.
load_model(source: str | PathLike | IO, *, relative_origin: str | None = None, params_override: List[Dict] | None = None, **kwargs: Any) -> Model
Parameters
Name | Description |
---|---|
source
Required
|
A path to a local YAML file or an already-open file object containing a model configuration. If the source is a path, it will be opened and read. If the source is an open file, the file will be read directly. |
Keyword-Only Parameters
Name | Description |
---|---|
relative_origin
|
The root directory for the YAML. This directory will be used as the origin for deducing the relative locations of files referenced in the parsed YAML. Defaults to the same directory as source if source is a file or file path input. Defaults to "./" if the source is a stream input with no name value. |
params_override
|
Fields to overwrite on top of the yaml file. Format is [{"field1": "value1"}, {"field2": "value2"}] |
Returns
Type | Description |
---|---|
A loaded Model object. |
Exceptions
Type | Description |
---|---|
Raised if Model cannot be successfully validated. Details will be provided in the error message. |
Examples
Loading a Model from a YAML config file, overriding the name and version parameters.
from azure.ai.ml import load_model
model = load_model(
    source="./sdk/ml/azure-ai-ml/tests/test_configs/model/model_with_stage.yml",
    params_override=[{"name": "new_model_name"}, {"version": "1"}],
)
load_model_package
Note
This is an experimental method, and may change at any time. Please see https://aka.ms/azuremlexperimental for more information.
Constructs a ModelPackage object from a YAML file.
load_model_package(source: str | PathLike | IO, *, relative_origin: str | None = None, params_override: List[Dict] | None = None, **kwargs: Any) -> ModelPackage
Parameters
Name | Description |
---|---|
source
Required
|
A path to a local YAML file or an already-open file object containing a model package configuration. If the source is a path, it will be opened and read. If the source is an open file, the file will be read directly. |
Keyword-Only Parameters
Name | Description |
---|---|
relative_origin
|
The root directory for the YAML. This directory will be used as the origin for deducing the relative locations of files referenced in the parsed YAML. Defaults to the same directory as source if source is a file or file path input. Defaults to "./" if the source is a stream input with no name value. |
params_override
|
Fields to overwrite on top of the yaml file. Format is [{"field1": "value1"}, {"field2": "value2"}] |
Returns
Type | Description |
---|---|
A loaded ModelPackage object. |
Exceptions
Type | Description |
---|---|
Raised if ModelPackage cannot be successfully validated. Details will be provided in the error message. |
Examples
Loading a ModelPackage from a YAML config file.
from azure.ai.ml import load_model_package
model_package = load_model_package(
    "./sdk/ml/azure-ai-ml/tests/test_configs/model_package/model_package_simple.yml"
)
load_online_deployment
Construct an online deployment object from a YAML file.
load_online_deployment(source: str | PathLike | IO, *, relative_origin: str | None = None, params_override: List[Dict] | None = None, **kwargs: Any) -> OnlineDeployment
Parameters
Name | Description |
---|---|
source
Required
|
The local yaml source of an online deployment object. Must be either a path to a local file, or an already-open file. If the source is a path, it will be opened and read. An exception is raised if the file does not exist. If the source is an open file, the file will be read directly, and an exception is raised if the file is not readable. |
Keyword-Only Parameters
Name | Description |
---|---|
relative_origin
|
The origin to be used when deducing the relative locations of files referenced in the parsed yaml. Defaults to the inputted source's directory if it is a file or file path input. Defaults to "./" if the source is a stream input with no name value. |
params_override
|
Fields to overwrite on top of the yaml file. Format is [{"field1": "value1"}, {"field2": "value2"}] |
Returns
Type | Description |
---|---|
Constructed online deployment object. |
Exceptions
Type | Description |
---|---|
Raised if Online Deployment cannot be successfully validated. Details will be provided in the error message. |
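The params_override format described above, a list of small dictionaries, can be pictured as overrides applied in order on top of the parsed YAML. The following is a minimal sketch using only the standard library. The helper apply_params_override and the sample field names are hypothetical and are not part of azure.ai.ml; the SDK's actual merge logic may differ.

```python
from copy import deepcopy
from typing import Any, Dict, List


def apply_params_override(
    config: Dict[str, Any], params_override: List[Dict[str, Any]]
) -> Dict[str, Any]:
    """Return a copy of config with each override applied in list order.

    Hypothetical helper for illustration only; not the SDK implementation.
    """
    merged = deepcopy(config)
    for override in params_override:
        # Each entry is a small dict such as {"field1": "value1"}.
        merged.update(override)
    return merged


# Stand-in for the dictionary parsed from a deployment YAML file.
yaml_config = {"name": "blue", "instance_count": 1}
overrides = [{"instance_count": 2}, {"description": "scaled out for testing"}]

result = apply_params_override(yaml_config, overrides)
print(result)
```

In this sketch, later entries win over earlier ones and the original dictionary is left unchanged.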
load_online_endpoint
Construct an online endpoint object from a YAML file.
load_online_endpoint(source: str | PathLike | IO, *, relative_origin: str | None = None, params_override: List[Dict] | None = None, **kwargs: Any) -> OnlineEndpoint
Parameters
Name | Description |
---|---|
source
Required
|
The local YAML source of an online endpoint object. Must be either a path to a local file or an already-open file. If the source is a path, it will be opened and read. An exception is raised if the file does not exist. If the source is an open file, the file will be read directly, and an exception is raised if the file is not readable. |
Keyword-Only Parameters
Name | Description |
---|---|
relative_origin
|
The origin to be used when deducing the relative locations of files referenced in the parsed YAML. Defaults to the source's directory if the source is a file or file path input. Defaults to "./" if the source is a stream input with no name value. |
params_override
|
Fields to overwrite on top of the yaml file. Format is [{"field1": "value1"}, {"field2": "value2"}] |
Returns
Type | Description |
---|---|
Constructed online endpoint object. |
Exceptions
Type | Description |
---|---|
Raised if Online Endpoint cannot be successfully validated. Details will be provided in the error message. |
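The relative_origin defaults described above can be illustrated with pathlib. This is a sketch of the documented behaviour only. The helpers default_relative_origin and resolve_reference are hypothetical names and not SDK functions, and the handling of a named stream is an assumption based on the "stream input with no name value" wording.

```python
import io
import os
from pathlib import Path
from typing import IO, Union


def default_relative_origin(source: Union[str, os.PathLike, IO]) -> Path:
    """Pick an origin directory following the documented defaults (illustrative)."""
    if isinstance(source, (str, os.PathLike)):
        # File path input: use the directory that contains the file.
        return Path(source).parent
    name = getattr(source, "name", None)
    if isinstance(name, str):
        # Assumption: a stream that carries a file name uses that file's directory.
        return Path(name).parent
    # Stream input with no name value.
    return Path("./")


def resolve_reference(origin: Path, reference: str) -> Path:
    """Resolve a relative file reference found in the YAML against the origin."""
    return origin / reference


origin_from_path = default_relative_origin("configs/endpoint.yml")
origin_from_stream = default_relative_origin(io.StringIO("name: my-endpoint"))
print(origin_from_path, origin_from_stream)
print(resolve_reference(origin_from_path, "code/score.py"))
```

Passing relative_origin explicitly is the way to control this when the YAML arrives as an unnamed stream but still refers to files on disk.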
load_registry
Load a registry object from a yaml file.
load_registry(source: str | PathLike | IO, *, relative_origin: str | None = None, params_override: List[Dict] | None = None, **kwargs: Any) -> Registry
Parameters
Name | Description |
---|---|
source
Required
|
The local YAML source of a registry. Must be either a path to a local file or an already-open file. If the source is a path, it will be opened and read. An exception is raised if the file does not exist. If the source is an open file, the file will be read directly, and an exception is raised if the file is not readable. |
Keyword-Only Parameters
Name | Description |
---|---|
relative_origin
|
The origin to be used when deducing the relative locations of files referenced in the parsed YAML. Defaults to the source's directory if the source is a file or file path input. Defaults to "./" if the source is a stream input with no name value. |
params_override
|
Fields to overwrite on top of the yaml file. Format is [{"field1": "value1"}, {"field2": "value2"}] |
Returns
Type | Description |
---|---|
Loaded registry object. |
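The two accepted forms of source, a path or an already-open file, can be sketched as follows. This standard-library example only mimics the behaviour described in the parameter table. The helper read_yaml_text is hypothetical, it returns the raw text rather than a Registry object, and the exception types the SDK itself raises may differ.

```python
import io
import os
import tempfile
from typing import IO, Union


def read_yaml_text(source: Union[str, os.PathLike, IO]) -> str:
    """Return raw YAML text from a path or an already-open file (illustrative)."""
    if isinstance(source, (str, os.PathLike)):
        # open() raises FileNotFoundError when the path does not exist.
        with open(source, "r", encoding="utf-8") as handle:
            return handle.read()
    if not source.readable():
        raise ValueError("source is not readable")
    # Open file: read it directly.
    return source.read()


# Path input: write a small YAML file, then read it back.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "registry.yml")
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("name: my-registry\n")
    text_from_path = read_yaml_text(path)

# Stream input: the same content supplied as an open file-like object.
text_from_stream = read_yaml_text(io.StringIO("name: my-registry\n"))
print(text_from_path == text_from_stream)
```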
load_serverless_endpoint
Note
This is an experimental method, and may change at any time. Please see https://aka.ms/azuremlexperimental for more information.
load_serverless_endpoint(source: str | PathLike | IO, *, relative_origin: str | None = None, **kwargs: Any) -> ServerlessEndpoint
Parameters
Name | Description |
---|---|
source
Required
|
The local YAML source of a serverless endpoint. Must be either a path to a local file or an already-open file. If the source is a path, it will be opened and read. If the source is an open file, the file will be read directly. |
Keyword-Only Parameters
Name | Description |
---|---|
relative_origin
|
The origin to be used when deducing the relative locations of files referenced in the parsed YAML. Defaults to the source's directory if the source is a file or file path input. Defaults to "./" if the source is a stream input with no name value. |
load_workspace
Load a workspace object from a yaml file. This includes workspace sub-classes like hubs and projects.
load_workspace(source: str | PathLike | IO, *, relative_origin: str | None = None, params_override: List[Dict] | None = None, **kwargs: Any) -> Workspace
Parameters
Name | Description |
---|---|
source
Required
|
The local YAML source of a workspace. Must be either a path to a local file or an already-open file. If the source is a path, it will be opened and read. An exception is raised if the file does not exist. If the source is an open file, the file will be read directly, and an exception is raised if the file is not readable. |
Keyword-Only Parameters
Name | Description |
---|---|
relative_origin
|
The origin to be used when deducing the relative locations of files referenced in the parsed YAML. Defaults to the source's directory if the source is a file or file path input. Defaults to "./" if the source is a stream input with no name value. |
params_override
|
Fields to overwrite on top of the yaml file. Format is [{"field1": "value1"}, {"field2": "value2"}] |
Returns
Type | Description |
---|---|
Loaded workspace object. |
Examples
Loading a Workspace from a YAML config file.
from azure.ai.ml import load_workspace

ws = load_workspace(
    "../tests/test_configs/workspace/workspace_min.yaml",
    params_override=[{"description": "loaded from workspace_min.yaml"}],
)
load_workspace_connection
Deprecated: use load_connection instead. Construct a connection object from a YAML file.
load_workspace_connection(source: str | PathLike | IO, *, relative_origin: str | None = None, **kwargs: Any) -> WorkspaceConnection
Parameters
Name | Description |
---|---|
source
Required
|
The local YAML source of a connection object. Must be either a path to a local file or an already-open file. If the source is a path, it will be opened and read. An exception is raised if the file does not exist. If the source is an open file, the file will be read directly, and an exception is raised if the file is not readable. |
Keyword-Only Parameters
Name | Description |
---|---|
relative_origin
|
The origin to be used when deducing the relative locations of files referenced in the parsed YAML. Defaults to the source's directory if the source is a file or file path input. Defaults to "./" if the source is a stream input with no name value. |
Returns
Type | Description |
---|---|
Connection
|
Constructed connection object. |
spark
Creates a Spark object which can be used inside a dsl.pipeline function or used as a standalone Spark job.
spark(*, experiment_name: str | None = None, name: str | None = None, display_name: str | None = None, description: str | None = None, tags: Dict | None = None, code: PathLike | str | None = None, entry: Dict[str, str] | SparkJobEntry | None = None, py_files: List[str] | None = None, jars: List[str] | None = None, files: List[str] | None = None, archives: List[str] | None = None, identity: Dict[str, str] | ManagedIdentity | AmlToken | UserIdentity | None = None, driver_cores: int | None = None, driver_memory: str | None = None, executor_cores: int | None = None, executor_memory: str | None = None, executor_instances: int | None = None, dynamic_allocation_enabled: bool | None = None, dynamic_allocation_min_executors: int | None = None, dynamic_allocation_max_executors: int | None = None, conf: Dict[str, str] | None = None, environment: str | Environment | None = None, inputs: Dict | None = None, outputs: Dict | None = None, args: str | None = None, compute: str | None = None, resources: Dict | SparkResourceConfiguration | None = None, **kwargs: Any) -> Spark
Keyword-Only Parameters
Name | Description |
---|---|
experiment_name
|
The name of the experiment the job will be created under. |
name
|
The name of the job. |
display_name
|
The job display name. |
description
|
The description of the job. Defaults to None. |
tags
|
The dictionary of tags for the job. Tags can be added, removed, and updated. Defaults to None. |
code
|
The source code to run the job. Can be a local path or "http:", "https:", or "azureml:" url pointing to a remote location. |
entry
|
The file or class entry point. |
py_files
|
The list of .zip, .egg or .py files to place on the PYTHONPATH for Python apps. Defaults to None. |
jars
|
The list of .JAR files to include on the driver and executor classpaths. Defaults to None. |
files
|
The list of files to be placed in the working directory of each executor. Defaults to None. |
archives
|
The list of archives to be extracted into the working directory of each executor. Defaults to None. |
identity
|
Optional[Union[dict[str, str], ManagedIdentityConfiguration, AmlTokenConfiguration, UserIdentityConfiguration]]
The identity that the Spark job will use while running on compute. |
driver_cores
|
The number of cores to use for the driver process, only in cluster mode. |
driver_memory
|
The amount of memory to use for the driver process, formatted as a string with a size unit suffix ("k", "m", "g" or "t"), for example "512m" or "2g". |
executor_cores
|
The number of cores to use on each executor. |
executor_memory
|
The amount of memory to use per executor process, formatted as a string with a size unit suffix ("k", "m", "g" or "t"), for example "512m" or "2g". |
executor_instances
|
The initial number of executors. |
dynamic_allocation_enabled
|
Whether to use dynamic resource allocation, which scales the number of executors registered with this application up and down based on the workload. |
dynamic_allocation_min_executors
|
The lower bound for the number of executors if dynamic allocation is enabled. |
dynamic_allocation_max_executors
|
The upper bound for the number of executors if dynamic allocation is enabled. |
conf
|
A dictionary of predefined Spark configuration keys and values. Defaults to None. |
environment
|
The Azure ML environment to run the job in. |
inputs
|
A mapping of input names to input data used in the job. Defaults to None. |
outputs
|
A mapping of output names to output data used in the job. Defaults to None. |
args
|
The arguments for the job. |
compute
|
The compute resource the job runs on. |
resources
|
The compute resource configuration for the job. |
Returns
Type | Description |
---|---|
A Spark object. |
Examples
Building a Spark pipeline using the DSL pipeline decorator
from azure.ai.ml import Input, Output, dsl, spark
from azure.ai.ml.constants import AssetTypes, InputOutputModes

# define the spark task
first_step = spark(
    code="/src",
    entry={"file": "add_greeting_column.py"},
    py_files=["utils.zip"],
    files=["my_files.txt"],
    driver_cores=2,
    driver_memory="1g",
    executor_cores=1,
    executor_memory="1g",
    executor_instances=1,
    inputs=dict(
        file_input=Input(path="/dataset/iris.csv", type=AssetTypes.URI_FILE, mode=InputOutputModes.DIRECT)
    ),
    args="--file_input ${{inputs.file_input}}",
    resources={"instance_type": "standard_e4s_v3", "runtime_version": "3.3.0"},
)

second_step = spark(
    code="/src",
    entry={"file": "count_by_row.py"},
    jars=["scala_project.jar"],
    files=["my_files.txt"],
    driver_cores=2,
    driver_memory="1g",
    executor_cores=1,
    executor_memory="1g",
    executor_instances=1,
    inputs=dict(
        file_input=Input(path="/dataset/iris.csv", type=AssetTypes.URI_FILE, mode=InputOutputModes.DIRECT)
    ),
    outputs=dict(output=Output(type="uri_folder", mode=InputOutputModes.DIRECT)),
    args="--file_input ${{inputs.file_input}} --output ${{outputs.output}}",
    resources={"instance_type": "standard_e4s_v3", "runtime_version": "3.3.0"},
)


# Define pipeline
@dsl.pipeline(description="submit a pipeline with spark job")
def spark_pipeline_from_builder(data):
    add_greeting_column = first_step(file_input=data)
    count_by_row = second_step(file_input=data)
    return {"output": count_by_row.outputs.output}


pipeline = spark_pipeline_from_builder(
    data=Input(path="/dataset/iris.csv", type=AssetTypes.URI_FILE, mode=InputOutputModes.DIRECT),
)
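The driver_memory and executor_memory parameters take strings with a size unit suffix such as "512m" or "2g". The sketch below shows one way to check such a value before passing it to spark(). The helper memory_to_bytes is hypothetical and not part of azure.ai.ml, and treating the suffixes as binary multiples (1k = 1024 bytes) is an assumption.

```python
import re

# Assumption: suffixes are binary multiples (k = 1024 bytes, m = 1024**2, ...).
_UNIT_FACTORS = {"k": 1024, "m": 1024**2, "g": 1024**3, "t": 1024**4}
_MEMORY_PATTERN = re.compile(r"^(\d+)([kmgt])$")


def memory_to_bytes(value: str) -> int:
    """Convert a string such as "512m" or "2g" to a byte count (illustrative)."""
    match = _MEMORY_PATTERN.match(value.strip().lower())
    if match is None:
        raise ValueError(f"expected a number followed by k, m, g or t, got {value!r}")
    number, unit = match.groups()
    return int(number) * _UNIT_FACTORS[unit]


print(memory_to_bytes("512m"))
print(memory_to_bytes("2g"))
```

A check like this could run before job submission so that a malformed value such as "2gb" is rejected locally rather than after the job has been sent to the service.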