automl Package
Contains automated machine learning classes for Azure Machine Learning SDKv2.
Main areas include managing AutoML tasks.
Classes
ClassificationJob |
Configuration for AutoML Classification Job. Initialize a new AutoML Classification task. |
ColumnTransformer |
Column transformer settings. |
ForecastingJob |
Configuration for AutoML Forecasting Task. Initialize a new AutoML Forecasting task. |
ForecastingSettings |
Forecasting settings for an AutoML Job. |
ImageClassificationJob |
Configuration for AutoML multi-class Image Classification job. |
ImageClassificationMultilabelJob |
Configuration for AutoML multi-label Image Classification job. |
ImageClassificationSearchSpace |
Search space for AutoML Image Classification and Image Classification Multilabel tasks. |
ImageInstanceSegmentationJob |
Configuration for AutoML Image Instance Segmentation job. |
ImageLimitSettings |
Limit settings for AutoML Image Verticals. ImageLimitSettings contains the following parameters: max_concurrent_trials, max_trials, and timeout_minutes. It is an optional way to configure limits such as timeouts. Note: The number of concurrent runs is gated on the resources available in the specified compute target. Ensure that the compute target has the available resources for the desired concurrency. Tip: It's good practice to match the max_concurrent_trials count with the number of nodes in the cluster. For example, if you have a cluster with 4 nodes, set max_concurrent_trials to 4. |
ImageModelSettingsClassification |
Model settings for AutoML Image Classification tasks. |
ImageModelSettingsObjectDetection |
Model settings for AutoML Image Object Detection and Image Instance Segmentation tasks. |
ImageObjectDetectionJob |
Configuration for AutoML Image Object Detection job. |
ImageObjectDetectionSearchSpace |
Search space for AutoML Image Object Detection and Image Instance Segmentation tasks. |
ImageSweepSettings |
Sweep settings for all AutoML Image Verticals. The early_termination keyword sets the type of early termination policy: BanditPolicy, MedianStoppingPolicy, or TruncationSelectionPolicy. |
NlpFeaturizationSettings |
Featurization settings for all AutoML NLP Verticals. |
NlpFixedParameters |
Configuration of fixed parameters for all candidates of an AutoML NLP Job. |
NlpLimitSettings |
Limit settings for all AutoML NLP Verticals. |
NlpSearchSpace |
Search space for AutoML NLP tasks. |
NlpSweepSettings |
Sweep settings for all AutoML NLP tasks. |
RegressionJob |
Configuration for AutoML Regression Job. Initialize a new AutoML Regression task. |
SearchSpace |
SearchSpace class for AutoML verticals. |
StackEnsembleSettings |
Advanced settings to customize a StackEnsemble run. |
TabularFeaturizationSettings |
Featurization settings for an AutoML Job. |
TabularLimitSettings |
Limit settings for AutoML Table Verticals. |
TextClassificationJob |
Configuration for AutoML Text Classification Job. |
TextClassificationMultilabelJob |
Configuration for AutoML Text Classification Multilabel Job. |
TextNerJob |
Configuration for AutoML Text NER Job. |
TrainingSettings |
TrainingSettings class for Azure Machine Learning. |
Enums
BlockedTransformers |
Enum for all transformers that can be blocked during AutoML featurization. |
ClassificationModels |
Enum for all classification models supported by AutoML. |
ClassificationMultilabelPrimaryMetrics |
Primary metrics for classification multilabel tasks. |
ClassificationPrimaryMetrics |
Primary metrics for classification tasks. |
FeaturizationMode |
Featurization mode - determines data featurization mode. |
ForecastHorizonMode |
Enum to determine forecast horizon selection mode. |
ForecastingModels |
Enum for all forecasting models supported by AutoML. |
ForecastingPrimaryMetrics |
Primary metrics for Forecasting task. |
InstanceSegmentationPrimaryMetrics |
Primary metrics for InstanceSegmentation tasks. |
LearningRateScheduler |
Learning rate scheduler enum. |
LogTrainingMetrics | |
LogValidationLoss | |
NCrossValidationsMode |
Determines how N-Cross validations value is determined. |
ObjectDetectionPrimaryMetrics |
Primary metrics for Image ObjectDetection task. |
RegressionModels |
Enum for all Regression models supported by AutoML. |
RegressionPrimaryMetrics |
Primary metrics for Regression task. |
SamplingAlgorithmType | |
ShortSeriesHandlingConfiguration |
The parameter defining how AutoML should handle short time series. |
StochasticOptimizer |
Stochastic optimizer for image models. |
TargetAggregationFunction |
Target aggregate function. |
TargetLagsMode |
Target lags selection modes. |
TargetRollingWindowSizeMode |
Target rolling windows size mode. |
UseStl |
Configure STL Decomposition of the time-series target column. |
ValidationMetricType |
Metric computation method to use for validation metrics in image tasks. |
Functions
classification
Function to create a ClassificationJob.
A classification job is used to train a model that best predicts the class of a data sample. Various models are trained using the training data. The model with the best performance on the validation data based on the primary metric is selected as the final model.
classification(*, training_data: Input, target_column_name: str, primary_metric: str | None = None, enable_model_explainability: bool | None = None, weight_column_name: str | None = None, validation_data: Input | None = None, validation_data_size: float | None = None, n_cross_validations: str | int | None = None, cv_split_column_names: List[str] | None = None, test_data: Input | None = None, test_data_size: float | None = None, **kwargs) -> ClassificationJob
Keyword-Only Parameters
Name | Description |
---|---|
training_data
|
The training data to be used within the experiment. It should contain both training features and a label column (optionally a sample weights column). |
target_column_name
|
The name of the label column.
This parameter is applicable to |
primary_metric
|
The metric that Automated Machine Learning will optimize for model selection. Automated Machine Learning collects more metrics than it can optimize. For more information on how metrics are calculated, see https://docs.microsoft.com/azure/machine-learning/how-to-configure-auto-train#primary-metric. Acceptable values: accuracy, AUC_weighted, norm_macro_recall, average_precision_score_weighted, and precision_score_weighted Defaults to accuracy |
enable_model_explainability
|
Whether to enable explaining the best AutoML model at the end of all AutoML training iterations. The default is None. For more information, see Interpretability: model explanations in automated machine learning. |
weight_column_name
|
The name of the sample weight column. Automated ML supports a weighted column as an input, causing rows in the data to be weighted up or down. If the input data is from a pandas.DataFrame which doesn't have column names, column indices can be used instead, expressed as integers. This parameter is applicable to |
validation_data
|
The validation data to be used within the experiment. It should contain both training features and label column (optionally a sample weights column). Defaults to None |
validation_data_size
|
What fraction of the data to hold out for validation when user validation data is not specified. This should be between 0.0 and 1.0 non-inclusive. Specify For more information, see Configure data splits and cross-validation in automated machine learning. Defaults to None |
n_cross_validations
|
How many cross validations to perform when user validation data is not specified. Specify For more information, see Configure data splits and cross-validation in automated machine learning. Defaults to None |
cv_split_column_names
|
List of names of the columns that contain custom cross validation splits. Each of the CV split columns represents one CV split, where each row is marked either 1 for training or 0 for validation. Defaults to None |
test_data
|
The Model Test feature using test datasets or test data splits is a feature in Preview state and might change at any time. The test data to be used for a test run that will automatically be started after model training is complete. The test run will get predictions using the best model and will compute metrics given these predictions. If this parameter or the Defaults to None |
test_data_size
|
The Model Test feature using test datasets or test data splits is a feature in Preview state and might change at any time. What fraction of the training data to hold out for test data for a test run that will automatically be started after model training is complete. The test run will get predictions using the best model and will compute metrics given these predictions. This should be between 0.0 and 1.0 non-inclusive.
If For regression based tasks, random sampling is used. For classification tasks, stratified sampling is used. Forecasting does not currently support specifying a test dataset using a train/test split. If this parameter or the Defaults to None |
Returns
Type | Description |
---|---|
A job object that can be submitted to an Azure ML compute for execution. |
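The `cv_split_column_names` description above can be made concrete with a small sketch. This is plain Python, not SDK code: the helper `partition_by_cv_columns`, the sample rows, and the column names `cv1` and `cv2` are all hypothetical. It only illustrates the documented convention that each split column defines one train/validation partition, with 1 marking training rows and 0 marking validation rows.

```python
from typing import Dict, List, Tuple

Row = Dict[str, object]


def partition_by_cv_columns(
    rows: List[Row], cv_split_column_names: List[str]
) -> Dict[str, Tuple[List[Row], List[Row]]]:
    """Return {split column: (training rows, validation rows)}."""
    folds = {}
    for column in cv_split_column_names:
        train = [row for row in rows if row[column] == 1]  # 1 marks training
        valid = [row for row in rows if row[column] == 0]  # 0 marks validation
        folds[column] = (train, valid)
    return folds


# Hypothetical data with two custom CV split columns, "cv1" and "cv2".
rows = [
    {"feature": 0.1, "label": "a", "cv1": 1, "cv2": 0},
    {"feature": 0.4, "label": "b", "cv1": 1, "cv2": 1},
    {"feature": 0.7, "label": "a", "cv1": 0, "cv2": 1},
    {"feature": 0.9, "label": "b", "cv1": 0, "cv2": 1},
]

folds = partition_by_cv_columns(rows, ["cv1", "cv2"])
for name, (train, valid) in folds.items():
    print(name, len(train), "training rows,", len(valid), "validation rows")
```

In a real job the split columns live in the training data itself, and only their names are passed to `cv_split_column_names`.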
forecasting
Function to create a Forecasting job.
A forecasting task is used to predict target values for a future time period based on the historical data. Various models are trained using the training data. The model with the best performance on the validation data based on the primary metric is selected as the final model.
forecasting(*, training_data: Input, target_column_name: str, primary_metric: str | None = None, enable_model_explainability: bool | None = None, weight_column_name: str | None = None, validation_data: Input | None = None, validation_data_size: float | None = None, n_cross_validations: str | int | None = None, cv_split_column_names: List[str] | None = None, test_data: Input | None = None, test_data_size: float | None = None, forecasting_settings: ForecastingSettings | None = None, **kwargs) -> ForecastingJob
Keyword-Only Parameters
Name | Description |
---|---|
training_data
|
The training data to be used within the experiment. It should contain both training features and a label column (optionally a sample weights column). |
target_column_name
|
The name of the label column.
This parameter is applicable to |
primary_metric
|
The metric that Automated Machine Learning will optimize for model selection. Automated Machine Learning collects more metrics than it can optimize. For more information on how metrics are calculated, see https://docs.microsoft.com/azure/machine-learning/how-to-configure-auto-train#primary-metric. Acceptable values: r2_score, normalized_mean_absolute_error, normalized_root_mean_squared_error Defaults to normalized_root_mean_squared_error |
enable_model_explainability
|
Whether to enable explaining the best AutoML model at the end of all AutoML training iterations. The default is None. For more information, see Interpretability: model explanations in automated machine learning. |
weight_column_name
|
The name of the sample weight column. Automated ML supports a weighted column as an input, causing rows in the data to be weighted up or down. If the input data is from a pandas.DataFrame which doesn't have column names, column indices can be used instead, expressed as integers. This parameter is applicable to |
validation_data
|
The validation data to be used within the experiment. It should contain both training features and label column (optionally a sample weights column). Defaults to None |
validation_data_size
|
What fraction of the data to hold out for validation when user validation data is not specified. This should be between 0.0 and 1.0 non-inclusive. Specify For more information, see Configure data splits and cross-validation in automated machine learning. Defaults to None |
n_cross_validations
|
How many cross validations to perform when user validation data is not specified. Specify For more information, see Configure data splits and cross-validation in automated machine learning. Defaults to None |
cv_split_column_names
|
List of names of the columns that contain custom cross validation splits. Each of the CV split columns represents one CV split, where each row is marked either 1 for training or 0 for validation. Defaults to None |
test_data
|
The Model Test feature using test datasets or test data splits is a feature in Preview state and might change at any time. The test data to be used for a test run that will automatically be started after model training is complete. The test run will get predictions using the best model and will compute metrics given these predictions. If this parameter or the Defaults to None |
test_data_size
|
The Model Test feature using test datasets or test data splits is a feature in Preview state and might change at any time. What fraction of the training data to hold out for test data for a test run that will automatically be started after model training is complete. The test run will get predictions using the best model and will compute metrics given these predictions. This should be between 0.0 and 1.0 non-inclusive.
If For regression based tasks, random sampling is used. For classification tasks, stratified sampling is used. Forecasting does not currently support specifying a test dataset using a train/test split. If this parameter or the Defaults to None |
forecasting_settings
|
The settings for the forecasting task |
Returns
Type | Description |
---|---|
A job object that can be submitted to an Azure ML compute for execution. |
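The constraints stated for `validation_data_size` and `test_data_size` (a fraction strictly between 0.0 and 1.0, and no train/test split for forecasting) can be checked before a job is built. The following is a minimal sketch under assumptions: `check_split_fractions` and the task strings are hypothetical names, not part of the SDK, and the service performs its own validation, which may differ.

```python
from typing import Optional


def check_split_fractions(
    task: str,
    validation_data_size: Optional[float] = None,
    test_data_size: Optional[float] = None,
) -> None:
    """Raise ValueError if the split fractions break the documented constraints."""
    for name, value in (
        ("validation_data_size", validation_data_size),
        ("test_data_size", test_data_size),
    ):
        # Both fractions must lie strictly between 0.0 and 1.0.
        if value is not None and not 0.0 < value < 1.0:
            raise ValueError(
                f"{name} must be between 0.0 and 1.0, non-inclusive; got {value}"
            )
    # Forecasting does not support specifying a test dataset via a train/test split.
    if task == "forecasting" and test_data_size is not None:
        raise ValueError("forecasting does not support test_data_size")


# Valid combination: no exception is raised.
check_split_fractions("regression", validation_data_size=0.2, test_data_size=0.1)

# Invalid combination: forecasting with a train/test split.
try:
    check_split_fractions("forecasting", test_data_size=0.1)
except ValueError as err:
    print(err)
```

Running a check like this locally can surface a bad argument before a job is submitted to compute.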
image_classification
Creates an object for AutoML Image multi-class Classification job.
image_classification(*, training_data: Input, target_column_name: str, primary_metric: str | ClassificationPrimaryMetrics | None = None, validation_data: Input | None = None, validation_data_size: float | None = None, **kwargs) -> ImageClassificationJob
Keyword-Only Parameters
Name | Description |
---|---|
training_data
|
<xref:azure.ai.ml.entities.Input>
The training data to be used within the experiment. |
target_column_name
|
The name of the label column.
This parameter is applicable to |
primary_metric
|
The metric that Automated Machine Learning will optimize for model selection. Automated Machine Learning collects more metrics than it can optimize. For more information on how metrics are calculated, see https://docs.microsoft.com/azure/machine-learning/how-to-configure-auto-train#primary-metric. Acceptable values: accuracy, AUC_weighted, norm_macro_recall, average_precision_score_weighted, and precision_score_weighted Defaults to accuracy. |
validation_data
|
Optional[<xref:azure.ai.ml.entities.Input>]
The validation data to be used within the experiment. |
validation_data_size
|
What fraction of the data to hold out for validation when user validation data is not specified. This should be between 0.0 and 1.0 non-inclusive. Specify Defaults to .2 |
Returns
Type | Description |
---|---|
Image classification job object that can be submitted to an Azure ML compute for execution. |
Examples
creating an automl image classification job
from azure.ai.ml import automl, Input
from azure.ai.ml.constants import AssetTypes
from azure.ai.ml.automl import ClassificationPrimaryMetrics

# Multi-class jobs take ClassificationPrimaryMetrics, per the signature above.
image_classification_job = automl.image_classification(
    experiment_name="my_experiment",
    compute="my_compute",
    training_data=Input(type=AssetTypes.MLTABLE, path="./training-mltable-folder"),
    validation_data=Input(type=AssetTypes.MLTABLE, path="./validation-mltable-folder"),
    target_column_name="label",
    primary_metric=ClassificationPrimaryMetrics.ACCURACY,
    tags={"my_custom_tag": "My custom value"},
)
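When `validation_data` is omitted, `validation_data_size` sets the fraction of the data held out for validation (documented default .2 for image tasks). The sketch below shows what such a hold-out looks like, assuming a simple seeded shuffle. The helper `holdout_split` and the file names are hypothetical, and the splitting strategy actually used by the service is not described on this page and may differ.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def holdout_split(
    rows: Sequence[T], validation_data_size: float = 0.2, seed: int = 0
) -> Tuple[List[T], List[T]]:
    """Shuffle rows and hold out a fraction of them for validation."""
    if not 0.0 < validation_data_size < 1.0:
        raise ValueError(
            "validation_data_size must be between 0.0 and 1.0, non-inclusive"
        )
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_valid = round(len(shuffled) * validation_data_size)
    return shuffled[n_valid:], shuffled[:n_valid]


# Hypothetical image file names standing in for the rows of an MLTable.
images = [f"image_{i}.jpg" for i in range(10)]
train, valid = holdout_split(images)
print(len(train), "training rows,", len(valid), "validation rows")
```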
image_classification_multilabel
Creates an object for AutoML Image multi-label Classification job.
image_classification_multilabel(*, training_data: Input, target_column_name: str, primary_metric: str | ClassificationMultilabelPrimaryMetrics | None = None, validation_data: Input | None = None, validation_data_size: float | None = None, **kwargs) -> ImageClassificationMultilabelJob
Keyword-Only Parameters
Name | Description |
---|---|
training_data
|
<xref:azure.ai.ml.entities.Input>
The training data to be used within the experiment. |
target_column_name
|
The name of the label column.
This parameter is applicable to |
primary_metric
|
The metric that Automated Machine Learning will optimize for model selection. Automated Machine Learning collects more metrics than it can optimize. For more information on how metrics are calculated, see https://docs.microsoft.com/azure/machine-learning/how-to-configure-auto-train#primary-metric. Acceptable values: accuracy, AUC_weighted, norm_macro_recall, average_precision_score_weighted, precision_score_weighted, and Iou Defaults to Iou. |
validation_data
|
Optional[<xref:azure.ai.ml.entities.Input>]
The validation data to be used within the experiment. |
validation_data_size
|
The fraction of the training data to hold out for validation when user does not provide the validation data. This should be between 0.0 and 1.0 non-inclusive. Specify Defaults to .2 |
Returns
Type | Description |
---|---|
Image multi-label classification job object that can be submitted to an Azure ML compute for execution. |
Examples
creating an automl image multilabel classification job
from azure.ai.ml import automl, Input
from azure.ai.ml.constants import AssetTypes
from azure.ai.ml.automl import ClassificationMultilabelPrimaryMetrics

image_classification_multilabel_job = automl.image_classification_multilabel(
    experiment_name="my_experiment",
    compute="my_compute",
    training_data=Input(type=AssetTypes.MLTABLE, path="./training-mltable-folder"),
    validation_data=Input(type=AssetTypes.MLTABLE, path="./validation-mltable-folder"),
    target_column_name="label",
    primary_metric=ClassificationMultilabelPrimaryMetrics.IOU,
    tags={"my_custom_tag": "My custom value"},
)
image_instance_segmentation
Creates an object for AutoML Image Instance Segmentation job.
image_instance_segmentation(*, training_data: Input, target_column_name: str, primary_metric: str | InstanceSegmentationPrimaryMetrics | None = None, validation_data: Input | None = None, validation_data_size: float | None = None, **kwargs) -> ImageInstanceSegmentationJob
Keyword-Only Parameters
Name | Description |
---|---|
training_data
|
<xref:azure.ai.ml.entities.Input>
The training data to be used within the experiment. |
target_column_name
|
The name of the label column.
This parameter is applicable to |
primary_metric
|
The metric that Automated Machine Learning will optimize for model selection. Automated Machine Learning collects more metrics than it can optimize. For more information on how metrics are calculated, see https://docs.microsoft.com/azure/machine-learning/how-to-configure-auto-train#primary-metric. Acceptable values: MeanAveragePrecision Defaults to MeanAveragePrecision. |
validation_data
|
Optional[<xref:azure.ai.ml.entities.Input>]
The validation data to be used within the experiment. |
validation_data_size
|
The fraction of the training data to hold out for validation when user does not provide the validation data. This should be between 0.0 and 1.0 non-inclusive. Specify Defaults to .2 |
Returns
Type | Description |
---|---|
Image instance segmentation job |
Examples
creating an automl image instance segmentation job
from azure.ai.ml import automl, Input
from azure.ai.ml.constants import AssetTypes
from azure.ai.ml.automl import InstanceSegmentationPrimaryMetrics

image_instance_segmentation_job = automl.image_instance_segmentation(
    experiment_name="my_experiment",
    compute="my_compute",
    training_data=Input(type=AssetTypes.MLTABLE, path="./training-mltable-folder"),
    validation_data=Input(type=AssetTypes.MLTABLE, path="./validation-mltable-folder"),
    target_column_name="label",
    primary_metric=InstanceSegmentationPrimaryMetrics.MEAN_AVERAGE_PRECISION,
    tags={"my_custom_tag": "My custom value"},
)
image_object_detection
Creates an object for AutoML Image Object Detection job.
image_object_detection(*, training_data: Input, target_column_name: str, primary_metric: str | ObjectDetectionPrimaryMetrics | None = None, validation_data: Input | None = None, validation_data_size: float | None = None, **kwargs) -> ImageObjectDetectionJob
Keyword-Only Parameters
Name | Description |
---|---|
training_data
|
<xref:azure.ai.ml.entities.Input>
The training data to be used within the experiment. |
target_column_name
|
The name of the label column.
This parameter is applicable to |
primary_metric
|
The metric that Automated Machine Learning will optimize for model selection. Automated Machine Learning collects more metrics than it can optimize. For more information on how metrics are calculated, see https://docs.microsoft.com/azure/machine-learning/how-to-configure-auto-train#primary-metric. Acceptable values: MeanAveragePrecision Defaults to MeanAveragePrecision. |
validation_data
|
Optional[<xref:azure.ai.ml.entities.Input>]
The validation data to be used within the experiment. |
validation_data_size
|
The fraction of the training data to hold out for validation when user does not provide the validation data. This should be between 0.0 and 1.0 non-inclusive. Specify Defaults to .2 |
Returns
Type | Description |
---|---|
Image object detection job object that can be submitted to an Azure ML compute for execution. |
Examples
creating an automl image object detection job
from azure.ai.ml import automl, Input
from azure.ai.ml.constants import AssetTypes
from azure.ai.ml.automl import ObjectDetectionPrimaryMetrics

image_object_detection_job = automl.image_object_detection(
    experiment_name="my_experiment",
    compute="my_compute",
    training_data=Input(type=AssetTypes.MLTABLE, path="./training-mltable-folder"),
    validation_data=Input(type=AssetTypes.MLTABLE, path="./validation-mltable-folder"),
    target_column_name="label",
    primary_metric=ObjectDetectionPrimaryMetrics.MEAN_AVERAGE_PRECISION,
    tags={"my_custom_tag": "My custom value"},
)
regression
Function to create a Regression Job.
A regression job is used to train a model to predict continuous values of a target variable from a dataset. Various models are trained using the training data. The model with the best performance on the validation data based on the primary metric is selected as the final model.
regression(*, training_data: Input, target_column_name: str, primary_metric: str | None = None, enable_model_explainability: bool | None = None, weight_column_name: str | None = None, validation_data: Input | None = None, validation_data_size: float | None = None, n_cross_validations: str | int | None = None, cv_split_column_names: List[str] | None = None, test_data: Input | None = None, test_data_size: float | None = None, **kwargs) -> RegressionJob
Keyword-Only Parameters
Name | Description |
---|---|
training_data
|
The training data to be used within the experiment. It should contain both training features and a label column (optionally a sample weights column). |
target_column_name
|
The name of the label column.
This parameter is applicable to |
primary_metric
|
The metric that Automated Machine Learning will optimize for model selection. Automated Machine Learning collects more metrics than it can optimize. For more information on how metrics are calculated, see https://docs.microsoft.com/azure/machine-learning/how-to-configure-auto-train#primary-metric. Acceptable values: spearman_correlation, r2_score, normalized_mean_absolute_error, normalized_root_mean_squared_error. Defaults to normalized_root_mean_squared_error |
enable_model_explainability
|
Whether to enable explaining the best AutoML model at the end of all AutoML training iterations. The default is None. For more information, see Interpretability: model explanations in automated machine learning. |
weight_column_name
|
The name of the sample weight column. Automated ML supports a weighted column as an input, causing rows in the data to be weighted up or down. If the input data is from a pandas.DataFrame which doesn't have column names, column indices can be used instead, expressed as integers. This parameter is applicable to |
validation_data
|
The validation data to be used within the experiment. It should contain both training features and label column (optionally a sample weights column). Defaults to None |
validation_data_size
|
What fraction of the data to hold out for validation when user validation data is not specified. This should be between 0.0 and 1.0 non-inclusive. Specify For more information, see Configure data splits and cross-validation in automated machine learning. Defaults to None |
n_cross_validations
|
How many cross validations to perform when user validation data is not specified. Specify For more information, see Configure data splits and cross-validation in automated machine learning. Defaults to None |
cv_split_column_names
|
List of names of the columns that contain custom cross validation splits. Each of the CV split columns represents one CV split, where each row is marked either 1 for training or 0 for validation. Defaults to None |
test_data
|
The Model Test feature using test datasets or test data splits is a feature in Preview state and might change at any time. The test data to be used for a test run that will automatically be started after model training is complete. The test run will get predictions using the best model and will compute metrics given these predictions. If this parameter or the Defaults to None |
test_data_size
|
The Model Test feature using test datasets or test data splits is a feature in Preview state and might change at any time. What fraction of the training data to hold out for test data for a test run that will automatically be started after model training is complete. The test run will get predictions using the best model and will compute metrics given these predictions. This should be between 0.0 and 1.0 non-inclusive.
If For regression based tasks, random sampling is used. For classification tasks, stratified sampling is used. Forecasting does not currently support specifying a test dataset using a train/test split. If this parameter or the Defaults to None |
Returns
Type | Description |
---|---|
A job object that can be submitted to an Azure ML compute for execution. |
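The `primary_metric` descriptions for `classification`, `forecasting`, and `regression` each list acceptable values and a default. A small lookup, sketched below, collects them in one place. The table is copied from the parameter descriptions on this page; the helper `resolve_primary_metric`, the task keys, and the case-sensitive matching are assumptions for illustration and not SDK behavior.

```python
from typing import Optional

# Acceptable values and defaults as listed in the parameter tables on this page.
PRIMARY_METRICS = {
    "classification": (
        [
            "accuracy",
            "AUC_weighted",
            "norm_macro_recall",
            "average_precision_score_weighted",
            "precision_score_weighted",
        ],
        "accuracy",
    ),
    "forecasting": (
        [
            "r2_score",
            "normalized_mean_absolute_error",
            "normalized_root_mean_squared_error",
        ],
        "normalized_root_mean_squared_error",
    ),
    "regression": (
        [
            "spearman_correlation",
            "r2_score",
            "normalized_mean_absolute_error",
            "normalized_root_mean_squared_error",
        ],
        "normalized_root_mean_squared_error",
    ),
}


def resolve_primary_metric(task: str, primary_metric: Optional[str] = None) -> str:
    """Return the metric to optimize, falling back to the documented default."""
    acceptable, default = PRIMARY_METRICS[task]
    if primary_metric is None:
        return default
    if primary_metric not in acceptable:  # assumption: exact, case-sensitive match
        raise ValueError(
            f"{primary_metric!r} is not listed for {task}; expected one of {acceptable}"
        )
    return primary_metric


print(resolve_primary_metric("regression"))
print(resolve_primary_metric("classification", "AUC_weighted"))
```

The resolved string can then be passed as the `primary_metric` argument of the matching factory function.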
text_classification
Function to create a TextClassificationJob.
A text classification job is used to train a model that can predict the class or category of text data. Input training data should include a target column that classifies the text into exactly one class.
text_classification(*, training_data: Input, target_column_name: str, validation_data: Input, primary_metric: str | None = None, log_verbosity: str | None = None, **kwargs) -> TextClassificationJob
Keyword-Only Parameters
Name | Description |
---|---|
training_data
|
The training data to be used within the experiment. It should contain both training features and a target column. |
target_column_name
|
Name of the target column. |
validation_data
|
The validation data to be used within the experiment. It should contain both training features and a target column. |
primary_metric
|
Primary metric for the task. Acceptable values: accuracy, AUC_weighted, precision_score_weighted |
log_verbosity
|
Log verbosity level. |
Returns
Type | Description |
---|---|
The TextClassificationJob object. |
Examples
creating an automl text classification job
from azure.ai.ml import automl, Input
from azure.ai.ml.constants import AssetTypes

text_classification_job = automl.text_classification(
    experiment_name="my_experiment",
    compute="my_compute",
    training_data=Input(type=AssetTypes.MLTABLE, path="./training-mltable-folder"),
    validation_data=Input(type=AssetTypes.MLTABLE, path="./validation-mltable-folder"),
    target_column_name="Sentiment",
    primary_metric="accuracy",
    tags={"my_custom_tag": "My custom value"},
)
text_classification_multilabel
Function to create a TextClassificationMultilabelJob.
A text classification multilabel job is used to train a model that can predict the classes or categories of text data. Input training data should include a target column that classifies the text into one or more classes. For more information on the format of multilabel data, refer to: https://docs.microsoft.com/en-us/azure/machine-learning/how-to-auto-train-nlp-models#multi-label
text_classification_multilabel(*, training_data: Input, target_column_name: str, validation_data: Input, primary_metric: str | None = None, log_verbosity: str | None = None, **kwargs) -> TextClassificationMultilabelJob
Keyword-Only Parameters
Name | Description |
---|---|
training_data
|
The training data to be used within the experiment. It should contain both training features and a target column. |
target_column_name
|
Name of the target column. |
validation_data
|
The validation data to be used within the experiment. It should contain both training features and a target column. |
primary_metric
|
Primary metric for the task. Acceptable values: accuracy |
log_verbosity
|
Log verbosity level. |
Returns
Type | Description |
---|---|
The TextClassificationMultilabelJob object. |
Examples
creating an automl text multilabel classification job
from azure.ai.ml import automl, Input
from azure.ai.ml.constants import AssetTypes

text_classification_multilabel_job = automl.text_classification_multilabel(
    experiment_name="my_experiment",
    compute="my_compute",
    training_data=Input(type=AssetTypes.MLTABLE, path="./training-mltable-folder"),
    validation_data=Input(type=AssetTypes.MLTABLE, path="./validation-mltable-folder"),
    target_column_name="terms",
    primary_metric="accuracy",
    tags={"my_custom_tag": "My custom value"},
)
text_ner
Function to create a TextNerJob.
A text named entity recognition job is used to train a model that can predict the named entities in the text. Input training data should be a text file in CoNLL format. For more information on format of text NER data, refer to: https://docs.microsoft.com/en-us/azure/machine-learning/how-to-auto-train-nlp-models#named-entity-recognition-ner
text_ner(*, training_data: Input, validation_data: Input, primary_metric: str | None = None, log_verbosity: str | None = None, **kwargs) -> TextNerJob
Keyword-Only Parameters
Name | Description |
---|---|
training_data
|
The training data to be used within the experiment. It should contain both training features and a target column. |
validation_data
|
The validation data to be used within the experiment. It should contain both training features and a target column. |
primary_metric
|
Primary metric for the task. Acceptable values: accuracy |
log_verbosity
|
Log verbosity level. |
Returns
Type | Description |
---|---|
The TextNerJob object. |
Examples
creating an automl text ner job
from azure.ai.ml import automl, Input
from azure.ai.ml.constants import AssetTypes

text_ner_job = automl.text_ner(
    experiment_name="my_experiment",
    compute="my_compute",
    training_data=Input(type=AssetTypes.MLTABLE, path="./training-mltable-folder"),
    validation_data=Input(type=AssetTypes.MLTABLE, path="./validation-mltable-folder"),
    tags={"my_custom_tag": "My custom value"},
)
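The `text_ner` description says the training data should be a text file in CoNLL format. As a rough illustration of that layout, the sketch below parses a CoNLL-style string. It assumes one token and one label per line separated by whitespace, with blank lines between sentences; the helper `parse_conll`, the sample text, and the label names are hypothetical, so check the linked NER page for the exact rules the service enforces.

```python
from typing import List, Tuple

Sentence = List[Tuple[str, str]]


def parse_conll(text: str) -> List[Sentence]:
    """Split CoNLL-style text into sentences of (token, label) pairs."""
    sentences: List[Sentence] = []
    current: Sentence = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            # A blank line closes the current sentence.
            if current:
                sentences.append(current)
                current = []
            continue
        # The label is the last whitespace-separated field on the line.
        token, label = line.rsplit(maxsplit=1)
        current.append((token, label))
    if current:
        sentences.append(current)
    return sentences


# Hypothetical sample with two sentences and made-up label names.
sample = """\
Hudson B-loc
Square I-loc
is O
famous O

Stephen B-per
Curry I-per
plays O
"""

sentences = parse_conll(sample)
for sentence in sentences:
    print(sentence)
```

A quick parse like this can catch malformed lines, such as a token without a label, before the file is used as training data.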