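Set up AutoML to train a time-series forecasting model with SDK and CLI

Article
10/16/2024

Applies to: Azure CLI ml extension v2 (current), Python SDK azure-ai-ml v2 (current)

Automated machine learning (AutoML) in Azure Machine Learning uses standard machine learning models along with well-known time-series models to create forecasts. This approach incorporates historical information about the target variable, user-provided features in the input data, and automatically engineered features. Model search algorithms help identify the models with the best predictive accuracy. For more information, see forecasting methodology and model sweeping and selection.

This article describes how to set up AutoML for time-series forecasting by using the Azure Machine Learning Python SDK (/python/api/overview/azure/ai-ml-readme). The process includes preparing data for training and configuring time-series parameters in a forecasting job (class reference). You then train, inference, and evaluate models by using components and pipelines.

For a low-code experience, see Tutorial: Forecast demand with automated machine learning. This resource is a time-series forecasting example that uses AutoML in Azure Machine Learning studio.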

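Prerequisites

An Azure Machine Learning workspace. To create a workspace, see Create workspace resources.
The ability to launch AutoML training jobs. For more information, see Set up AutoML training for tabular data with the Azure Machine Learning CLI and Python SDK.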

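Prepare training and validation data

Input data for AutoML forecasting must contain a valid time series in tabular format. Each variable must have its own column in the data table. AutoML requires at least two columns: a time column that represents the time axis, and a target column that holds the quantity to forecast. Other columns can serve as predictors. For more information, see How AutoML uses your data.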
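The column requirements above can be checked before you submit a job. The following is a minimal sketch that uses only the Python standard library. The helper name `check_forecasting_table` and the sample column names are hypothetical, and the check assumes a single series with ISO 8601 timestamps. AutoML performs its own, more thorough validation.

```python
import csv
import io
from datetime import datetime


def check_forecasting_table(csv_text, time_column, target_column):
    """Return a list of problems found in a tabular time series (hypothetical helper)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    columns = reader.fieldnames or []
    problems = [
        f"missing column: {name}"
        for name in (time_column, target_column)
        if name not in columns
    ]
    if problems:
        return problems

    seen = set()
    for line_number, row in enumerate(reader, start=2):  # the header is line 1
        try:
            stamp = datetime.fromisoformat(row[time_column])
        except (TypeError, ValueError):
            problems.append(f"line {line_number}: unparseable time value")
            continue
        if stamp in seen:
            problems.append(
                f"line {line_number}: duplicate timestamp {stamp.isoformat()}"
            )
        seen.add(stamp)
    return problems


# Made-up sample data: a time column, a target column, and one predictor
sample = "date,demand,temp\n2017-09-08T01:00:00,10,20\n2017-09-08T02:00:00,12,21\n"
print(check_forecasting_table(sample, "date", "demand"))  # → []
```

An empty list means that both required columns exist and that no timestamp repeats within the single series assumed here.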

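Important

When you train a model to forecast future values, make sure that all the features used in training are also available when you run predictions for your intended horizon.

Consider a feature for the current stock price, which can greatly increase training accuracy. If you forecast with a long horizon, you might not be able to accurately predict the future stock values that correspond to future time-series points. This approach can reduce model accuracy.

AutoML forecasting jobs require that your training data be represented as an MLTable object. An MLTable object specifies a data source and the steps for loading the data. For more information and use cases, see Working with tables (how-to-mltable.md).

In the following example, you assume your training data is contained in a CSV file in a local directory, ./train_data/timeseries_train.csv.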

Python SDK
Azure CLI

You can use the mltable Python SDK to create an MLTable object, as shown in the following example:

import mltable

paths = [
    {'file': './train_data/timeseries_train.csv'}
]

train_table = mltable.from_delimited_files(paths)
train_table.save('./train_data')

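This code creates a new file, ./train_data/MLTable, which contains the file format and loading instructions.

To start the training job, use the Python SDK to define an input data object as follows: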

from azure.ai.ml.constants import AssetTypes
from azure.ai.ml import Input

# Training MLTable defined locally, with local data to be uploaded
my_training_data_input = Input(
    type=AssetTypes.MLTABLE, path="./train_data"
)

You can copy the following YAML snippet into a new file, ./train_data/MLTable, to define a new MLTable object:

$schema: https://azuremlschemas.azureedge.net/latest/MLTable.schema.json

type: mltable
paths:
    - file: ./timeseries_train.csv

transformations:
    - read_delimited:
        delimiter: ','
        encoding: ascii

Start building the YAML configuration for the AutoML job with the specified training data, as shown in the following example:

$schema: https://azuremlsdk2.blob.core.windows.net/preview/0.0.1/autoMLJob.schema.json
type: automl

experiment_name: cli-v2-automl-forecasting-job
description: A time-series forecasting AutoML job
task: forecasting

# Training data MLTable for the AutoML job
training_data:
    path: "./train_data"
    type: mltable

validation_data:
    # Optional validation data

compute: # Compute for training job
primary_metric: # Primary metric  

target_column_name: # Target column name
n_cross_validations: # Cross validation setting

limits:
    # Limit settings

forecasting:
    # Forecasting specific settings

training:
    # Training settings

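In subsequent sections of this article, you add more detail to this configuration. In this example, the configuration is located at ./automl-forecasting-job.yml.

You can specify validation data in a similar way. Create an MLTable object and specify a validation data input. Alternatively, if you don't supply validation data, AutoML automatically creates cross-validation splits from your training data to use for model selection.

Create compute to run the experiment

AutoML uses Azure Machine Learning Compute, a fully managed compute resource, to run the training job. The following example creates a compute cluster named cpu-compute.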

Python SDK
Azure CLI

from azure.ai.ml.entities import AmlCompute

# Specify the AmlCompute cluster name (matches the name used in the job configuration)
cpu_compute_target = "cpu-compute"

try:
    ml_client.compute.get(cpu_compute_target)
except Exception:
    print("Creating a new cpu compute target...")
    compute = AmlCompute(
        name=cpu_compute_target, size="STANDARD_D2_V2", min_instances=0, max_instances=4
    )
    ml_client.compute.begin_create_or_update(compute).result()

You can use the following Azure CLI command to create a new compute named cpu-compute:

az ml compute create -n cpu-compute --type amlcompute --min-instances 0 --max-instances 4

Reference the compute in the job definition as follows:

$schema: https://azuremlsdk2.blob.core.windows.net/preview/0.0.1/autoMLJob.schema.json
type: automl

experiment_name: cli-v2-automl-forecasting-job
description: A time-series forecasting AutoML job
task: forecasting

# Set training data MLTable for the AutoML job
training_data:
    path: "./train_data"
    type: mltable

# Set compute for the training job to use 
compute: azureml:cpu-compute

primary_metric: # Primary metric  

target_column_name: # Target column name
n_cross_validations: # Cross validation setting

limits:
    # Limit settings

forecasting:
    # Forecasting specific settings

training:
    # Training settings

Configure the experiment

The following example shows how to configure the experiment.

Python SDK
Azure CLI

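You can use the AutoML factory functions to configure forecasting jobs in the Python SDK. The following example shows how to create a forecasting job by setting the primary metric and setting limits on the training run: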

from azure.ai.ml import automl

# Set forecasting variables
# As needed, modify the variable values to run the snippet successfully
forecasting_job = automl.forecasting(
    compute="cpu-compute",
    experiment_name="sdk-v2-automl-forecasting-job",
    training_data=my_training_data_input,
    target_column_name=target_column_name,
    primary_metric="normalized_root_mean_squared_error",
    n_cross_validations="auto",
)

# Set optional limits
forecasting_job.set_limits(
    timeout_minutes=120,
    trial_timeout_minutes=30,
    max_concurrent_trials=4,
)

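Configure general properties of the AutoML job, including:

Primary metric
Name of the target column in the training data
Cross-validation scheme
Resource limits for the job

For more information, see the forecasting command job YAML schema, training parameters, and limits.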

$schema: https://azuremlsdk2.blob.core.windows.net/preview/0.0.1/autoMLJob.schema.json
type: automl

experiment_name: cli-v2-automl-forecasting-job
description: A time-series forecasting AutoML job
task: forecasting

training_data:
    path: "./train_data"
    type: mltable

compute: azureml:cpu-compute

# Settings for primary metric, target/label column name, cross validation
primary_metric: normalized_root_mean_squared_error
target_column_name: <target_column_name>
n_cross_validations: auto

# Settings for training job limits on time, concurrency, and others
limits:
    timeout_minutes: 120
    trial_timeout_minutes: 30
    max_concurrent_trials: 4

forecasting:
    # Forecasting specific settings

training:
    # Training settings

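Forecasting job settings

Forecasting tasks have many settings that are specific to forecasting. The most basic of these settings are the name of the time column in the training data and the forecast horizon.

Python SDK
Azure CLI

Use the ForecastingJob methods to configure these settings: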

# Forecasting specific configuration
forecasting_job.set_forecast_settings(
    time_column_name=time_column_name,
    forecast_horizon=24
)

These settings are configured in the forecasting section of the job YAML configuration:

$schema: https://azuremlsdk2.blob.core.windows.net/preview/0.0.1/autoMLJob.schema.json
type: automl

experiment_name: cli-v2-automl-forecasting-job
description: A time-series forecasting AutoML job
task: forecasting

training_data:
    path: "./train_data"
    type: mltable

compute: azureml:cpu-compute

primary_metric: normalized_root_mean_squared_error
target_column_name: <target_column_name>
n_cross_validations: auto

limits:
    timeout_minutes: 120
    trial_timeout_minutes: 30
    max_concurrent_trials: 4

# Forecasting specific settings
# Set the horizon to 24 for this example, the horizon generally depends on the business scenario
forecasting:
    time_column_name: <time_column_name>
    forecast_horizon: 24

training:
    # Training settings

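The time column name is a required setting. You should generally set the forecast horizon according to your prediction scenario. If your data contains multiple time series, you can specify the names of the time series ID columns. When these columns are grouped, they define the individual series. For example, suppose you have data that consists of hourly sales from different stores and brands. The following sample shows how to set the time series ID columns, assuming the data contains columns named store and brand: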

Python SDK
Azure CLI

# Forecasting specific configuration
# Add time series IDs for store and brand
forecasting_job.set_forecast_settings(
    ...,  # Other settings
    time_series_id_column_names=['store', 'brand']
)

# Forecasting specific settings
# Add time series IDs for store and brand
forecasting:
    # Other settings
    time_series_id_column_names: ["store", "brand"]

If no time series ID columns are specified, AutoML tries to automatically detect them in your data.
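To make the grouping concrete, the following sketch counts how many distinct series a set of ID columns defines. It uses only the standard library. The helper name `count_series` and the sample rows are made up for illustration and aren't part of the AutoML SDK.

```python
from collections import Counter


def count_series(rows, time_series_id_column_names):
    """Count observations per series, keyed by the tuple of ID column values."""
    return Counter(
        tuple(row[name] for name in time_series_id_column_names) for row in rows
    )


# Made-up hourly sales rows for illustration only
rows = [
    {"store": "A", "brand": "x", "hour": 1, "sales": 3},
    {"store": "A", "brand": "x", "hour": 2, "sales": 4},
    {"store": "A", "brand": "y", "hour": 1, "sales": 1},
    {"store": "B", "brand": "x", "hour": 1, "sales": 7},
]
series = count_series(rows, ["store", "brand"])
print(len(series))         # → 3
print(series[("A", "x")])  # → 2
```

Each distinct (store, brand) pair is one time series, so these four rows contain three series.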

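Other settings are optional and reviewed in the following section.

Optional forecasting job settings

More optional configurations are available for forecasting tasks, such as enabling deep learning and specifying a target rolling window aggregation. A complete list of parameters is available in the reference documentation.

Model search settings

Two optional settings control the model space where AutoML searches for the best model: allowed_training_algorithms and blocked_training_algorithms. To restrict the search space to a given set of model classes, use the allowed_training_algorithms parameter, as shown in the following example: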

Python SDK
Azure CLI

# Only search ExponentialSmoothing and ElasticNet models
forecasting_job.set_training(
    allowed_training_algorithms=["ExponentialSmoothing", "ElasticNet"]
)

$schema: https://azuremlsdk2.blob.core.windows.net/preview/0.0.1/autoMLJob.schema.json
type: automl

experiment_name: cli-v2-automl-forecasting-job
description: A time-series forecasting AutoML job
task: forecasting

training_data:
    path: "./train_data"
    type: mltable

compute: azureml:cpu-compute

primary_metric: normalized_root_mean_squared_error
target_column_name: <target_column_name>
n_cross_validations: auto

limits:
    timeout_minutes: 120
    trial_timeout_minutes: 30
    max_concurrent_trials: 4

forecasting:
    time_column_name: <time_column_name>
    forecast_horizon: 24

# Training settings
# Only search ExponentialSmoothing and ElasticNet models
training:
    allowed_training_algorithms: ["ExponentialSmoothing", "ElasticNet"]
    # Other training settings

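In this scenario, the forecasting job searches only over the Exponential Smoothing and Elastic Net model classes. To remove a given set of model classes from the search space, use blocked_training_algorithms, as shown in the following example: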

Python SDK
Azure CLI

# Search over all model classes except Prophet
forecasting_job.set_training(
    blocked_training_algorithms=["Prophet"]
)

# Training settings
# Search over all model classes except Prophet
training:
    blocked_training_algorithms: ["Prophet"]
    # other training settings

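The job searches over all model classes except Prophet. For a list of forecasting model names that are accepted in allowed_training_algorithms and blocked_training_algorithms, see training properties. You can apply either allowed_training_algorithms or blocked_training_algorithms to a training run, but not both.

Enable learning for deep neural networks

AutoML ships with a custom deep neural network (DNN) model named TCNForecaster. This model is a temporal convolutional network (TCN) that applies common imaging task methods to time-series modeling. One-dimensional "causal" convolutions form the backbone of the network and enable the model to learn complex patterns over long durations in the training history. For more information, see Introduction to TCNForecaster.

The TCNForecaster often achieves higher accuracy than standard time-series models when there are thousands or more observations in the training history. However, it also takes longer to train and sweep over TCNForecaster models because of their higher capacity.

You can enable the TCNForecaster in AutoML by setting the enable_dnn_training flag in the training configuration, as follows: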

Python SDK
Azure CLI

# Include TCNForecaster models in the model search
forecasting_job.set_training(
    enable_dnn_training=True
)

# Training settings
# Include TCNForecaster models in the model search
training:
    enable_dnn_training: true
    # Other training settings

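By default, TCNForecaster training is limited to a single compute node and a single GPU, if available, per model trial. For large data scenarios, the recommendation is to distribute each TCNForecaster trial over multiple cores/GPUs and nodes. For more information and code samples, see distributed training.

To enable DNN for an AutoML experiment created in Azure Machine Learning studio, see the task type settings in the studio UI how-to article.

Note

When you enable DNN for experiments created with the SDK, best model explanations are disabled.
DNN support for forecasting in automated machine learning isn't supported for runs initiated in Azure Databricks.
The recommended approach is to use GPU compute types when DNN training is enabled.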

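Lag and rolling window features

Recent values of the target are often impactful features in a forecasting model. Accordingly, AutoML can create time-lagged and rolling window aggregation features to potentially improve model accuracy.

Consider an energy demand forecasting scenario where weather data and historical demand are available. When window aggregation is applied over the most recent three hours, feature engineering generates columns for the minimum, maximum, and sum over a sliding three-hour window, based on the defined settings. For instance, for the observation valid on September 8, 2017, 4:00 AM, the maximum, minimum, and sum values are calculated from the demand values for September 8, 2017, 1:00 AM through 3:00 AM. This three-hour window shifts along to populate data for the remaining rows. For more information and examples, see Lag features for time-series forecasting in AutoML.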
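The three-hour window described above can be sketched in plain Python. This is an illustration of the idea only, not AutoML's featurization code. The function name `rolling_window_features` and the demand values are made up.

```python
def rolling_window_features(values, window):
    """For each position, aggregate the `window` values that precede it."""
    rows = []
    for i, current in enumerate(values):
        if i < window:
            # Not enough history yet, so the features are left empty
            rows.append({"value": current, "min": None, "max": None, "sum": None})
            continue
        history = values[i - window:i]  # excludes the current observation
        rows.append(
            {
                "value": current,
                "min": min(history),
                "max": max(history),
                "sum": sum(history),
            }
        )
    return rows


# Made-up hourly demand for 1:00 AM through 5:00 AM
demand = [3, 5, 4, 6, 2]
features = rolling_window_features(demand, window=3)
print(features[3])  # → {'value': 6, 'min': 3, 'max': 5, 'sum': 12}
```

The row at index 3 plays the role of the 4:00 AM observation: its features come from the three preceding hours, and the current value itself never enters its own window.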

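You can enable lag and rolling window aggregation features for the target by setting the rolling window size and the lag orders you want to create. The window size was three in the previous example. You can also enable lags for features with the feature_lags setting. In the following example, all of these settings are set to auto to instruct AutoML to determine the settings automatically by analyzing the correlation structure of your data: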

Python SDK
Azure CLI

forecasting_job.set_forecast_settings(
    ...,  # Other settings
    target_lags='auto', 
    target_rolling_window_size='auto',
    feature_lags='auto'
)

# Forecasting specific settings
# Auto configure lags and rolling window features
forecasting:
    target_lags: auto
    target_rolling_window_size: auto
    feature_lags: auto
    # Other settings

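Short series handling

AutoML considers a time series to be a short series if there aren't enough data points to conduct the train and validation phases of model development. For more information, see training data length requirements.

AutoML has several actions it can take for short series. These actions are configurable with the short_series_handling_config setting. The default value is auto. The following table describes the settings:

Setting	Description	Notes
`auto`	The default value for short series handling.	- If all series are short, pad the data. - If not all series are short, drop the short series.
`pad`	If the `short_series_handling_config = pad` setting is used, AutoML adds random values to each short series found. AutoML pads the target column with white noise.	You can use the following column types with the specified padding: - Object columns, pad with `NaN`s - Numeric columns, pad with 0 (zero) - Boolean/logic columns, pad with `False`
`drop`	If the `short_series_handling_config = drop` setting is used, AutoML drops the short series, and they aren't used for training or prediction.	Predictions for these series return `NaN`.
`None`	No series is padded or dropped.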
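The `auto` rule in the table can be read as a simple decision. The sketch below mimics that rule on in-memory lists. It's a simplification under stated assumptions: the helper name `handle_short_series` is hypothetical, padding is prepended, and a constant is used where AutoML pads the target with white noise.

```python
def handle_short_series(series_by_id, min_length, pad_value=0.0):
    """Mimic the documented `auto` rule: pad if every series is short, else drop short ones."""
    short_ids = {key for key, values in series_by_id.items() if len(values) < min_length}
    if len(short_ids) == len(series_by_id):
        # Every series is short: pad each one up to the minimum length
        return {
            key: [pad_value] * (min_length - len(values)) + list(values)
            for key, values in series_by_id.items()
        }
    # At least one series is long enough: drop only the short ones
    return {
        key: list(values)
        for key, values in series_by_id.items()
        if key not in short_ids
    }


mixed = {"a": [1, 2, 3, 4], "b": [5]}
print(handle_short_series(mixed, min_length=3))      # → {'a': [1, 2, 3, 4]}

all_short = {"a": [1], "b": [2, 3]}
print(handle_short_series(all_short, min_length=3))  # → {'a': [0.0, 0.0, 1], 'b': [0.0, 2, 3]}
```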

The following example sets the short series handling so that all short series are padded to the minimum length:

Python SDK
Azure CLI

forecasting_job.set_forecast_settings(
    ...,  # Other settings
    short_series_handling_config='pad'
)

# Forecasting specific settings
# Pad short series to the minimum required length
forecasting:
    short_series_handling_config: pad
    # Other settings

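Warning

Padding can impact the accuracy of the resulting model because it introduces artificial data to avoid training failures. If many of the series are short, you might also see some impact in explainability results.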

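Frequency and target data aggregation

Use the frequency and data aggregation options to avoid failures caused by irregular data. Your data is irregular if it doesn't follow a set cadence in time, such as hourly or daily. Point-of-sales data is a good example of irregular data. In these scenarios, AutoML can aggregate your data to a desired frequency and then build a forecasting model from the aggregates.

You need to set the frequency and target_aggregate_function settings to handle irregular data. The frequency setting accepts Pandas DateOffset strings as input. The following table shows supported values for the aggregation function:

Function	Description
`sum`	Sum of the target values
`mean`	Mean (average) of the target values
`min`	Minimum value of the target
`max`	Maximum value of the target

AutoML applies aggregation for the following columns:

Column	Aggregation method
Numeric predictors	AutoML uses the `sum`, `mean`, `min`, and `max` functions. It generates new columns, where each column name includes a suffix that identifies the name of the aggregation function applied to the column values.
Categorical predictors	AutoML uses the value of the `forecast_mode` parameter to aggregate the data. It's the most prominent category in the window. For more information, see the descriptions of the parameter in the Many models pipeline and HTS pipeline sections.
Date predictors	AutoML uses the minimum target value (`min`), maximum target value (`max`), and `forecast_mode` parameter settings to aggregate the data.
Target	AutoML aggregates the values according to the specified operation. Typically, the `sum` function is appropriate for most scenarios.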
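As a rough picture of what aggregating the target to a regular frequency means, the following sketch buckets irregular timestamps into hours and applies one of the supported functions. It uses only the standard library. The function name `aggregate_hourly` and the sample records are hypothetical, and AutoML's own aggregation handles more cases, such as predictor columns.

```python
from collections import defaultdict
from datetime import datetime

# The four target aggregation functions listed in the table above
AGGREGATORS = {
    "sum": sum,
    "mean": lambda values: sum(values) / len(values),
    "min": min,
    "max": max,
}


def aggregate_hourly(records, function="sum"):
    """Bucket (ISO timestamp, value) pairs by hour and aggregate each bucket."""
    buckets = defaultdict(list)
    for stamp, value in records:
        hour = datetime.fromisoformat(stamp).replace(minute=0, second=0, microsecond=0)
        buckets[hour].append(value)
    aggregate = AGGREGATORS[function]
    return {
        hour.isoformat(): aggregate(values)
        for hour, values in sorted(buckets.items())
    }


# Made-up point-of-sale records with irregular timestamps
sales = [
    ("2024-01-01T09:05:00", 2.0),
    ("2024-01-01T09:40:00", 3.0),
    ("2024-01-01T11:15:00", 1.0),
]
print(aggregate_hourly(sales, "sum"))
# → {'2024-01-01T09:00:00': 5.0, '2024-01-01T11:00:00': 1.0}
```

This sketch doesn't fill the empty 10:00 bucket. Handling gaps is a separate step that isn't shown here.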

The following example sets the frequency to hourly and the aggregation function to summation:

Python SDK
Azure CLI

# Aggregate the data to hourly frequency
forecasting_job.set_forecast_settings(
    ...,  # Other settings
    frequency='H',
    target_aggregate_function='sum'
)

# Forecasting specific settings
# Aggregate the data to hourly frequency by summing the target
forecasting:
    frequency: H
    target_aggregate_function: sum
    # Other settings

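Custom cross-validation settings

Two customizable settings control cross-validation for forecasting jobs. Use the n_cross_validations parameter to customize the number of folds, and set the cv_step_size parameter to define the time offset between folds. For more information, see forecasting model selection.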
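To see how the number of folds and the step size interact, here's a minimal sketch of rolling-origin folds expressed as index ranges. It assumes folds are counted back from the end of the training data and that each validation window is one forecast horizon long, which is a common arrangement. AutoML's exact fold placement might differ, and the function name `rolling_origin_folds` is hypothetical.

```python
def rolling_origin_folds(n_observations, n_folds, step_size, horizon):
    """Return (train_range, validation_range) index pairs, oldest fold first."""
    folds = []
    for k in range(n_folds):
        valid_end = n_observations - k * step_size  # each fold shifts back by one step
        valid_start = valid_end - horizon
        if valid_start <= 0:
            raise ValueError("not enough observations for the requested folds")
        folds.append((range(0, valid_start), range(valid_start, valid_end)))
    return folds[::-1]


# 100 daily observations, five folds, seven-day step, 14-day horizon (made-up numbers)
folds = rolling_origin_folds(n_observations=100, n_folds=5, step_size=7, horizon=14)
for train, valid in folds:
    print(len(train), valid.start, valid.stop)
```

Each training range ends where its validation range starts, so a fold never trains on data from its own validation window.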

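By default, AutoML sets both settings automatically based on characteristics of your data. Advanced users might want to set them manually. For example, suppose you have daily sales data and you want your validation setup to consist of five folds with a seven-day offset between adjacent folds. The following code sample demonstrates how to set these values: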

Python SDK
Azure CLI

from azure.ai.ml import automl

# Create a job with five CV folds
forecasting_job = automl.forecasting(
    ...,  # Other training parameters
    n_cross_validations=5,
)

# Set the step size between folds to seven days
forecasting_job.set_forecast_settings(
    ...,  # Other settings
    cv_step_size=7
)

$schema: https://azuremlsdk2.blob.core.windows.net/preview/0.0.1/autoMLJob.schema.json
type: automl

experiment_name: cli-v2-automl-forecasting-job
description: A time-series forecasting AutoML job
task: forecasting

training_data:
    path: "./train_data"
    type: mltable

compute: azureml:cpu-compute

primary_metric: normalized_root_mean_squared_error
target_column_name: <target_column_name>
# Use five CV folds
n_cross_validations: 5

# Set the step size between folds to seven days
forecasting:
    cv_step_size: 7
    # Other settings

limits:
    # Limit settings

training:
    # Training settings

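Custom featurization

By default, AutoML augments training data with engineered features to increase the accuracy of the models. For more information, see automated feature engineering. You can customize some of the preprocessing steps by using the featurization configuration of the forecasting job.

The following table lists the supported customizations for forecasting:

Customization	Description	Options
Column purpose update	Override the autodetected feature type for the specified column.	`categorical`, `dateTime`, `numeric`
Transformer parameter update	Update the parameters for the specified imputer.	`{"strategy": "constant", "fill_value": <value>}`, `{"strategy": "median"}`, `{"strategy": "ffill"}`

For example, suppose you have a retail demand scenario where the data includes prices, an on sale flag, and a product type. The following example shows how to set customized types and imputers for these features: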

Python SDK
Azure CLI

from azure.ai.ml.automl import ColumnTransformer

# Customize imputation methods for price and is_on_sale features
# Median value imputation for price, constant value of zero for is_on_sale
transformer_params = {
    "imputer": [
        ColumnTransformer(fields=["price"], parameters={"strategy": "median"}),
        ColumnTransformer(fields=["is_on_sale"], parameters={"strategy": "constant", "fill_value": 0}),
    ],
}

# Set the featurization
# Ensure product_type feature is interpreted as categorical
forecasting_job.set_featurization(
    mode="custom",
    transformer_params=transformer_params,
    column_name_and_types={"product_type": "Categorical"},
)

$schema: https://azuremlsdk2.blob.core.windows.net/preview/0.0.1/autoMLJob.schema.json
type: automl

experiment_name: cli-v2-automl-forecasting-job
description: A time-series forecasting AutoML job
task: forecasting

training_data:
    path: "./train_data"
    type: mltable

compute: azureml:cpu-compute

primary_metric: normalized_root_mean_squared_error
target_column_name: <target_column_name>
n_cross_validations: auto

# Customize imputation methods for price and is_on_sale features
# Median value imputation for price, constant value of zero for is_on_sale
featurization:
    mode: custom
    column_name_and_types:
        product_type: Categorical
    transformer_params:
        imputer:
            - fields: ["price"]
              parameters:
                  strategy: median
            - fields: ["is_on_sale"]
              parameters:
                  strategy: constant
                  fill_value: 0

forecasting:
    # Forecasting specific settings

limits:
    # Limit settings

training:
    # Training settings

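If you use Azure Machine Learning studio for your experiment, see Configure featurization settings in the studio.

Submit a forecasting job

After you configure all settings, you're ready to run the forecasting job. The following example demonstrates this process.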

Python SDK
Azure CLI

# Submit the AutoML job
returned_job = ml_client.jobs.create_or_update(
    forecasting_job
)

print(f"Created job: {returned_job}")

# Get a URL for the job in the studio UI
returned_job.services["Studio"].endpoint

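In the following Azure CLI command, the job YAML configuration is in the current working directory at the path ./automl-forecasting-job.yml. If you run the command from a different directory, you need to change the path accordingly.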

run_id=$(az ml job create --file automl-forecasting-job.yml --query name -o tsv)

You can use the stored run ID to return information about the job. The --web parameter opens the Azure Machine Learning studio web UI, where you can drill into details on the job:

az ml job show -n $run_id --web

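After you submit the job, AutoML provisions compute resources, applies featurization and other preparation steps to the input data, and begins sweeping over forecasting models. For more information, see forecasting methodology in AutoML and Model sweeping and selection for forecasting in AutoML.

Orchestrate training, inference, and evaluation with components and pipelines

Your machine learning workflow likely requires more than just training. Inferencing, which means retrieving model predictions on newer data, and evaluating model accuracy on a test set with known target values are other common tasks that you can orchestrate in Azure Machine Learning along with training jobs. To support inference and evaluation tasks, Azure Machine Learning provides components, which are self-contained pieces of code that do one step in an Azure Machine Learning pipeline.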

Python SDK
Azure CLI

In the following example, retrieve component code from a client registry:

from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential, InteractiveBrowserCredential

# Get credential to access AzureML registry
try:
    credential = DefaultAzureCredential()
    # Check if token can be obtained successfully
    credential.get_token("https://management.azure.com/.default")
except Exception as ex:
    # Fall back to InteractiveBrowserCredential in case DefaultAzureCredential fails
    credential = InteractiveBrowserCredential()

# Create client to access assets in AzureML preview registry
ml_client_registry = MLClient(
    credential=credential,
    registry_name="azureml-preview"
)

# Create client to access assets in AzureML registry
ml_client_metrics_registry = MLClient(
    credential=credential,
    registry_name="azureml"
)

# Get inference component from registry
inference_component = ml_client_registry.components.get(
    name="automl_forecasting_inference",
    label="latest"
)

# Get component to compute evaluation metrics from registry
compute_metrics_component = ml_client_metrics_registry.components.get(
    name="compute_metrics",
    label="latest"
)

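Next, define a factory function that creates pipelines to orchestrate training, inference, and metric computation. For more information, see Configure the experiment.

from azure.ai.ml import Input, Output, automl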
from azure.ai.ml.constants import AssetTypes
from azure.ai.ml.dsl import pipeline

@pipeline(description="AutoML Forecasting Pipeline")
def forecasting_train_and_evaluate_factory(
    train_data_input,
    test_data_input,
    target_column_name,
    time_column_name,
    forecast_horizon,
    primary_metric='normalized_root_mean_squared_error',
    cv_folds='auto'
):
    # Configure training node of pipeline
    training_node = automl.forecasting(
        training_data=train_data_input,
        target_column_name=target_column_name,
        primary_metric=primary_metric,
        n_cross_validations=cv_folds,
        outputs={"best_model": Output(type=AssetTypes.MLFLOW_MODEL)},
    )

    training_node.set_forecast_settings(
        time_column_name=time_column_name,
        forecast_horizon=forecast_horizon,
        # Other settings
    )
    
    training_node.set_training(
        # Training parameters
        ...
    )
    
    training_node.set_limits(
        # Limit settings
        ...
    )

    # Configure inference node to make rolling forecasts on test set
    inference_node = inference_component(
        test_data=test_data_input,
        model_path=training_node.outputs.best_model,
        target_column_name=target_column_name,
        forecast_mode='rolling',
        step=1
    )

    # Configure metrics calculation node
    compute_metrics_node = compute_metrics_component(
        task="tabular-forecasting",
        ground_truth=inference_node.outputs.inference_output_file,
        prediction=inference_node.outputs.inference_output_file,
        evaluation_config=inference_node.outputs.evaluation_config_output_file
    )

    # Return dictionary with evaluation metrics and raw test set forecasts
    return {
        "metrics_result": compute_metrics_node.outputs.evaluation_result,
        "rolling_fcst_result": inference_node.outputs.inference_output_file
    }

Define the train and test data inputs contained in the local folders ./train_data and ./test_data.

my_train_data_input = Input(
    type=AssetTypes.MLTABLE,
    path="./train_data"
)

my_test_data_input = Input(
    type=AssetTypes.URI_FOLDER,
    path='./test_data',
)

Finally, construct the pipeline, set its default compute, and submit the job:

pipeline_job = forecasting_train_and_evaluate_factory(
    my_train_data_input,
    my_test_data_input,
    target_column_name,
    time_column_name,
    forecast_horizon
)

# Set pipeline level compute
pipeline_job.settings.default_compute = compute_name

# Submit pipeline job
returned_pipeline_job = ml_client.jobs.create_or_update(
    pipeline_job,
    experiment_name=experiment_name
)
returned_pipeline_job

$schema: https://azuremlschemas.azureedge.net/latest/pipelineJob.schema.json
type: pipeline

description: AutoML Forecasting Pipeline
experiment_name: cli-v2-automl-forecasting-pipeline

# Set default compute for pipeline steps
settings:
    default_compute: cpu-compute

# Pipeline inputs
inputs:
    train_data_input:
        type: mltable
        path: "./train_data"
    test_data_input:
        type: uri_folder
        path: "./test_data"
    target_column_name: <target column name>
    time_column_name: <time column name>
    forecast_horizon: <forecast horizon>
    primary_metric: normalized_root_mean_squared_error
    cv_folds: auto

# Set pipeline outputs
# Output the evaluation metrics and raw test set rolling forecasts
outputs: 
    metrics_result:
        type: uri_file
        mode: upload
    rolling_fcst_result:
        type: uri_file
        mode: upload

jobs:
  # Configure automl training node of pipeline 
    training_node:
        type: automl
        task: forecasting
        primary_metric: ${{parent.inputs.primary_metric}}
        target_column_name: ${{parent.inputs.target_column_name}}
        training_data: ${{parent.inputs.train_data_input}}
        n_cross_validations: ${{parent.inputs.cv_folds}}
        training:
            # Training settings
        forecasting:
            time_column_name: ${{parent.inputs.time_column_name}}
            forecast_horizon: ${{parent.inputs.forecast_horizon}}
            # Other forecasting specific settings
        limits:
            # Limit settings
        outputs:
            best_model:
                type: mlflow_model

    # Configure inference node to make rolling forecasts on test set
    inference_node:
        type: command
        component: azureml://registries/azureml-preview/components/automl_forecasting_inference
        inputs:
            target_column_name: ${{parent.inputs.target_column_name}}
            forecast_mode: rolling
            step: 1
            test_data: ${{parent.inputs.test_data_input}}
            model_path: ${{parent.jobs.training_node.outputs.best_model}}
        outputs:
            inference_output_file: ${{parent.outputs.rolling_fcst_result}}
            evaluation_config_output_file:
                type: uri_file

    # Configure metrics calculation node
    compute_metrics:
        type: command
        component: azureml://registries/azureml/components/compute_metrics
        inputs:
            task: "tabular-forecasting"
            ground_truth: ${{parent.jobs.inference_node.outputs.inference_output_file}}
            prediction: ${{parent.jobs.inference_node.outputs.inference_output_file}}
            evaluation_config: ${{parent.jobs.inference_node.outputs.evaluation_config_output_file}}
        outputs:
            evaluation_result: ${{parent.outputs.metrics_result}}

AutoML requires training data in MLTable format.

Use the following command to launch the pipeline run. The pipeline configuration is at the path ./automl-forecasting-pipeline.yml:

run_id=$(az ml job create --file automl-forecasting-pipeline.yml -w <Workspace> -g <Resource Group> --subscription <Subscription> --query name -o tsv)

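After you submit the run request, the pipeline runs AutoML training, rolling evaluation inference, and metric calculation in sequence. You can monitor and inspect the run in the studio UI. When the run finishes, you can download the rolling forecasts and the evaluation metrics to the local working directory: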

Python SDK
Azure CLI

# Download metrics JSON
ml_client.jobs.download(returned_pipeline_job.name, download_path=".", output_name='metrics_result')

# Download rolling forecasts
ml_client.jobs.download(returned_pipeline_job.name, download_path=".", output_name='rolling_fcst_result')

az ml job download --name $run_id --download-path . --output-name metrics_result
az ml job download --name $run_id --download-path . --output-name rolling_fcst_result

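You can review the output at the following locations:

Metrics: ./named-outputs/metrics_result/evaluationResult/metrics.json
Forecasts: ./named-outputs/rolling_fcst_result/inference_output_file (JSON Lines format)

For more information on rolling evaluation, see Inference and evaluation of forecasting models.

Forecast at scale: Many models

The many models components in AutoML enable you to train and manage millions of models in parallel. For more information about the many models concepts, see Many models.

Many models training configuration

The many models training component accepts a YAML format configuration file of AutoML training settings. The component applies these settings to each AutoML instance it launches. The YAML file has the same specification as the Forecasting command job, plus the partition_column_names and allow_multi_partitions parameters.

Parameter	Description
`partition_column_names`	Column names in the data that, when grouped, define the data partitions. The many models training component launches an independent training job on each partition.
`allow_multi_partitions`	An optional flag that allows training one model per partition when each partition contains more than one unique time series. The default value is `false`.

The following example provides a sample YAML configuration: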

$schema: https://azuremlsdk2.blob.core.windows.net/preview/0.0.1/autoMLJob.schema.json
type: automl

description: A time-series forecasting job config
compute: azureml:<cluster-name>
task: forecasting
primary_metric: normalized_root_mean_squared_error
target_column_name: sales
n_cross_validations: 3

forecasting:
  time_column_name: date
  time_series_id_column_names: ["state", "store"]
  forecast_horizon: 28

training:
  blocked_training_algorithms: ["ExtremeRandomTrees"]

limits:
  timeout_minutes: 15
  max_trials: 10
  max_concurrent_trials: 4
  max_cores_per_trial: -1
  trial_timeout_minutes: 15
  enable_early_termination: true
  
partition_column_names: ["state", "store"]
allow_multi_partitions: false

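In subsequent examples, the configuration is stored at the path ./automl_settings_mm.yml.

Many models pipeline

Next, define a factory function that creates pipelines for orchestration of many models training, inference, and metric computation. The following table describes the parameters for this factory function:

Parameter	Description
`max_nodes`	Number of compute nodes to use in the training job.
`max_concurrency_per_node`	Number of AutoML processes to run on each node. The total concurrency of a many models job is therefore `max_nodes * max_concurrency_per_node`.
`parallel_step_timeout_in_seconds`	Many models component timeout, specified in number of seconds.
`retrain_failed_models`	Flag to enable retraining for failed models. This value is useful if you did previous many models runs that resulted in failed AutoML jobs on some data partitions. When you enable this flag, many models launches training jobs only for previously failed partitions.
`forecast_mode`	Inference mode for model evaluation. Valid values are `recursive` (default) and `rolling`. For more information, see Inference and evaluation of forecasting models and the ManyModelsInferenceParameters class reference.
`step`	Step size for rolling forecast (default is 1). For more information, see Inference and evaluation of forecasting models and the ManyModelsInferenceParameters class reference.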
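The concurrency formula in the table is simple arithmetic, and a small sketch can help with sizing. The function names here are hypothetical, and the "rounds" estimate is an idealization that assumes every partition takes about the same time to train.

```python
import math


def many_models_concurrency(max_nodes, max_concurrency_per_node):
    """Total number of AutoML processes that can run at once."""
    return max_nodes * max_concurrency_per_node


def idealized_rounds(n_partitions, max_nodes, max_concurrency_per_node):
    """Rough number of sequential rounds needed to cover all partitions."""
    total = many_models_concurrency(max_nodes, max_concurrency_per_node)
    return math.ceil(n_partitions / total)


print(many_models_concurrency(4, 4))  # → 16
print(idealized_rounds(1000, 4, 4))   # → 63
```

With 4 nodes and 4 processes per node, 16 partitions train at a time, so 1,000 partitions need at least 63 rounds.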

The following example demonstrates a factory method for constructing many models training and model evaluation pipelines:

Python SDK
Azure CLI

from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential, InteractiveBrowserCredential

# Get credential to access AzureML registry
try:
    credential = DefaultAzureCredential()
    # Check if token can be obtained successfully
    credential.get_token("https://management.azure.com/.default")
except Exception as ex:
    # Fall back to InteractiveBrowserCredential in case DefaultAzureCredential fails
    credential = InteractiveBrowserCredential()

# Get many models training component
mm_train_component = ml_client_registry.components.get(
    name='automl_many_models_training',
    label='latest'
)

# Get many models inference component
mm_inference_component = ml_client_registry.components.get(
    name='automl_many_models_inference',
    label='latest'
)

# Get component to compute evaluation metrics
compute_metrics_component = ml_client_metrics_registry.components.get(
    name="compute_metrics",
    label="latest"
)

@pipeline(description="AutoML Many Models Forecasting Pipeline")
def many_models_train_evaluate_factory(
    train_data_input,
    test_data_input,
    automl_config_input,
    compute_name,
    max_concurrency_per_node=4,
    parallel_step_timeout_in_seconds=3700,
    max_nodes=4,
    retrain_failed_model=False,
    forecast_mode="rolling",
    forecast_step=1
):
    mm_train_node = mm_train_component(
        raw_data=train_data_input,
        automl_config=automl_config_input,
        max_nodes=max_nodes,
        max_concurrency_per_node=max_concurrency_per_node,
        parallel_step_timeout_in_seconds=parallel_step_timeout_in_seconds,
        retrain_failed_model=retrain_failed_model,
        compute_name=compute_name
    )

    mm_inference_node = mm_inference_component(
        raw_data=test_data_input,
        max_nodes=max_nodes,
        max_concurrency_per_node=max_concurrency_per_node,
        parallel_step_timeout_in_seconds=parallel_step_timeout_in_seconds,
        optional_train_metadata=mm_train_node.outputs.run_output,
        forecast_mode=forecast_mode,
        step=forecast_step,
        compute_name=compute_name
    )

    compute_metrics_node = compute_metrics_component(
        task="tabular-forecasting",
        prediction=mm_inference_node.outputs.evaluation_data,
        ground_truth=mm_inference_node.outputs.evaluation_data,
        evaluation_config=mm_inference_node.outputs.evaluation_configs
    )

    # Return metrics results from rolling evaluation
    return {
        "metrics_result": compute_metrics_node.outputs.evaluation_result
    }

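Construct the pipeline with the factory function. The training and test data are in the local folders ./data/train and ./data/test, respectively. Finally, set the default compute and submit the job, as shown in the following example: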

pipeline_job = many_models_train_evaluate_factory(
    train_data_input=Input(
        type="uri_folder",
        path="./data/train"
    ),
    test_data_input=Input(
        type="uri_folder",
        path="./data/test"
    ),
    automl_config_input=Input(
        type="uri_file",
        path="./automl_settings_mm.yml"
    ),
    compute_name="<cluster name>"
)
pipeline_job.settings.default_compute = "<cluster name>"

returned_pipeline_job = ml_client.jobs.create_or_update(
    pipeline_job,
    experiment_name=experiment_name,
)
ml_client.jobs.stream(returned_pipeline_job.name)

$schema: https://azuremlschemas.azureedge.net/latest/pipelineJob.schema.json
type: pipeline

description: AutoML Many Models Forecasting Pipeline
experiment_name: cli-v2-automl-mm-forecasting-pipeline

# Set default compute for pipeline steps
settings:
    default_compute: azureml:cpu-compute

# Set pipeline inputs
inputs:
    train_data_input:
        type: uri_folder
        path: "./train_data"
        mode: direct
    test_data_input:
        type: uri_folder
        path: "./test_data"
    automl_config_input:
        type: uri_file
        path: "./automl_settings_mm.yml"
    max_nodes: 4
    max_concurrency_per_node: 4
    parallel_step_timeout_in_seconds: 3700
    forecast_mode: rolling
    step: 1
    retrain_failed_model: False

# Set pipeline outputs
# Output the evaluation metrics and raw test set rolling forecasts
outputs: 
    metrics_result:
        type: uri_file
        mode: upload

jobs:
    # Configure AutoML many models training component
    mm_train_node:
        type: command
        component: azureml://registries/azureml-preview/components/automl_many_models_training
        inputs:
            raw_data: ${{parent.inputs.train_data_input}}
            automl_config: ${{parent.inputs.automl_config_input}}
            max_nodes: ${{parent.inputs.max_nodes}}
            max_concurrency_per_node: ${{parent.inputs.max_concurrency_per_node}}
            parallel_step_timeout_in_seconds: ${{parent.inputs.parallel_step_timeout_in_seconds}}
            retrain_failed_model: ${{parent.inputs.retrain_failed_model}}
        outputs:
            run_output:
                type: uri_folder

    # Configure inference node to make rolling forecasts on test set
    mm_inference_node:
        type: command
        component: azureml://registries/azureml-preview/components/automl_many_models_inference
        inputs:
            raw_data: ${{parent.inputs.test_data_input}}
            max_concurrency_per_node: ${{parent.inputs.max_concurrency_per_node}}
            parallel_step_timeout_in_seconds: ${{parent.inputs.parallel_step_timeout_in_seconds}}
            forecast_mode: ${{parent.inputs.forecast_mode}}
            step: ${{parent.inputs.step}}
            max_nodes: ${{parent.inputs.max_nodes}}
            optional_train_metadata: ${{parent.jobs.mm_train_node.outputs.run_output}}
        outputs:
            run_output:
                type: uri_folder
            evaluation_configs:
                type: uri_file
            evaluation_data:
                type: uri_file

    # Configure metrics calculation node
    compute_metrics:
        type: command
        component: azureml://registries/azureml/components/compute_metrics
        inputs:
            task: "tabular-forecasting"
            ground_truth: ${{parent.jobs.mm_inference_node.outputs.evaluation_data}}
            prediction: ${{parent.jobs.mm_inference_node.outputs.evaluation_data}}
            evaluation_config: ${{parent.jobs.mm_inference_node.outputs.evaluation_configs}}
        outputs:
            evaluation_result: ${{parent.outputs.metrics_result}}

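Launch the pipeline job with the following command. This example assumes the many models pipeline configuration is saved at the path ./automl-mm-forecasting-pipeline.yml: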

az ml job create --file automl-mm-forecasting-pipeline.yml -w <Workspace> -g <Resource Group> --subscription <Subscription>

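After the job finishes, you can download the evaluation metrics locally by using the same procedure as in the single training run pipeline.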
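As a minimal sketch of that download step in Python, the helper below wraps the SDK's `jobs.download` call. It assumes `ml_client` is an authenticated `MLClient` for your workspace and that the pipeline declares an output named `metrics_result`, as in the YAML above. The function name `download_metrics` and the default download path are hypothetical names chosen for illustration.

```python
def download_metrics(ml_client, job_name, download_path="."):
    """Download the `metrics_result` output of a finished pipeline job.

    Assumes `ml_client` is an authenticated azure.ai.ml.MLClient and that
    the pipeline declares an output named `metrics_result`.
    """
    ml_client.jobs.download(
        name=job_name,
        download_path=download_path,
        output_name="metrics_result",
    )
    # Return the folder so callers can locate the downloaded files
    return download_path
```

You could call it as `download_metrics(ml_client, returned_pipeline_job.name, "./metrics")` after the job completes.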

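For a more detailed example, see the demand forecasting using many models notebook.

Training considerations for a many models run

The many models training and inference components conditionally partition your data according to the partition_column_names setting, so that each partition is in its own file. This process can be very slow or fail when the data is very large. We recommend that you partition your data manually before you run many models training or inference.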
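One way to partition the data yourself is sketched below, using only the Python standard library. It writes one CSV file per unique combination of partition column values. The function name `partition_csv`, the underscore-joined file naming scheme, and the output folder layout are assumptions made for illustration, so check what layout the components expect before relying on it.

```python
import csv
import os
from collections import defaultdict


def partition_csv(src_path, out_dir, partition_column_names):
    """Split one CSV into one file per unique combination of partition values."""
    groups = defaultdict(list)
    with open(src_path, newline="") as f:
        reader = csv.DictReader(f)
        fieldnames = reader.fieldnames
        for row in reader:
            # Rows that share the same partition values go to the same group
            key = tuple(row[c] for c in partition_column_names)
            groups[key].append(row)

    os.makedirs(out_dir, exist_ok=True)
    written = []
    for key, rows in sorted(groups.items()):
        # Hypothetical naming scheme: join the partition values with underscores
        out_path = os.path.join(out_dir, "_".join(key) + ".csv")
        with open(out_path, "w", newline="") as f:
            writer = csv.DictWriter(f, fieldnames=fieldnames)
            writer.writeheader()
            writer.writerows(rows)
        written.append(out_path)
    return written
```

This sketch holds the whole file in memory, which is fine for illustration. For very large data, a chunked or distributed approach is likely a better fit.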

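Note

The default parallelism limit for a many models run within a subscription is set to 320. If your workload requires a higher limit, you can contact Microsoft support.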

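Forecast at scale: Hierarchical time series

The hierarchical time series (HTS) components in AutoML enable you to train a large number of models on data that has a hierarchical structure. For more information, see Hierarchical time series forecasting.

HTS training configuration

The HTS training component accepts a YAML format configuration file of AutoML training settings. The component applies these settings to each AutoML instance it launches. This YAML file has the same specification as the Forecasting command job, plus other parameters related to the hierarchy information: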

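Parameter	Description
`hierarchy_column_names`	A list of column names in the data that define the hierarchical structure of the data. The order of the columns in this list determines the hierarchy levels. The degree of aggregation decreases with the list index. That is, the last column in the list defines the leaf, or most disaggregated, level of the hierarchy.
`hierarchy_training_level`	The hierarchy level to use for forecast model training.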

The following example provides a sample YAML configuration:

$schema: https://azuremlsdk2.blob.core.windows.net/preview/0.0.1/autoMLJob.schema.json
type: automl

description: A time-series forecasting job config
compute: azureml:cluster-name
task: forecasting
primary_metric: normalized_root_mean_squared_error
log_verbosity: info
target_column_name: sales
n_cross_validations: 3

forecasting:
  time_column_name: "date"
  time_series_id_column_names: ["state", "store", "SKU"]
  forecast_horizon: 28

training:
  blocked_training_algorithms: ["ExtremeRandomTrees"]

limits:
  timeout_minutes: 15
  max_trials: 10
  max_concurrent_trials: 4
  max_cores_per_trial: -1
  trial_timeout_minutes: 15
  enable_early_termination: true
  
hierarchy_column_names: ["state", "store", "SKU"]
hierarchy_training_level: "store"

In subsequent examples, the configuration is stored at the path ./automl_settings_hts.yml.
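To make the hierarchy settings more concrete, the following sketch shows what training at the `store` level means for the sample configuration: the leaf-level `SKU` series are summed up to one series per state and store. This is a conceptual illustration with made-up rows, not the component's implementation. The helper `aggregate_to_level` and the sample data are hypothetical.

```python
from collections import defaultdict

# Ordered from most aggregated (state) to leaf level (SKU), as in the YAML
hierarchy_column_names = ["state", "store", "SKU"]


def aggregate_to_level(rows, level, target="sales", time_col="date"):
    """Sum the target over all hierarchy columns below `level`."""
    # Keep the columns from the top of the hierarchy down to `level`
    keep = hierarchy_column_names[: hierarchy_column_names.index(level) + 1]
    totals = defaultdict(float)
    for row in rows:
        key = tuple(row[c] for c in keep) + (row[time_col],)
        totals[key] += row[target]
    return dict(totals)


# Made-up leaf-level observations for a single date
rows = [
    {"state": "WA", "store": "s1", "SKU": "a", "date": "2024-01-01", "sales": 3.0},
    {"state": "WA", "store": "s1", "SKU": "b", "date": "2024-01-01", "sales": 2.0},
    {"state": "WA", "store": "s2", "SKU": "a", "date": "2024-01-01", "sales": 4.0},
]

# One aggregated value per (state, store, date)
store_level = aggregate_to_level(rows, "store")
```

Here the two SKU rows for store `s1` collapse into a single store-level value, while store `s2` keeps its single value.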

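HTS pipeline

Next, define a factory function that creates pipelines for orchestration of HTS training, inference, and metric computation. The following table describes the parameters for this factory function: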

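Parameter	Description
`forecast_level`	The level of the hierarchy for which to retrieve forecasts.
`allocation_method`	The allocation method to use when forecasts are disaggregated. Valid values are `proportions_of_historical_average` and `average_historical_proportions`.
`max_nodes`	The number of compute nodes to use in the training job.
`max_concurrency_per_node`	The number of AutoML processes to run on each node. The total concurrency of an HTS job is therefore `max_nodes * max_concurrency_per_node`.
`parallel_step_timeout_in_seconds`	The timeout for the HTS components, specified in seconds.
`forecast_mode`	The inference mode for model evaluation. Valid values are `recursive` and `rolling`. For more information, see Inference and evaluation of forecasting models and the HTSInferenceParameters class reference.
`step`	The step size for rolling forecasts (the default is 1). For more information, see Inference and evaluation of forecasting models and the HTSInferenceParameters class reference.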
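The two `allocation_method` values share their names with the standard top-down disaggregation proportions from the forecasting literature. The sketch below computes both under those textbook definitions. Whether AutoML computes them in exactly this way is an assumption, and the function and variable names and the sample history are hypothetical.

```python
def average_historical_proportions(child, parent):
    """Mean over time of the per-period share child[t] / parent[t]."""
    return sum(c / p for c, p in zip(child, parent)) / len(parent)


def proportions_of_historical_average(child, parent):
    """Share of the child's historical average in the parent's historical average."""
    return (sum(child) / len(child)) / (sum(parent) / len(parent))


# Made-up history: one store (parent) that sells two SKUs (children)
sku_a = [1.0, 6.0]
sku_b = [3.0, 2.0]
store = [a + b for a, b in zip(sku_a, sku_b)]  # [4.0, 8.0]

# Share of a store-level forecast allocated to SKU a under each method
share_avg_prop = average_historical_proportions(sku_a, store)
share_prop_avg = proportions_of_historical_average(sku_a, store)

# Disaggregate a hypothetical store-level forecast of 12 units
store_forecast = 12.0
sku_a_forecast = store_forecast * share_prop_avg
```

With this history the two methods give different shares for the same SKU, which is why the choice of allocation method can change the disaggregated forecasts.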

Python SDK
Azure CLI

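from azure.ai.ml import MLClient
from azure.ai.ml.dsl import pipeline
from azure.identity import DefaultAzureCredential, InteractiveBrowserCredential

# Get credential to access AzureML registry
try:
    credential = DefaultAzureCredential()
    # Check if token can be obtained successfully
    credential.get_token("https://management.azure.com/.default")
except Exception:
    # Fall back to InteractiveBrowserCredential in case DefaultAzureCredential fails
    credential = InteractiveBrowserCredential()

# Create client to access assets in the azureml-preview registry (HTS components)
ml_client_registry = MLClient(
    credential=credential,
    registry_name="azureml-preview"
)

# Create client to access assets in the azureml registry (metrics component)
ml_client_metrics_registry = MLClient(
    credential=credential,
    registry_name="azureml"
)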

# Get HTS training component
hts_train_component = ml_client_registry.components.get(
    name='automl_hts_training',
    version='latest'
)

# Get HTS inference component
hts_inference_component = ml_client_registry.components.get(
    name='automl_hts_inference',
    version='latest'
)

# Get component to compute evaluation metrics
compute_metrics_component = ml_client_metrics_registry.components.get(
    name="compute_metrics",
    label="latest"
)

@pipeline(description="AutoML HTS Forecasting Pipeline")
def hts_train_evaluate_factory(
    train_data_input,
    test_data_input,
    automl_config_input,
    max_concurrency_per_node=4,
    parallel_step_timeout_in_seconds=3700,
    max_nodes=4,
    forecast_mode="rolling",
    forecast_step=1,
    forecast_level="SKU",
    allocation_method='proportions_of_historical_average'
):
    hts_train = hts_train_component(
        raw_data=train_data_input,
        automl_config=automl_config_input,
        max_concurrency_per_node=max_concurrency_per_node,
        parallel_step_timeout_in_seconds=parallel_step_timeout_in_seconds,
        max_nodes=max_nodes
    )
    hts_inference = hts_inference_component(
        raw_data=test_data_input,
        max_nodes=max_nodes,
        max_concurrency_per_node=max_concurrency_per_node,
        parallel_step_timeout_in_seconds=parallel_step_timeout_in_seconds,
        optional_train_metadata=hts_train.outputs.run_output,
        forecast_level=forecast_level,
        allocation_method=allocation_method,
        forecast_mode=forecast_mode,
        step=forecast_step
    )
    compute_metrics_node = compute_metrics_component(
        task="tabular-forecasting",
        prediction=hts_inference.outputs.evaluation_data,
        ground_truth=hts_inference.outputs.evaluation_data,
        evaluation_config=hts_inference.outputs.evaluation_configs
    )

    # Return metrics results from rolling evaluation
    return {
        "metrics_result": compute_metrics_node.outputs.evaluation_result
    }

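Construct the pipeline by using the factory function. The training and test data are in the local folders ./data/train and ./data/test, respectively. Finally, set the default compute and submit the job, as shown in the following example: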

pipeline_job = hts_train_evaluate_factory(
    train_data_input=Input(
        type="uri_folder",
        path="./data/train"
    ),
    test_data_input=Input(
        type="uri_folder",
        path="./data/test"
    ),
    automl_config_input=Input(
        type="uri_file",
        path="./automl_settings_hts.yml"
    )
)
pipeline_job.settings.default_compute = "cluster-name"

returned_pipeline_job = ml_client.jobs.create_or_update(
    pipeline_job,
    experiment_name=experiment_name,
)
ml_client.jobs.stream(returned_pipeline_job.name)

$schema: https://azuremlschemas.azureedge.net/latest/pipelineJob.schema.json
type: pipeline

description: AutoML HTS Forecasting Pipeline
experiment_name: cli-v2-automl-hts-forecasting-pipeline

# Set the default compute for pipeline steps
settings:
    default_compute: cpu-compute

# Set pipeline inputs
inputs:
    train_data_input:
        type: uri_folder
        path: "./train_data"
        mode: direct
    test_data_input:
        type: uri_folder
        path: "./test_data"
    automl_config_input:
        type: uri_file
        path: "./automl_settings_hts.yml"
    max_concurrency_per_node: 4
    parallel_step_timeout_in_seconds: 3700
    max_nodes: 4
    forecast_mode: rolling
    step: 1
    allocation_method: proportions_of_historical_average
    forecast_level: # set to the hierarchy level to forecast, for example SKU

# Set pipeline outputs
# Output evaluation metrics and raw test set rolling forecasts
outputs: 
    metrics_result:
        type: uri_file
        mode: upload

jobs:
    # Configure AutoML HTS training component
    hts_train_node:
        type: command
        component: azureml://registries/azureml-preview/components/automl_hts_training
        inputs:
            raw_data: ${{parent.inputs.train_data_input}}
            automl_config: ${{parent.inputs.automl_config_input}}
            max_nodes: ${{parent.inputs.max_nodes}}
            max_concurrency_per_node: ${{parent.inputs.max_concurrency_per_node}}
            parallel_step_timeout_in_seconds: ${{parent.inputs.parallel_step_timeout_in_seconds}}
        outputs:
            run_output:
                type: uri_folder


    # Configure inference node to make rolling forecasts on test set
    hts_inference_node:
        type: command
        component: azureml://registries/azureml-preview/components/automl_hts_inference
        inputs:
            raw_data: ${{parent.inputs.test_data_input}}
            max_concurrency_per_node: ${{parent.inputs.max_concurrency_per_node}}
            parallel_step_timeout_in_seconds: ${{parent.inputs.parallel_step_timeout_in_seconds}}
            forecast_mode: ${{parent.inputs.forecast_mode}}
            step: ${{parent.inputs.step}}
            max_nodes: ${{parent.inputs.max_nodes}}
            optional_train_metadata: ${{parent.jobs.hts_train_node.outputs.run_output}}
            forecast_level: ${{parent.inputs.forecast_level}}
            allocation_method: ${{parent.inputs.allocation_method}}
        outputs:
            run_output:
                type: uri_folder
            evaluation_configs:
                type: uri_file
            evaluation_data:
                type: uri_file

    # Configure metrics calculation node
    compute_metrics:
        type: command
        component: azureml://registries/azureml/components/compute_metrics
        inputs:
            task: "tabular-forecasting"
            ground_truth: ${{parent.jobs.hts_inference_node.outputs.evaluation_data}}
            prediction: ${{parent.jobs.hts_inference_node.outputs.evaluation_data}}
            evaluation_config: ${{parent.jobs.hts_inference_node.outputs.evaluation_configs}}
        outputs:
            evaluation_result: ${{parent.outputs.metrics_result}}

Launch the pipeline job with the following command. This example assumes the HTS pipeline configuration is saved at the path ./automl-hts-forecasting-pipeline.yml:

az ml job create --file automl-hts-forecasting-pipeline.yml -w <Workspace> -g <Resource Group> --subscription <Subscription>

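After the job finishes, you can download the evaluation metrics locally by using the same procedure as in the single training run pipeline.

For a more detailed example, see the demand forecasting using hierarchical time series notebook.

Training considerations for an HTS run

The HTS training and inference components conditionally partition your data according to the hierarchy_column_names setting, so that each partition is in its own file. This process can be very slow or fail when the data is very large. We recommend that you partition your data manually before you run HTS training or inference.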

Note

The default parallelism limit for an HTS run within a subscription is set to 320. If your workload requires a higher limit, you can contact Microsoft support.
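As a quick sanity check before you submit a job, you might compare the total concurrency you request with this default limit. The sketch below uses the formula `max_nodes * max_concurrency_per_node` from the parameter table. Treating that product as the quantity the limit applies to is an assumption, and the names `total_concurrency` and `DEFAULT_PARALLELISM_LIMIT` are hypothetical.

```python
# Default limit stated in the note above
DEFAULT_PARALLELISM_LIMIT = 320


def total_concurrency(max_nodes, max_concurrency_per_node):
    """Total number of AutoML processes that run at the same time."""
    return max_nodes * max_concurrency_per_node


# Values used in the pipeline examples in this article
requested = total_concurrency(max_nodes=4, max_concurrency_per_node=4)

# Assumption: the limit applies to this product
within_limit = requested <= DEFAULT_PARALLELISM_LIMIT
```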

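Forecast at scale: Distributed DNN training

As described earlier in this article, you can enable learning for deep neural networks (DNN). To learn how distributed training works for DNN forecasting tasks, see Distributed deep neural network training (preview).

For scenarios with large data requirements, distributed training with AutoML is available for a limited class of models. You can find more information and code samples in AutoML at scale: Distributed training.

Explore example notebooks

Detailed code samples that demonstrate advanced forecasting configurations are available in the AutoML forecasting sample notebooks GitHub repository.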
