ForecastingPipelineWrapperBase Class
Base class for forecast model wrapper.
- Inheritance
-
ForecastingPipelineWrapperBase
Constructor
ForecastingPipelineWrapperBase(ts_transformer: TimeSeriesTransformer | None = None, y_transformer: Pipeline | None = None, metadata: Dict[str, Any] | None = None)
Parameters
Name | Description |
---|---|
ts_transformer
|
Default value: None
|
y_transformer
|
Default value: None
|
metadata
|
Default value: None
|
Methods
align_output_to_input |
Align the transformed output data frame to the input data frame. Note: transformed is modified by reference; no copy is created. :param X_input: The input data frame. :param transformed: The data frame after transformation. :returns: The transformed data frame with its original index, but sorted as in X_input. |
fit |
Fit the model with input X and y. |
forecast |
Do the forecast on the data frame X_pred. |
forecast_quantiles |
Get the prediction and quantiles from the fitted pipeline. |
is_grain_dropped |
Return true if the grain is going to be dropped. |
preaggregate_data_set |
Aggregate the prediction data set. Note: This method does not guarantee that the data set will be aggregated. Aggregation happens only if the data set contains duplicated time stamps or out-of-grid dates. :param df: The data set to be aggregated. :param y: The target values. :param is_training_set: If true, the data represent the training set. :return: The aggregated data set, or the intact data set if no aggregation is required. |
preprocess_pred_X_y |
Preprocess prediction X and y. |
rolling_evaluation |
Produce forecasts on a rolling origin over the given test set. Each iteration makes a forecast for the next 'max_horizon' periods with respect to the current origin, then advances the origin by the horizon time duration. The prediction context for each forecast is set so that the forecaster uses the actual target values prior to the current origin time for constructing lag features. This function returns a concatenated DataFrame of rolling forecasts joined with the actuals from the test set. This method is deprecated and will be removed in a future release. Please use rolling_forecast() instead. |
rolling_forecast |
Produce forecasts on a rolling origin over a test set. Each iteration makes a forecast of maximum horizon periods ahead using information up to the current origin, then advances the origin by 'step' time periods. The prediction context for each forecast is set so that the forecaster uses the actual target values prior to the current origin time for constructing lookback features. This function returns a DataFrame of rolling forecasts joined with the actuals from the test set. The columns in the returned data frame are: timeseries ID columns (optional), forecast origin column, time column, forecast values column, and actual values column.
|
short_grain_handling |
Return true if short or absent grains handling is enabled for the model. |
static_preaggregate_data_set |
Aggregate the prediction data set. Note: This method does not guarantee that the data set will be aggregated. Aggregation happens only if the data set contains duplicated time stamps or out-of-grid dates. :param ts_transformer: The timeseries transformer used for training. :param time_column_name: Name of the time column. :param grain_column_names: List of grain column names. :param df: The data set to be aggregated. :param y: The target values. :param is_training_set: If true, the data represent the training set. :return: The aggregated data set, or the intact data set if no aggregation is required. |
align_output_to_input
Align the transformed output data frame to the input data frame.
Note: transformed is modified by reference; no copy is created.
align_output_to_input(X_input: DataFrame, transformed: DataFrame) -> DataFrame
Parameters
Name | Description |
---|---|
X_input
Required
|
The input data frame. |
transformed
Required
|
The data frame after transformation. |
Returns
Type | Description |
---|---|
The transformed data frame with its original index, but sorted as in X_input. |
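The alignment described above can be illustrated with plain pandas. This is only a sketch of the idea, not the library's implementation: the helper name align_rows_to_input and its keys argument are hypothetical, it assumes the transformed frame is indexed by the key columns that X_input holds as regular columns, and unlike the documented method it returns a new frame rather than modifying transformed in place.

```python
import pandas as pd


def align_rows_to_input(x_input, transformed, keys):
    """Return `transformed` with its rows ordered like the matching rows of `x_input`.

    Hypothetical helper: assumes `transformed` is indexed by `keys`
    and `x_input` holds the same keys as ordinary columns.
    """
    # Remember the position of every input row, identified by its key columns.
    order = x_input[keys].copy()
    order["_pos"] = range(len(order))
    # Expose the index levels as columns so they can be merged on.
    flat = transformed.reset_index()
    merged = flat.merge(order, on=keys, how="inner").sort_values("_pos")
    # Restore the original index levels and drop the helper column.
    return merged.drop(columns="_pos").set_index(keys)


# The input is not sorted by date; the transformed frame is.
x_input = pd.DataFrame(
    {"date": pd.to_datetime(["2024-01-02", "2024-01-01", "2024-01-03"])}
)
transformed = pd.DataFrame(
    {"feature": [1.0, 2.0, 3.0]},
    index=pd.DatetimeIndex(["2024-01-01", "2024-01-02", "2024-01-03"], name="date"),
)
aligned = align_rows_to_input(x_input, transformed, ["date"])
```

After the call, aligned keeps the date index of the transformed frame, but its rows follow the order of x_input.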
fit
Fit the model with input X and y.
fit(X: DataFrame, y: ndarray) -> ForecastingPipelineWrapperBase
Parameters
Name | Description |
---|---|
X
Required
|
Input X data. |
y
Required
|
Input y data. |
forecast
Do the forecast on the data frame X_pred.
forecast(X_pred: DataFrame | None = None, y_pred: ndarray | DataFrame | None = None, forecast_destination: Timestamp | None = None, ignore_data_errors: bool = False) -> Tuple[ndarray, DataFrame]
Parameters
Name | Description |
---|---|
X_pred
|
The prediction data frame combining X_past and X_future in a time-contiguous manner. Empty values in X_pred will be imputed. Default value: None
|
y_pred
|
The target values, combining definite values for y_past and missing values for Y_future. If None, the predictions will be made for every X_pred. Default value: None
|
forecast_destination
|
pandas.Timestamp
A time-stamp value. Forecasts will be made all the way to the forecast_destination time, for all grains. Dictionary input { grain -> timestamp } will not be accepted. If forecast_destination is not given, it will be imputed as the last time occurring in X_pred for every grain. Default value: None
|
ignore_data_errors
|
Ignore errors in user data. Default value: False
|
Returns
Type | Description |
---|---|
Y_pred, with the subframe corresponding to Y_future filled in with the respective forecasts. Any missing values in Y_past will be filled by the imputer. |
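The X_pred and y_pred layout described in the parameter table can be assembled as follows. This is a minimal sketch with made-up data: the column names date and store, the values, and the fitted_model variable in the final comment are assumptions for illustration. Only the shape of the inputs (past rows followed by future rows, with NaN targets for the future) comes from the descriptions above.

```python
import numpy as np
import pandas as pd

# Known history: features and observed targets (illustrative values).
X_past = pd.DataFrame(
    {"date": pd.date_range("2024-01-01", periods=4, freq="D"), "store": "A"}
)
y_past = np.array([10.0, 12.0, 11.0, 13.0])

# Future periods to forecast: features only, targets unknown.
X_future = pd.DataFrame(
    {"date": pd.date_range("2024-01-05", periods=2, freq="D"), "store": "A"}
)

# X_pred stacks past and future rows in a time-contiguous manner.
X_pred = pd.concat([X_past, X_future], ignore_index=True)

# y_pred holds definite values for the past and NaN for the future rows.
y_pred = np.concatenate([y_past, np.full(len(X_future), np.nan)])

# With a fitted wrapper (assumed here to be named fitted_model), the call
# would follow the signature documented above:
# y_hat, X_trans = fitted_model.forecast(X_pred, y_pred)
```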
forecast_quantiles
Get the prediction and quantiles from the fitted pipeline.
forecast_quantiles(X_pred: DataFrame | None = None, y_pred: ndarray | DataFrame | None = None, quantiles: float | List[float] | None = None, forecast_destination: Timestamp | None = None, ignore_data_errors: bool = False) -> DataFrame
Parameters
Name | Description |
---|---|
X_pred
|
The prediction data frame combining X_past and X_future in a time-contiguous manner. Empty values in X_pred will be imputed. Default value: None
|
y_pred
|
The target values, combining definite values for y_past and missing values for Y_future. If None, the predictions will be made for every X_pred. Default value: None
|
quantiles
|
The quantile or list of quantiles at which to forecast. Default value: None
|
forecast_destination
|
pandas.Timestamp
A time-stamp value. Forecasts will be made all the way to the forecast_destination time, for all grains. Dictionary input { grain -> timestamp } will not be accepted. If forecast_destination is not given, it will be imputed as the last time occurring in X_pred for every grain. Default value: None
|
ignore_data_errors
|
Ignore errors in user data. Default value: False
|
Returns
Type | Description |
---|---|
A dataframe containing the columns and predictions made at requested quantiles. |
is_grain_dropped
Return true if the grain is going to be dropped.
is_grain_dropped(grain: Tuple[str] | str | List[str]) -> bool
Parameters
Name | Description |
---|---|
grain
Required
|
The grain to test if it will be dropped. |
Returns
Type | Description |
---|---|
True if the grain will be dropped. |
preaggregate_data_set
Aggregate the prediction data set.
Note: This method does not guarantee that the data set will be aggregated. Aggregation happens only if the data set contains duplicated time stamps or out-of-grid dates.
preaggregate_data_set(df: DataFrame, y: ndarray | None = None, is_training_set: bool = False) -> Tuple[DataFrame, ndarray | None]
Parameters
Name | Description |
---|---|
df
Required
|
The data set to be aggregated. |
y
|
The target values. Default value: None
|
is_training_set
|
If true, the data represent the training set. Default value: False
|
Returns
Type | Description |
---|---|
The aggregated data set, or the intact data set if no aggregation is required. |
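The "aggregate only when needed" behavior can be pictured with a small pandas sketch. It is not the library's code: the helper aggregate_duplicates is hypothetical, it keeps the target as a column of the frame instead of a separate y array, it uses a sum as the aggregation function purely as an assumption, and it covers only the duplicated-time-stamp case, not out-of-grid dates.

```python
import pandas as pd


def aggregate_duplicates(df, time_col, grain_cols, target_col):
    """Sum the target over duplicated (grain, time) rows.

    Hypothetical helper: returns `df` itself when there is nothing to aggregate.
    """
    keys = list(grain_cols) + [time_col]
    if not df.duplicated(subset=keys).any():
        return df  # no duplicated time stamps, the data set stays intact
    return df.groupby(keys, as_index=False)[target_col].sum()


raw = pd.DataFrame(
    {
        "store": ["A", "A", "A"],
        "date": pd.to_datetime(["2024-01-01", "2024-01-01", "2024-01-02"]),
        "sales": [1.0, 2.0, 5.0],
    }
)
# Two rows share the same store and date, so they are collapsed into one.
aggregated = aggregate_duplicates(raw, "date", ["store"], "sales")
# A second pass finds no duplicates and returns its input unchanged.
unchanged = aggregate_duplicates(aggregated, "date", ["store"], "sales")
```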
preprocess_pred_X_y
Preprocess prediction X and y.
preprocess_pred_X_y(X_pred: DataFrame | None = None, y_pred: ndarray | DataFrame | None = None, forecast_destination: Timestamp | None = None) -> Tuple[DataFrame, DataFrame | ndarray, Dict[str, Any]]
Parameters
Name | Description |
---|---|
X_pred
|
Default value: None
|
y_pred
|
Default value: None
|
forecast_destination
|
Default value: None
|
rolling_evaluation
Produce forecasts on a rolling origin over the given test set.
Each iteration makes a forecast for the next 'max_horizon' periods with respect to the current origin, then advances the origin by the horizon time duration. The prediction context for each forecast is set so that the forecaster uses the actual target values prior to the current origin time for constructing lag features.
This function returns a concatenated DataFrame of rolling forecasts joined with the actuals from the test set.
This method is deprecated and will be removed in a future release. Please use rolling_forecast() instead.
rolling_evaluation(X_pred: DataFrame, y_pred: DataFrame | ndarray, ignore_data_errors: bool = False) -> Tuple[ndarray, DataFrame]
Parameters
Name | Description |
---|---|
X_pred
Required
|
The prediction data frame combining X_past and X_future in a time-contiguous manner. Empty values in X_pred will be imputed. |
y_pred
Required
|
The target values corresponding to X_pred. |
ignore_data_errors
|
Ignore errors in user data. Default value: False
|
Returns
Type | Description |
---|---|
Y_pred, with the subframe corresponding to Y_future filled in with the respective forecasts. Any missing values in Y_past will be filled by the imputer. |
rolling_forecast
Produce forecasts on a rolling origin over a test set.
Each iteration makes a forecast of maximum horizon periods ahead using information up to the current origin, then advances the origin by 'step' time periods. The prediction context for each forecast is set so that the forecaster uses the actual target values prior to the current origin time for constructing lookback features.
This function returns a DataFrame of rolling forecasts joined with the actuals from the test set. The columns in the returned data frame are as follows:
- Timeseries ID columns (optional). When supplied by the user, the given column names will be used.
- Forecast origin column giving the origin time for each row. Column name: stored as the object member variable forecast_origin_column_name.
- Time column. The column name given by the user will be used.
- Forecast values column. Column name: stored as the object member forecast_column_name.
- Actual values column. Column name: stored as the object member actual_column_name.
rolling_forecast(X_pred: DataFrame, y_pred: ndarray, step: int = 1, ignore_data_errors: bool = False) -> DataFrame
Parameters
Name | Description |
---|---|
X_pred
Required
|
pandas.DataFrame
Prediction data frame. |
y_pred
Required
|
numpy.ndarray
Target values corresponding to rows in X_pred. |
step
|
Number of periods to advance the forecasting window in each iteration. Default value: 1
|
ignore_data_errors
|
Ignore errors in user data. Default value: False
|
Returns
Type | Description |
---|---|
pandas.DataFrame
|
Data frame of rolling forecasts |
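The rolling-origin procedure can be sketched in plain Python. This is an illustration of the general technique under stated assumptions, not the wrapper's implementation: the function rolling_origin_forecasts is hypothetical, time is represented by integer positions in the test set, the dictionary keys stand in for the real column names, and a naive "repeat the last known value" rule replaces the fitted model.

```python
def rolling_origin_forecasts(train, test, max_horizon, step=1):
    """Roll a naive last-value forecaster over `test` (hypothetical helper).

    Each row pairs a forecast with the actual value and records the origin
    from which the forecast was made.
    """
    rows = []
    origin = 0  # number of test observations already revealed to the forecaster
    while origin < len(test):
        # Context: all actual target values prior to the current origin.
        history = train + test[:origin]
        last_value = history[-1]
        # Forecast up to max_horizon periods ahead, stopping at the end of test.
        for h in range(max_horizon):
            t = origin + h
            if t >= len(test):
                break
            rows.append(
                {"origin": origin, "time": t, "forecast": last_value, "actual": test[t]}
            )
        # Advance the origin by `step` periods.
        origin += step
    return rows


train = [1.0, 2.0]
test = [3.0, 4.0, 5.0]
overlapping = rolling_origin_forecasts(train, test, max_horizon=2, step=1)
non_overlapping = rolling_origin_forecasts(train, test, max_horizon=2, step=2)
```

With step=1 the forecast windows overlap, so most test periods are forecast from more than one origin. Setting step equal to the horizon advances the origin by the full horizon, which matches the behavior described for the deprecated rolling_evaluation.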
short_grain_handling
Return true if short or absent grains handling is enabled for the model.
short_grain_handling() -> bool
static_preaggregate_data_set
Aggregate the prediction data set.
Note: This method does not guarantee that the data set will be aggregated. Aggregation happens only if the data set contains duplicated time stamps or out-of-grid dates.
static static_preaggregate_data_set(ts_transformer: TimeSeriesTransformer, time_column_name: str, grain_column_names: List[str], df: DataFrame, y: ndarray | None = None, is_training_set: bool = False) -> Tuple[DataFrame, ndarray | None]
Parameters
Name | Description |
---|---|
ts_transformer
Required
|
The timeseries transformer used for training. |
time_column_name
Required
|
Name of the time column. |
grain_column_names
Required
|
List of grain column names. |
df
Required
|
The data set to be aggregated. |
y
|
The target values. Default value: None
|
is_training_set
|
If true, the data represent the training set. Default value: False
|
Returns
Type | Description |
---|---|
The aggregated data set, or the intact data set if no aggregation is required. |
Attributes
actual_column_name
forecast_column_name
forecast_origin_column_name
grain_column_list
max_horizon
Return the maximum horizon used in the model.
origin_col_name
Return the origin column name.
target_lags
Return target lags if any.
target_rolling_window_size
Return the size of rolling window.
time_column_name
Return the name of the time column.
user_target_column_name
y_max_dict
Return the dictionary with maximal target values by time series ID.
y_min_dict
Return the dictionary with minimal target values by time series ID.
FATAL_NO_TARGET_IMPUTER
FATAL_NO_TARGET_IMPUTER = 'No target imputers were found in TimeSeriesTransformer.'
FATAL_NO_TS_TRANSFORM
FATAL_NO_TS_TRANSFORM = 'The time series transform is absent. Please try training model again.'