Azure ML / AutoML: problem with univariate time series forecasting

I'm having trouble generating univariate time series forecasts with Azure Automated Machine Learning (I know...).

What I'm doing

I have about 5 years' worth of monthly observations in a dataframe that looks like this:

date target_value
2015-02-01 123
2015-03-01 456
2015-04-01 789
... ...

I want to forecast target_value based on past values of target_value, i.e. univariate forecasting, as ARIMA does for instance.

So I am setting up the AutoML forecast like this:

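import logging

from azureml.core import Dataset
from azureml.automl.core.forecasting_parameters import ForecastingParameters
from azureml.train.automl import AutoMLConfig

# that's the dataframe as shown above
train_data = Dataset.Tabular.from_delimited_files(path=datastore.path(my_remote_filename))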

# ...other code...

forecasting_parameters = ForecastingParameters(
    time_column_name='date',
    forecast_horizon=2,
    target_lags='auto',
    freq='MS'
)

automl_config = AutoMLConfig(task='forecasting',
                             debug_log='automl_forecasting_function.log',
                             primary_metric='normalized_root_mean_squared_error',
                             enable_dnn=True,
                             experiment_timeout_hours=8.0,
                             enable_early_stopping=True,
                             training_data=train_data,
                             compute_target='my-cluster',
                             n_cross_validations=3,
                             verbosity=logging.INFO,
                             max_concurrent_iterations=4,
                             max_cores_per_iteration=-1,
                             label_column_name='target_value',
                             forecasting_parameters=forecasting_parameters)
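
Before submitting a configuration like the one above, it may be worth ruling out problems with the time column itself. The following is a minimal pandas sketch, not part of the original setup: it uses a made-up toy dataframe (only the first three values come from the question) and checks that date parses as a datetime, that its inferred frequency matches freq='MS', and that there are no missing or duplicated months.

```python
import pandas as pd

# Toy data in the same shape as the question's dataframe.
# Values after the first three rows are made up for illustration.
df = pd.DataFrame(
    {
        "date": ["2015-02-01", "2015-03-01", "2015-04-01", "2015-05-01", "2015-06-01"],
        "target_value": [123, 456, 789, 321, 654],
    }
)

# Parse the time column explicitly so it is a datetime rather than a string.
df["date"] = pd.to_datetime(df["date"])
idx = pd.DatetimeIndex(df["date"])

# Infer the frequency from the timestamps; "MS" means month start.
inferred = pd.infer_freq(idx)

# Compare against a complete month-start range to find gaps,
# and count repeated timestamps.
expected = pd.date_range(idx.min(), idx.max(), freq="MS")
missing = expected.difference(idx)
duplicates = int(idx.duplicated().sum())

print("inferred frequency:", inferred)
print("missing months:", len(missing))
print("duplicated timestamps:", duplicates)
```

If the inferred frequency is not MS, or months are missing, the freq setting and the data disagree, which is worth fixing regardless of whether it explains the behaviour described here.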

What the problem is

But AutoML does not seem to generate the forecast for target_value based on past values of target_value. It seems to use the date column as the independent variable! The feature importance chart also shows date as the input feature.

As a side note: running multivariate forecasts works fine.

When I use a dataset like this, feature_1 and feature_2 are used (i.e. as the X) to forecast target_value (i.e. the y):

date feature_1 feature_2 target_value
2015-02-01 10 7 123
2015-03-01 30 2 456
2015-04-01 20 5 789
... ... ... ...

My questions therefore

How do I need to set up a univariate AutoML forecast to forecast target_value based on past observations of target_value?

I assumed generating lagged values for target_value etc. is exactly what AutoML is supposed to do.
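
To make concrete what "lagged values for target_value" means, here is a minimal pandas sketch. It only illustrates the idea and is not how AutoML builds its features internally; the helper add_lag_features and the column names it produces are hypothetical, and the values after the first three rows are made up.

```python
import pandas as pd

# Toy monthly series in the same shape as the question's dataframe.
df = pd.DataFrame(
    {
        "date": pd.date_range("2015-02-01", periods=6, freq="MS"),
        "target_value": [123, 456, 789, 321, 654, 987],
    }
)


def add_lag_features(frame, target="target_value", lags=(1, 2)):
    """Return a sorted copy with one extra column per lag of the target.

    Hypothetical helper for illustration only.
    """
    out = frame.sort_values("date").reset_index(drop=True)
    for lag in lags:
        # shift(lag) moves values down by `lag` rows, so each row
        # sees the target from `lag` months earlier.
        out[f"{target}_lag{lag}"] = out[target].shift(lag)
    return out


lagged = add_lag_features(df)
print(lagged)
```

The first rows of each lag column are NaN because no earlier observation exists for them. With such columns in place, the past of target_value becomes the X and the current target_value the y, which is the univariate setup described above.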

Thanks!


Here is an example for forecasting. I think you should specify the models explicitly:

automl_config = AutoMLConfig(
    task="forecasting",
    primary_metric="normalized_root_mean_squared_error",
    blocked_models=["ExtremeRandomTrees", "AutoArima", "Prophet"],
    experiment_timeout_hours=0.3,
    training_data=train,
    label_column_name=target_column_name,
    compute_target=compute_target,
    enable_early_stopping=True,
    n_cross_validations=3,
    verbosity=logging.INFO,
    forecasting_parameters=forecasting_parameters,
)

I have encountered the same problem as you on Azure ML... that's why I have decided to use the SmartPredict platform.

The difference is that we are more flexible in terms of modules and custom modules, our modules have more parameters, and we take a use case approach. In addition, we also have Autoflow, which allows us to automatically generate a flowchart. And in terms of IT resources, we can also choose the size and type of resources.
