In this document the utsf package is described. This package offers a meta engine for applying different regression models for univariate time series forecasting using an autoregressive approach.
An univariate time series forecasting method is one in which the future values of a series are predicted using only information from the series. For example, the future values of a series can be forecast by its mean historical value. An advantage of this type of prediction is that, apart from the series being forecast, there is no need to collect any further information in order to train the model used to predict the future values of the series.
An autoregressive model is a kind of univariate time series forecasting model in which a value of a time series is expressed as a function of some of its past values. That is, an autoregressive model is a regression model in which the independent variables are lagged values (previous values) of the response variable. For example, given a time series with the following historical values: \(t = \{1, 3, 6, 7, 9, 11, 16\}\), suppose that we want to develop an autoregressive model in which a target “is explained” by its first, second and fourth past values (in this context, a previous value is also called a lag, so lag 1 is the value immediately preceding a given value in the series). Given this series and lags (1, 2 and 4), the training set would be:
Lag 4 | Lag 2 | Lag 1 | Target |
---|---|---|---|
1 | 6 | 7 | 9 |
3 | 7 | 9 | 11 |
6 | 9 | 11 | 16 |
In this model the next future value of the series is predicted as \(f(Lag4, Lag2, Lag1)\), where \(f\) is the regression function and \(Lag4\), \(Lag2\) and \(Lag1\) are the fourth, second and first lagged values of the next future value. So, the next future value of series \(t\) is predicted as \(f(7, 11, 16)\), producing a value that will be called \(F1\).
Suppose that the forecast horizon (the number of future values to be forecast into the future) is greater than 1. In the case that the regression function only predicts the next future value of the series, a recursive approach can be applied to forecast all the future values of the forecast horizon. Using a recursive approach, the regression function is applied recursively until all horizons are forecast. For instance, following the previous example, suppose that the forecast horizon is 3. As we have explained, to forecast the next future value of the series (horizon 1) the regression function is fed with the vector \([7, 11, 16]\), producing \(F1\). To forecast horizon 2 the regression function is fed with the vector \([9, 16, F1]\). The forecast for horizon 1, \(F1\), is used as the first lag for horizon 2 because the actual value is unknown. Finally, to predict horizon 3 the regression function is fed with the vector [\(11, F1, F2]\). This example of recursive forecast is summarized in the following table:
Horizon | Autoregressive values | Forecast |
---|---|---|
1 | 7, 11, 16 | F1 |
2 | 9, 16, F1 | F2 |
3 | 11, F1, F2 | F3 |
The recursive approach for forecasting several values into the future is applied in classical statistical models such as ARIMA or exponential smoothing.
The utsf package makes it easy the use of classical
regression models for univariate time series forecasting employing the
autoregressive approach and the recursive prediction strategy explained
in the previous section. All the supported models are applied using an
uniform interface: the forecast()
function. Let us see an
example in which a regression tree model is used to forecast the next
future values of a time series:
In this example, an autoregressive tree model
(method = "rt"
) is trained using the historical values of
the AirPassengers
time series and a forecast for its 12
next future values (h = 12
) is done. The
forecast()
function returns an S3 object of class
utsf
with information about the trained model and the
forecast. The information about the forecast is included in the
component pred
as an object of class ts
(a
time series):
f$pred
#> Jan Feb Mar Apr May Jun Jul Aug
#> 1961 460.2980 428.4915 467.0304 496.9833 499.9819 554.7891 627.5849 628.0503
#> Sep Oct Nov Dec
#> 1961 533.2803 482.4221 448.7926 453.6920
The training set used to fit the model is built from the historical
values of the time series using the autoregressive approach explained in
the previous section. The lags
parameter of the
forecast()
function is used to specify the autoregressive
lags. In the example: lags = 1:12
, so a target is a
function of its 12 previous values. Next, we consult the first targets
(and their associated features) with which the regression model has been
trained:
head(f$targets) # first targets
#> [1] -11.6666667 -0.9166667 13.4166667 6.6666667 -3.8333333 19.8333333
head(f$features) # and its associated features
#> Lag12 Lag11 Lag10 Lag9 Lag8 Lag7 Lag6
#> 1 -14.6666667 -8.666667 5.333333 2.333333 -5.666667 8.333333 21.333333
#> 2 -8.9166667 5.083333 2.083333 -5.916667 8.083333 21.083333 21.083333
#> 3 4.4166667 1.416667 -6.583333 7.416667 20.416667 20.416667 8.416667
#> 4 0.6666667 -7.333333 6.666667 19.666667 19.666667 7.666667 -9.333333
#> 5 -7.8333333 6.166667 19.166667 19.166667 7.166667 -9.833333 -24.833333
#> 6 5.8333333 18.833333 18.833333 6.833333 -10.166667 -25.166667 -11.166667
#> Lag5 Lag4 Lag3 Lag2 Lag1
#> 1 21.333333 9.333333 -7.666667 -22.666667 -8.666667
#> 2 9.083333 -7.916667 -22.916667 -8.916667 -11.916667
#> 3 -8.583333 -23.583333 -9.583333 -12.583333 -1.583333
#> 4 -24.333333 -10.333333 -13.333333 -2.333333 12.666667
#> 5 -10.833333 -13.833333 -2.833333 12.166667 6.166667
#> 6 -14.166667 -3.166667 11.833333 5.833333 -4.166667
Using the example of the previous section:
The forecast()
function provides a common interface to
applying an autoregressive approach for time series forecasting using
different regression models. These models are implemented in several R
packages. Currently, the forecast()
function is mainly
focused on regression tree models, supporting the following
approaches:
FNN::knn.reg()
is used, as regression function, to
recursively predict the future values of the time series.rpart::rpart()
and its associated method
rpart::predict.rpart()
is applied recursively for the
forecasts, i.e., as regression function.Cubist::cubist()
and its associated method
Cubist::predict.cubist()
is used for predictions.ipred::bagging()
and its associated method
ipred::predict.regbagg()
is used for forecasting.ranger::ranger()
and its associated method
ranger::predict.ranger()
is used for predictions.The S3 object of class utsf
returned by the
forecast()
function contains a component with the trained
autoregressive model:
f <- forecast(fdeaths, h = 12, lags = 1:12, method = "rt")
f$model
#> n= 60
#>
#> node), split, n, deviance, yval
#> * denotes terminal node
#>
#> 1) root 60 1967124.00 -6.465278
#> 2) Lag12< 73 38 212851.90 -124.414500
#> 4) Lag6>=-66.45833 30 57355.07 -153.847200
#> 8) Lag12< -170.7083 10 13293.09 -195.991700 *
#> 9) Lag12>=-170.7083 20 17419.67 -132.775000 *
#> 5) Lag6< -66.45833 8 32051.01 -14.041670 *
#> 3) Lag12>=73 22 312482.00 197.265200
#> 6) Lag5>=-131.7917 7 24738.12 114.500000 *
#> 7) Lag5< -131.7917 15 217416.40 235.888900 *
In this case, the model is the result of training a regression tree
using the function rpart::rpart()
with the training set
consisting of the features f$features
and targets
f$targets
. Once the model is trained, the
rpart::predict.rpart()
function is used recursively to
forecast the future values of the time series.
One interesting feature of the utsf package is that
you can use the forecast()
function to apply your own
regression models for time series forecasting in an autoregressive way.
Thus, your regression models can benefit from the features implemented
in the forecast()
function, such as preprocessing,
parameter tuning, the building of the training set, the implementation
of recursive forecasts or the estimation of the forecast accuracy of the
model.
To apply your own regression model with the forecast()
function you have to use the method
parameter, providing a
function that is able to train your model. This function should return
an object with the trained regression model. Also, it must have at least
two input parameters:
X
: it is a data frame with the features of the training
examples. This data frame is built from the time series taking into
account the autoregressive lags as explained in a previous section. This
is the same object as the features
component of the object
returned by the forecast()
function.y
: a vector with the targets of the training examples.
It is built as explained in a previous section. It is the same object as
the targets
component of the object returned by the
forecast()
function.Furthermore, if the function that trains the model (the function
provided in the method
parameter) returns a model of class
model_class
, a method with the signature
predict.model_class(object, new_value)
should be
implemented. This method uses your model to predict a new value, that
is, it is the regression function associated with the model.
Let us see an example in which the forecast()
function
is used to forecast a time series using the k-nearest neighbors
regression model implemented in the package FNN:
# Function to train the regression model
my_knn_model <- function(X, y, k = 3) {
structure(list(X = X, y = y, k = k), class = "my_knn")
}
# Function to predict a new example
predict.my_knn <- function(object, new_value) {
FNN::knn.reg(train = object$X, test = new_value,
y = object$y, k = object$k)$pred
}
f <- forecast(AirPassengers, h = 12, lags = 1:12, method = my_knn_model)
print(f$pred)
#> Jan Feb Mar Apr May Jun Jul Aug
#> 1961 455.9167 434.3264 480.7703 490.1678 506.1262 568.0534 640.6689 640.8636
#> Sep Oct Nov Dec
#> 1961 549.5467 495.4255 441.6554 476.7934
The new_value
parameter of the predict
method receives a data frame with the same structure as the
X
parameter of the function for building the model. The
new_value
data frame only has one row, with the features of
the example to be predicted.
The k-nearest neighbors algorithm is so simple that it can be easily implemented without using functionality of any R package:
# Function to train the regression model
my_knn_model2 <- function(X, y, k = 3) {
structure(list(X = X, y = y, k = k), class = "my_knn2")
}
# Function to predict a new example
predict.my_knn2 <- function(object, new_value) {
distances <- sapply(1:nrow(object$X), function(i) sum((object$X[i, ] - new_value)^2))
k_nearest <- order(distances)[1:object$k]
mean(object$y[k_nearest])
}
f2 <- forecast(AirPassengers, h = 12, lags = 1:12, method = my_knn_model2)
print(f2$pred)
#> Jan Feb Mar Apr May Jun Jul Aug
#> 1961 455.9167 434.3264 480.7703 490.1678 506.1262 568.0534 640.6689 640.8636
#> Sep Oct Nov Dec
#> 1961 549.5467 495.4255 441.6554 476.7934
Finally, we are going to forecast an artificial time series with a trend by means of a simple linear regression model.
set.seed(7)
t <- 1:15 + rnorm(15, sd = 0.5) # time series
my_lm <- function(X, y) lm(y ~ ., data = data.frame(cbind(X, y = y)))
f <- forecast(t, h = 5, lags = 1, method = my_lm, preProcess = NULL)
library(ggplot2)
autoplot(f)
In this case, we rely on the predict.lm
method to
predict new values. Let us see, the model:
f$model
#>
#> Call:
#> lm(formula = y ~ ., data = data.frame(cbind(X, y = y)))
#>
#> Coefficients:
#> (Intercept) Lag1
#> 0.7914 1.0251
The forecast of a future value is computed as \(0.7914 + 1.0251Lag1\), where
Lag1
is the previous value to the future value being
forecast.
Normally, a regression model can be adjusted using different
parameters. By default, the models supported by the
forecast()
function are set using some specific parameters,
usually the default values of the functions used to train the models
(these functions are listed in a previous section). However, the user
can set the parameters used to train the regression models with the
param
argument of the forecast()
function. The
param
argument must be a list with the names and values of
the parameters to be set. Let us see an example:
# A bagging model set with default parameters
f <- forecast(AirPassengers, h = 12, lags = 1:12, method = "bagging")
length(f$model$mtrees) # number of regression trees (25 by default)
#> [1] 25
# A bagging model set with 3 regression tress
f <- forecast(AirPassengers, h = 12,
lags = 1:12,
method = "bagging",
param = list(nbagg = 3)
)
length(f$model$mtrees) # number of regression trees
#> [1] 3
In the previous example, two bagging models (using regression trees)
are trained with the forecast()
function. In the first
model the number of trees is 25, the default value of the function
ipred::ipredbagg()
used to train the model. In the second
model the number of trees is set to 3. Of course, in order to set some
specific parameters the user must consult the arguments of the function
used internally by the forecast()
function to train the
model. In the example, ipred::ipredbagg()
.
In the following example the user sets the parameters of a regression model implemented by himself/herself:
# Function to train the model
my_knn_model <- function(X, y, k = 3) {
structure(list(X = X, y = y, k = k), class = "my_knn")
}
# Regression function for object of class my_knn
predict.my_knn <- function(object, new_value) {
FNN::knn.reg(train = object$X, test = new_value,
y = object$y, k = object$k)$pred
}
# The model is trained with default parameters (k = 3)
f <- forecast(AirPassengers, h = 12, lags = 1:12, method = my_knn_model)
print(f$model$k)
#> [1] 3
This section explains how to estimate the forecast accuracy of a regression model predicting a time series with the utsf package. Let us see an example:
f <- forecast(UKgas, h = 4, lags = 1:4, method = "knn", efa = "fixed")
f$efa
#> MAE MAPE sMAPE RMSE
#> 49.304102 7.226569 7.499436 53.862046
To assess the forecast accuracy of a regression model you can use the
forecast()
function to specify the regression task, using
the efa
parameter to choose how the forecast accuracy is
estimated. In this case a k- nearest neighbors model is used
(model = "knn"
) with the autoregressive lags 1 to 4 and an
estimation of its forecast accuracy on the UKgas
time
series for a forecast horizon of 4 (h = 4
) is obtained
using a fixed origin strategy. The result of this estimation can be
found in the efa
component of the object returned by the
forecast()
function. It is a vector with estimates of
forecast errors according to different forecast accuracy measures. For
instance, in the example, the estimated mean absolute error (MAE) for
horizon 4 is 49.3 approximately. Currently, the following forecasting
accuracy measures are computed:
Next, we describe how the forecasting accuracy measures are computed for a forecasting horizon \(h\) (\(y_t\) and \(\hat{y}_t\) are the actual future value and its forecast for horizon \(t\) respectively):
\[ MAE = \frac{1}{h}\sum_{t=1}^{h} |y_t-\hat{y}_t| \]
\[ MAPE = \frac{1}{h}\sum_{t=1}^{h} 100\frac{|y_t-\hat{y}_t|}{y_t} \] \[ sMAPE = \frac{1}{h}\sum_{t=1}^{h} 200\frac{\left|y_{t}-\hat{y}_{t}\right|}{|y_t|+|\hat{y}_t|} \]
\[ RMSE = \sqrt{\frac{1}{h}\sum_{t=1}^{h} (y_t-\hat{y}_t)^2} \]
The efa
parameter can be:
"fixed"
: In that case a fixed origin evaluation is
followed to estimate the forecast error. The test set is formed by the
last h
observations of the time series and the training set
is formed by its previous values."rolling"
: The rolling origin strategy is applied.
First, rolling origin uses a test set formed by the last h
historical values and a training set consisting of the previous
historical values. This procedure is repeated \(h -1\) times, moving the origin of the test
set one value ahead each time. This way, h
one step ahead
predictions can be assessed, \(h - 1\)
two steps ahead predictions, and so on. This is a great contrast with
the fixed origin evaluation that, for a validation set of size
h
, is only able to assess one prediction for each one of
the h
horizons. However, rolling origin evaluation is more
computationally intensive than fixed origin evaluation.Another useful feature of the utsf package is
parameter tuning. The forecast()
function allows to
estimate the forecast accuracy of a model using different combinations
of parameters. Furthermore, the best combination of parameters is used
to train the model with all the historical values of the series and
forecast the future values of the series. Let us see an example:
f <- forecast(UKgas, h = 4, lags = 1:4, method = "knn",
tuneGrid = expand.grid(k = 1:7), efa = "fixed")
f$tuneGrid
#> k MAE MAPE sMAPE RMSE
#> 1 1 26.39307 3.781212 3.725138 34.51027
#> 2 2 49.04570 7.209276 7.447296 52.11319
#> 3 3 49.30410 7.226569 7.499436 53.86205
#> 4 4 46.82356 6.604359 6.825456 53.25469
#> 5 5 53.05555 7.267617 7.538089 62.05870
#> 6 6 55.16359 7.104669 7.347932 66.33756
#> 7 7 48.01931 5.719260 5.906827 62.99989
In this example, the tuneGrid
parameter is used to
specify (using a data frame) the set of parameters to assess. The
forecast accuracy of the model using the different combinations of
parameters is estimated as explained in the previous section using the
last observations of the time series as validation set. The
efa
parameter is used to specify whether fixed or rolling
origin evaluation is applied. The tuneGrid
component of the
object returned by the forecast()
function contains the
result of the estimation. In this case, the k-nearest neighbors method
using \(k=1\) obtains the best results
for all the forecast accuracy measures. The best combination of
parameters according to RMSE is used to forecast the time series:
Let us plot the values of \(k\) against their estimated accuracy using RMSE:
Sometimes, the forecast accuracy of a model can be improved by
applying some transformations and/or preprocessings to the time series
being forecast. Currently, the utsf package is focused
on preprocessings related to forecasting time series with a trend. This
is due to the fact that, at present, the models incorporated in the
forecast()
function (such as regression trees and k-nearest
neighbors) predict averages of the targets in the training set (i.e., an
average of historical values of the series). In a trended series these
averages are probably outside the range of the future values of the
series. Let us see an example in which a trended series (the yearly
revenue passenger miles flown by commercial airlines in the United
States) is forecast using a random forest model:
In this case, no preprocessing is applied
(preProcess = NULL
). It can be observed how the forecast
does not capture the trending behavior of the series because regression
tree models predict averaged values of the training targets.
In the next subsections we explain how to deal with trended series using three different transformations.
A common way of dealing with trended series is to transform the original series taking first differences (computing the differences between consecutive observations). Then, the model is trained with the differenced series and the forecast are back-tranformed. Let us see an example:
f <- forecast(airmiles, h = 4, lags = 1:4, method = "rf", preProcess = list(differences(1)))
autoplot(f)
In this case, the preProcess
parameter has been used to
specify first-order differences. The preProcess
parameter
consists of a list of preprocessings or transformations. Currently, only
one preprocessing can be specified. The differences are specified using
the differences()
function. This function takes one
parameter with the order of first differences (normally 1). It is also
possible to specify the value -1, in which case the order of differences
is estimated using the ndiffs()
function from the
forecast package (in that case the chosen order of
first differences could be 0, that is, no differences are applied).
This transformation has been used to deal with trending series in other packages, such as tsknn and tsfgrnn. The additive transformation works transforming the training examples as follows:
It is easy to check how the training examples are modified by the additive transformation using the API of the package. For example, let us see the training examples of a model with no transformation:
timeS <- ts(c(1, 3, 7, 9, 10, 12))
f <- forecast(timeS, h = 1, lags = 1:2, preProcess = NULL)
cbind(f$features, Targets = f$targets)
#> Lag2 Lag1 Targets
#> 1 1 3 7
#> 2 3 7 9
#> 3 7 9 10
#> 4 9 10 12
Now, let us see the effect of the additive transformation in the training examples:
timeS <- ts(c(1, 3, 7, 9, 10, 12))
f <- forecast(timeS, h = 1, lags = 1:2, preProcess = list("additive"))
cbind(f$features, Targets = f$targets)
#> Lag2 Lag1 Targets
#> 1 -1.0 1.0 5.0
#> 2 -2.0 2.0 4.0
#> 3 -1.0 1.0 2.0
#> 4 -0.5 0.5 2.5
Finally, we forecast the airmiles series using the additive tranformation and a random forest model:
The multiplicative transformation is similar to the additive transformation, but it is intended for series with an exponential trend (the additive transformation is suited for series with a linear trend). In the multiplicative transformation a training example is transformed this way:
Let us see an example of an artificial time series with a multiplicative trend and its forecast using the additive and the multiplicative transformation:
t <- ts(10 * 1.05^(1:20))
f_m <- forecast(t, h = 4, lags = 1:3, method = "rf", preProcess = list("multiplicative"))
f_a <- forecast(t, h = 4, lags = 1:3, method = "rf", preProcess = list("additive"))
library(vctsfr)
plot_predictions(t, predictions = list(Multiplicative = f_m$pred, Additive = f_a$pred))
In this case, the forecast with the multiplicative transformation captures perfectly the exponential trend.
The forecast()
function only has two compulsory
parameters: the time series being forecast and the forecast horizon.
Next, we discuss the default values for the rest of its parameters:
lags
: The lags
parameter is an integer
vector (in increasing order) with the autoregressive lags. If
frequency(ts) == f
where ts
is the time series
being forecast and \(f > 1\) then
the lags used as autoregressive features are 1:f. For example,
the lags for quarterly data are 1:4 and for monthly data 1:12. This is
useful to capture a possible seasonal behavior of the series. If
frequency(ts) == 1
, then:
stats::pacf()
) are selected.method
: By default, the k-nearest neighbors algorithm
is applied.param
: By default, the model is trained using some
sensible parameters, normally the default values of the function used to
train the model.preProcess
: The additive transformation is applied.
This transformation have proved its effectiveness to forecast trending
series. In general, its application seems beneficial on any kind of
series.efa
: No estimation of forecast accuracy is done.tuneGrid
: No estimation of forecast accuracy is
done.