This vignette illustrates the usage of the package portfolioBacktest for automated portfolio backtesting over multiple datasets on a rolling-window basis. It can be used by a researcher/practitioner to backtest a set of different portfolios, as well as by a course instructor to assess students on their portfolio designs in a fully automated and convenient manner. The results can be nicely formatted in tables and plots.
Backtesting is a dangerous task fraught with many potential pitfalls (Luo et al. 2014). By performing a large number of randomized backtests, instead of visually inspecting a single backtest, one can obtain more realistic results.
This package backtests a list of portfolios over multiple datasets on a rolling-window basis (aka walk forward), producing final results as in the following.
You can backtest your own portfolio in just a few steps:
library(portfolioBacktest)
data("dataset10")
my_portfolio <- function(dataset, ...) {
  prices <- dataset$adjusted
  N <- ncol(prices)
  return(rep(1/N, N))
}
bt <- portfolioBacktest(my_portfolio, dataset10)
#> Backtesting 1 portfolios over 10 datasets (periodicity = daily data)...
backtestSummary(bt)$performance
#> fun1
#> Sharpe ratio 1.476203e+00
#> max drawdown 8.937890e-02
#> annual return 1.594528e-01
#> annual volatility 1.218623e-01
#> Sortino ratio 2.057677e+00
#> downside deviation 8.351402e-02
#> Sterling ratio 2.122653e+00
#> Omega ratio 1.295090e+00
#> VaR (0.95) 1.101934e-02
#> CVaR (0.95) 1.789425e-02
#> rebalancing period 1.000000e+00
#> turnover 8.641594e-03
#> ROT (bps) 7.334458e+02
#> cpu time 1.615385e-03
#> failure rate 0.000000e+00
The package can be installed from CRAN or GitHub:
# install stable version from CRAN
install.packages("portfolioBacktest")
# install development version from GitHub
devtools::install_github("dppalomar/portfolioBacktest")
# Getting help
library(portfolioBacktest)
help(package = "portfolioBacktest")
?portfolioBacktest
The main function portfolioBacktest() requires the argument dataset_list to follow a certain format: it should be a list of several individual datasets, each of them being a list of several xts objects following exactly the same date index. One of those xts objects must contain the historical prices of the stocks, but we can have additional xts objects containing other information such as the volume of the stocks or index prices. The package contains a small dataset sample for illustration purposes:
data("dataset10") # load the embedded dataset
class(dataset10) # show dataset class
#> [1] "list"
names(dataset10[1:3]) # show names of a few datasets
#> [1] "dataset 1" "dataset 2" "dataset 3"
names(dataset10$`dataset 1`) # structure of one dataset
#> [1] "adjusted" "index"
head(dataset10$`dataset 1`$adjusted[, 1:3])
#> MAS.Adjusted MGM.Adjusted CMI.Adjusted
#> 2015-04-24 22.05079 21.34297 121.8492
#> 2015-04-27 22.13499 21.26537 124.3041
#> 2015-04-28 22.68226 21.69223 122.5187
#> 2015-04-29 22.54755 20.47956 123.2150
#> 2015-04-30 22.30337 20.51836 123.4203
#> 2015-05-01 22.85065 20.76090 125.9823
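To make this format concrete, here is a minimal sketch that assembles a dataset list of the same structure from synthetic data (the tickers and price values are purely hypothetical):

library(xts)

# synthetic daily prices for 3 hypothetical stocks and an index (same date index)
dates <- seq(as.Date("2020-01-01"), by = "day", length.out = 100)
my_prices <- xts(100 * exp(apply(matrix(rnorm(100*3, 0, 0.01), 100, 3), 2, cumsum)),
                 order.by = dates)
colnames(my_prices) <- c("AAA", "BBB", "CCC")
my_index <- xts(100 * exp(cumsum(rnorm(100, 0, 0.005))), order.by = dates)

# each dataset is a list of xts objects; "adjusted" holds the stock prices and
# "index" is only needed if the index benchmark will be used
my_dataset_list <- list("dataset 1" = list("adjusted" = my_prices,
                                           "index"    = my_index))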
Note that each dataset contains an xts object called "adjusted" (adjusted prices). By default, portfolioBacktest() will use such adjusted prices to calculate the portfolio return, but one can change this setting with the argument price_name of the function portfolioBacktest().
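For instance, if the price series were stored in each dataset under a different name, say "close", one could backtest on those prices instead (a sketch; the element "close" is an assumption about the dataset at hand):

# hypothetical: prices stored under the element "close" instead of "adjusted"
bt <- portfolioBacktest(my_portfolio, my_dataset_list, price_name = "close")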
We emphasize that 10 datasets are not enough for properly backtesting portfolios. The package provides the function stockDataDownload() to download online data resources in the required format. The function financialDataResample() can then resample the downloaded data into multiple datasets (each resample is obtained by randomly choosing a subset of the stock names and randomly choosing a time period over the available long period), which can be directly passed to portfolioBacktest(). We recommend using these two functions to generate multiple datasets for serious backtesting:
data(SP500_symbols) # load the SP500 symbols
# download data from internet
SP500 <- stockDataDownload(stock_symbols = SP500_symbols,
                           from = "2008-12-01", to = "2018-12-01")

# resample 10 times from SP500, each with 50 stocks and 2-year consecutive data
my_dataset_list <- financialDataResample(SP500,
                                         N_sample = 50, T_sample = 252*2,
                                         num_datasets = 10)
Each individual dataset will contain 7 xts objects with names: open, high, low, close, volume, adjusted, and index. Since the function stockDataDownload() may take a long time to download the data from the Internet, it will automatically save the data into a local file for subsequent fast retrieval (whenever the function is called with the same arguments). It is the responsibility of the user to download a proper universe of stocks to avoid survivorship bias.
Additional data can be helpful in designing portfolios. One can add as many other xts
objects in each dataset as desired. For example, if the Moving Average Convergence Divergence (MACD) information is needed by the portfolio functions, one can manually add it to the dataset as follows:
for (i in 1:length(dataset10))
  dataset10[[i]]$MACD <- apply(dataset10[[i]]$adjusted, 2,
                               function(x) { TTR::MACD(x)[, "macd"] })
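A quick sanity check confirms that the new element now appears in each dataset:

names(dataset10$`dataset 1`)  # should now list "adjusted", "index", and "MACD"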
A portfolio has to be defined in the form of a function that takes as input: i) a dataset, i.e., a list of xts objects (following the format of the elements of the argument dataset_list), and ii) the currently held portfolio w_current (if this argument is not used, then alternatively one can use the ellipsis ... in the function definition). The portfolio function has to return the portfolio as a numerical vector of normalized weights of the same length as the number of stocks.
Below we give examples of the quintile portfolio, the global minimum variance portfolio (GMVP), and the Markowitz mean-variance portfolio (under the practical constraints \(\mathbf{w} \ge \mathbf{0}\) and \(\mathbf{1}^{T} \mathbf{w} = 1\)):
# define quintile portfolio
quintile_portfolio_fun <- function(dataset, w_current) {
  X <- diff(log(dataset$adjusted))[-1]  # compute log returns
  N <- ncol(X)
  # design quintile portfolio
  ranking <- sort(colMeans(X), decreasing = TRUE, index.return = TRUE)$ix
  w <- rep(0, N)
  w[ranking[1:round(N/5)]] <- 1/round(N/5)
  return(w)
}
# define GMVP (with heuristic not to allow shorting)
GMVP_portfolio_fun <- function(dataset, ...) {
  X <- diff(log(dataset$adjusted))[-1]  # compute log returns
  Sigma <- cov(X)  # compute SCM
  # design GMVP
  w <- solve(Sigma, rep(1, nrow(Sigma)))
  w <- abs(w)/sum(abs(w))
  return(w)
}
# define Markowitz mean-variance portfolio
library(CVXR)
Markowitz_portfolio_fun <- function(dataset, ...) {
  X <- diff(log(dataset$adjusted))[-1]  # compute log returns
  mu <- colMeans(X)  # compute mean vector
  Sigma <- cov(X)  # compute the SCM
  # design mean-variance portfolio
  w <- Variable(nrow(Sigma))
  prob <- Problem(Maximize(t(mu) %*% w - 0.5*quad_form(w, Sigma)),
                  constraints = list(w >= 0, sum(w) == 1))
  result <- solve(prob)
  return(as.vector(result$getValue(w)))
}
The argument w_current can be used to control the transaction cost:
Markowitz_portfolio_tc_fun <- function(dataset, w_current) {
  tau <- 0.01
  X <- diff(log(dataset$adjusted))[-1]  # compute log returns
  mu <- colMeans(X)  # compute mean vector
  Sigma <- cov(X)  # compute the SCM
  # design mean-variance portfolio
  w <- Variable(nrow(Sigma))
  prob <- Problem(Maximize(t(mu) %*% w - 0.5*quad_form(w, Sigma) -
                             tau*sum(abs(w - w_current))),
                  constraints = list(w >= 0, sum(w) == 1))
  result <- solve(prob)
  return(as.vector(result$getValue(w)))
}
With the datasets and portfolios ready, we can now do the backtest easily. For example, to obtain the three portfolios’ performance over the datasets, we just need to combine them in a list and run the backtest in one line:
<- list("Quintile" = quintile_portfolio_fun,
portfolios "GMVP" = GMVP_portfolio_fun,
"Markowitz" = Markowitz_portfolio_fun)
<- portfolioBacktest(portfolios, dataset10, benchmark = c("1/N", "index"))
bt #> Backtesting 3 portfolios over 10 datasets (periodicity = daily data)...
#> Backtesting benchmarks...
Here bt is a list storing all the backtest results according to the passed list of functions (plus the two benchmarks):
names(bt)
#> [1] "Quintile" "GMVP" "Markowitz" "1/N" "index"
Each element of bt is also a list storing more information for each of the datasets:
#> levelName
#> 1 bt
#> 2 ¦--Quintile
#> 3 ¦ ¦--dataset 1
#> 4 ¦ ¦ ¦--performance
#> 5 ¦ ¦ ¦--cpu_time
#> 6 ¦ ¦ ¦--error
#> 7 ¦ ¦ ¦--error_message
#> 8 ¦ ¦ ¦--w_optimized
#> 9 ¦ ¦ ¦--w_rebalanced
#> 10 ¦ ¦ ¦--w_bop
#> 11 ¦ ¦ ¦--return
#> 12 ¦ ¦ ¦--wealth
#> 13 ¦ ¦ °--X_lin
#> 14 ¦ ¦--dataset 2
#> 15 ¦ ¦ ¦--performance
#> 16 ¦ ¦ ¦--cpu_time
#> 17 ¦ ¦ ¦--error
#> 18 ¦ ¦ ¦--error_message
#> 19 ¦ ¦ ¦--w_optimized
#> 20 ¦ ¦ °--... 5 nodes w/ 0 sub
#> 21 ¦ °--... 8 nodes w/ 85 sub
#> 22 °--... 4 nodes w/ 533 sub
One can extract any desired backtest information directly from the returned variable bt.
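For instance, following the structure shown above, the performance measures of the GMVP portfolio on the first dataset can be accessed as:

bt$GMVP$`dataset 1`$performance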
The package also contains several convenient functions to extract information from the backtest results.
# select Sharpe ratio and max drawdown performance of the Quintile portfolio
backtestSelector(bt, portfolio_name = "Quintile",
measures = c("Sharpe ratio", "max drawdown"))
#> $performance
#> Sharpe ratio max drawdown
#> dataset 1 1.06419447 0.09384104
#> dataset 2 1.22091716 0.10406013
#> dataset 3 2.24635921 0.06952085
#> dataset 4 1.44699083 0.09921398
#> dataset 5 0.08849001 0.17328255
#> dataset 6 0.99108926 0.10320105
#> dataset 7 1.64175055 0.08836202
#> dataset 8 -0.10916655 0.27460141
#> dataset 9 1.62468886 0.11288730
#> dataset 10 1.37221717 0.09834212
# show the portfolios performance in tables
backtestTable(bt, measures = c("Sharpe ratio", "max drawdown"))
#> $`Sharpe ratio`
#> Quintile GMVP Markowitz 1/N index
#> dataset 1 1.06419447 1.32027278 0.001373386 1.4089909 1.33623612
#> dataset 2 1.22091716 0.16541826 1.044875366 0.4355269 0.22256998
#> dataset 3 2.24635921 1.87705877 1.149650946 2.2566129 1.79107233
#> dataset 4 1.44699083 1.12233673 0.331281647 1.2145246 0.95372278
#> dataset 5 0.08849001 0.05190026 0.061763031 0.3137457 0.20553014
#> dataset 6 0.99108926 2.08192072 0.767195436 1.7823589 2.49533696
#> dataset 7 1.64175055 2.69917968 -0.251919230 2.3238009 1.58760559
#> dataset 8 -0.10916655 0.16661653 1.075091650 0.1975010 0.03506698
#> dataset 9 1.62468886 1.27456766 0.779205761 1.5434145 1.37981616
#> dataset 10 1.37221717 1.96674860 0.483356334 1.8760481 1.72522587
#>
#> $`max drawdown`
#> Quintile GMVP Markowitz 1/N index
#> dataset 1 0.09384104 0.05733409 0.20824654 0.06678369 0.05595722
#> dataset 2 0.10406013 0.13027178 0.23314406 0.13218930 0.12352525
#> dataset 3 0.06952085 0.04947330 0.16813539 0.04800769 0.05761261
#> dataset 4 0.09921398 0.10466982 0.15162953 0.10861575 0.10159531
#> dataset 5 0.17328255 0.08719220 0.64819903 0.11546985 0.10159531
#> dataset 6 0.10320105 0.02596655 0.33947368 0.03869864 0.02796792
#> dataset 7 0.08836202 0.05441919 0.27695995 0.05440115 0.07671058
#> dataset 8 0.27460141 0.16788147 0.28835512 0.19664607 0.17904681
#> dataset 9 0.11288730 0.10417538 0.24967353 0.10097014 0.10159531
#> dataset 10 0.09834212 0.05701569 0.09785333 0.07778765 0.05595722
res_sum <- backtestSummary(bt)
names(res_sum)
#> [1] "performance_summary" "error_message"

res_sum$performance_summary
#> Quintile GMVP Markowitz 1/N index
#> Sharpe ratio 1.29656717 1.297420e+00 0.62527589 1.476203e+00 1.35802614
#> max drawdown 0.10120752 7.226314e-02 0.24140880 8.937890e-02 0.08915294
#> annual return 0.19595757 1.441586e-01 0.20402853 1.594528e-01 0.14822709
#> annual volatility 0.16224595 1.107614e-01 0.31015587 1.218623e-01 0.12422862
#> Sortino ratio 1.90905709 1.815420e+00 0.85264675 2.057677e+00 1.90434670
#> downside deviation 0.11132931 7.959349e-02 0.21019793 8.351402e-02 0.08843501
#> Sterling ratio 1.92937203 1.954336e+00 1.20389040 2.122653e+00 2.02000933
#> Omega ratio 1.24641783 1.257818e+00 1.12755743 1.295090e+00 1.28610811
#> VaR (0.95) 0.01608051 9.641547e-03 0.02896669 1.101934e-02 0.01228986
#> CVaR (0.95) 0.02387980 1.669202e-02 0.04233197 1.789425e-02 0.01937577
#> rebalancing period 1.00000000 1.000000e+00 1.00000000 1.000000e+00 252.00000000
#> turnover 0.02944171 2.131205e-02 0.03293795 8.641594e-03 0.00000000
#> ROT (bps) 254.70011596 2.304439e+02 195.65068681 7.334458e+02 NA
#> cpu time 0.00200000 2.346154e-03 0.22757692 1.538462e-03 0.00100000
#> failure rate 0.00000000 0.000000e+00 0.00000000 0.000000e+00 0.00000000
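By default, backtestSummary() summarizes each measure across the datasets with the median, which is robust to outliers. Assuming the summary_fun argument (see ?backtestSummary for the exact interface), one can choose a different summary statistic, e.g., the mean:

res_sum_mean <- backtestSummary(bt, summary_fun = mean)
res_sum_mean$performance_summary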
For more flexible usage, one can refer to the help pages of these functions. In addition, the package provides several functions to display the results in tables and figures.
summaryTable(res_sum, type = "DT", order_col = "Sharpe ratio", order_dir = "desc")
The same information can also be shown visually as a bar plot (basically the content of summaryTable() in a visual way):
summaryBarPlot(res_sum, measures = c("Sharpe ratio", "max drawdown"))
backtestBoxPlot(bt, measure = "Sharpe ratio")
backtestChartCumReturn(bt, c("Quintile", "GMVP", "index"))
backtestChartDrawdown(bt, c("Quintile", "GMVP", "index"))
# for better illustration, let's use only the first 5 stocks
dataset10_5stocks <- lapply(dataset10,
                            function(x) {x$adjusted <- x$adjusted[, 1:5]; return(x)})

# backtest
bt <- portfolioBacktest(list("GMVP" = GMVP_portfolio_fun), dataset10_5stocks,
                        rebalance_every = 20)
#> Backtesting 1 portfolios over 10 datasets (periodicity = daily data)...
# chart
backtestChartStackedBar(bt, "GMVP", legend = TRUE)
By default, transaction costs are not included in the backtesting, but the user can easily specify the cost to be used for a more realistic backtesting:
library(ggfortify)

# backtest without transaction costs
bt <- portfolioBacktest(my_portfolio, dataset10)

# backtest with costs of 15 bps
bt_tc <- portfolioBacktest(my_portfolio, dataset10,
                           cost = list(buy = 15e-4, sell = 15e-4))

# plot wealth time series
wealth <- cbind(bt$fun1$`dataset 1`$wealth, bt_tc$fun1$`dataset 1`$wealth)
colnames(wealth) <- c("without transaction costs", "with transaction costs")
autoplot(wealth, facets = FALSE, main = "Wealth") +
  theme(legend.title = element_blank()) +
  theme(legend.position = c(0.8, 0.2)) +
  scale_color_manual(values = c("red", "black"))
When performing the backtest of the designed portfolio functions, one may want to incorporate some benchmarks. The package currently supports two benchmarks: the 1/N portfolio and the index of the market. (Note that to incorporate the index benchmark each dataset needs to contain one xts object named index.) One can easily choose the benchmarks by passing the corresponding values to the argument benchmark:
bt <- portfolioBacktest(portfolios, dataset10, benchmark = c("1/N", "index"))
#> Backtesting 3 portfolios over 10 datasets (periodicity = daily data)...
#> Backtesting benchmarks...
names(bt)
#> [1] "Quintile" "GMVP" "Markowitz" "1/N" "index"
Portfolio functions usually contain some parameters that can be tuned. One can manually generate different versions of such portfolio functions with a variety of parameters. Fortunately, the function genRandomFuns() helps with this task by automatically generating different versions of the portfolios with randomly chosen parameters:
# define a portfolio with parameters "lookback", "quintile", and "average_type"
quintile_portfolio_fun <- function(dataset, ...) {
  prices <- tail(dataset$adjusted, lookback)
  X <- diff(log(prices))[-1]
  mu <- switch(average_type,
               "mean" = colMeans(X),
               "median" = apply(X, MARGIN = 2, FUN = median))
  idx <- sort(mu, decreasing = TRUE, index.return = TRUE)$ix
  w <- rep(0, ncol(X))
  w[idx[1:ceiling(quintile*ncol(X))]] <- 1/ceiling(quintile*ncol(X))
  return(w)
}

# then automatically generate multiple versions with randomly chosen parameters
portfolio_list <- genRandomFuns(portfolio_fun = quintile_portfolio_fun,
                                params_grid = list(lookback = c(100, 120, 140, 160),
                                                   quintile = 1:5 / 10,
                                                   average_type = c("mean", "median")),
                                name = "Quintile",
                                N_funs = 40)
#> Generating 40 functions out of a total of 40 possible combinations.
names(portfolio_list[1:5])
#> [1] "Quintile (lookback=140, quintile=0.5, average_type=mean)"
#> [2] "Quintile (lookback=160, quintile=0.1, average_type=mean)"
#> [3] "Quintile (lookback=120, quintile=0.5, average_type=mean)"
#> [4] "Quintile (lookback=120, quintile=0.1, average_type=median)"
#> [5] "Quintile (lookback=120, quintile=0.4, average_type=median)"
portfolio_list[[1]]
#> function(dataset, ...) {
#> prices <- tail(dataset$adjusted, lookback)
#> X <- diff(log(prices))[-1]
#> mu <- switch(average_type,
#> "mean" = colMeans(X),
#> "median" = apply(X, MARGIN = 2, FUN = median))
#> idx <- sort(mu, decreasing = TRUE, index.return = TRUE)$ix
#> w <- rep(0, ncol(X))
#> w[idx[1:ceiling(quintile*ncol(X))]] <- 1/ceiling(quintile*ncol(X))
#> return(w)
#> }
#> <environment: 0x7fe974c77290>
#> attr(,"params")
#> attr(,"params")$lookback
#> [1] 140
#>
#> attr(,"params")$quintile
#> [1] 0.5
#>
#> attr(,"params")$average_type
#> [1] "mean"
Now we can proceed with the backtesting:
bt <- portfolioBacktest(portfolio_list, dataset10)
#> Backtesting 40 portfolios over 10 datasets (periodicity = daily data)...
Finally we can observe the performance for all combinations of parameters backtested:
plotPerformanceVsParams(bt)
#> Parameter grid:
#> lookback = c(100, 120, 140, 160)
#> quintile = c(0.1, 0.2, 0.3, 0.4, 0.5)
#> average_type = c("mean", "median")
#>
#> Parameter types: 0 fixed, 2 variable numeric, and 1 variable non-numeric.
In this case, we can conclude that the best combination is to use the median of the past 160 days with the top 0.3 quintile. Extreme caution has to be taken when tuning the hyper-parameters of strategies due to the danger of overfitting (Bailey et al. 2016).
In order to monitor the backtest progress, one can turn on a progress bar by setting the argument show_progress_bar:
bt <- portfolioBacktest(portfolios, dataset10, show_progress_bar = TRUE)
Backtesting typically incurs a very heavy computational load when the number of portfolios or datasets is large (also depending on the computational cost of each portfolio function). The package supports a parallel computation mode: users can choose to evaluate different portfolio functions in parallel or, in a more fine-grained way, to evaluate multiple datasets in parallel for each function:
portfun <- Markowitz_portfolio_fun

# parallel = 2 for functions
system.time(
  bt_noparallel <- portfolioBacktest(list(portfun, portfun), dataset10)
)
#> Backtesting 2 portfolios over 10 datasets (periodicity = daily data)...
#>    user  system elapsed
#>  63.594   0.730  65.451

system.time(
  bt_parallel_funs <- portfolioBacktest(list(portfun, portfun), dataset10,
                                        paral_portfolios = 2)
)
#> Backtesting 2 portfolios over 10 datasets (periodicity = daily data)...
#>    user  system elapsed
#>   0.689   0.201  38.423

# parallel = 5 for datasets
system.time(
  bt_noparallel <- portfolioBacktest(portfun, dataset10)
)
#> Backtesting 1 portfolios over 10 datasets (periodicity = daily data)...
#>    user  system elapsed
#>  31.136   0.300  31.862

system.time(
  bt_parallel_datasets <- portfolioBacktest(portfun, dataset10,
                                            paral_datasets = 5)
)
#> Backtesting 1 portfolios over 10 datasets (periodicity = daily data)...
#>    user  system elapsed
#>   1.538   0.377  21.089
The evaluation time has clearly been reduced significantly. Note that the elapsed time under parallel evaluation will not be exactly equal to the original time divided by the number of parallel cores because starting new R sessions also takes extra time. Besides, the two parallel modes can be used simultaneously.
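For instance, the following sketch combines both modes (up to paral_portfolios x paral_datasets workers may be active at a time):

# combine both parallel modes: 2 portfolio workers, each using 5 dataset workers
bt_parallel_both <- portfolioBacktest(list(portfun, portfun), dataset10,
                                      paral_portfolios = 2, paral_datasets = 5)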
Note that an unexpected error might be thrown when running a parallel backtest through RStudio on macOS. If that happens, one can check the default parallel setting via:
parallel:::getClusterOption("setup_strategy")
If "parallel" is returned, one can set the option setup_strategy to "sequential":
parallel:::setDefaultClusterOptions(setup_strategy = "sequential")
This may fix the problem; however, the “sequential” strategy might be less efficient than the “parallel” strategy.
In some cases, one may want to initialize some variables at the beginning of each backtest and be able to access them during the rolling-window process. At the moment, the package does not support this initialization. However, there is a hack that can be used for the time being (via the use of non-recommended global variables):
allocation <- 0  # initialize global variable to 0

test_portfolio <- function(dataset, ...) {
  N <- ncol(dataset$adjusted)
  w <- rep(allocation, N)
  allocation <<- 1/N  # after first time it becomes 1/N
  return(w)
}

bt <- portfolioBacktest(list("test" = test_portfolio),
                        dataset_list = dataset10[1:2],
                        lookback = 100, optimize_every = 200,
                        paral_datasets = 2)  # <--- this argument is necessary (has to be > 1)
#> Backtesting 1 portfolios over 2 datasets (periodicity = daily data)...
# sanity check
bt$test$`dataset 1`$w_optimized[, 1:2]
#> MAS MGM
#> 2015-09-15 0.00 0.00
#> 2016-06-30 0.02 0.02
#> 2017-04-18 0.02 0.02
bt$test$`dataset 2`$w_optimized[, 1:2]
#> XRX ULTA
#> 2014-03-04 0.00 0.00
#> 2014-12-16 0.02 0.02
#> 2015-10-02 0.02 0.02
Note that for this hack to work, one needs paral_datasets > 1.
Execution errors may happen unexpectedly when executing the different portfolio functions during backtesting. Nevertheless, such errors are properly caught and bypassed by the backtesting function portfolioBacktest() so that the execution of the overall backtesting is not stopped. For debugging purposes, to help the user trace where and when the execution errors happen, the result of the backtesting contains all the necessary information about the errors, including the call stack when an execution error happens. Such information is given as the attribute error_stack of the returned error_message.
For example, let’s define a portfolio function that will throw an error:
sub_function2 <- function(x) {
  "a" + x  # an error will happen here
}

sub_function1 <- function(x) {
  return(sub_function2(x))
}

wrong_portfolio_fun <- function(data, ...) {
  N <- ncol(data$adjusted)
  uni_port <- rep(1/N, N)
  return(sub_function1(uni_port))
}
Now, let’s pass the above portfolio function to portfolioBacktest() and see how to check the error trace:
bt <- portfolioBacktest(wrong_portfolio_fun, dataset10)
#> Backtesting 1 portfolios over 10 datasets (periodicity = daily data)...

res <- backtestSelector(bt, portfolio_index = 1)

# information of 1st error
error1 <- res$error_message[[1]]
str(error1)
#> chr "non-numeric argument to binary operator"
#> - attr(*, "error_stack")=List of 2
#> ..$ at : chr "\"a\" + x"
#> ..$ stack: chr "sub_function1(uni_port)\nsub_function2(x)"
# the exact location of error happening
cat(attr(error1, "error_stack")$at)
#> "a" + x
# the call stack of error happening
cat(attr(error1, "error_stack")$stack)
#> sub_function1(uni_port)
#> sub_function2(x)
In some situations, one may have to backtest portfolios from different sources stored in different files, e.g., from students in a portfolio design course (in fact, this package was originally developed to assess the students of the course “Portfolio Optimization with R” from the MSc in Financial Mathematics (MAFM)). In such cases, the different portfolios may have conflicting dependencies and loading all of them into the environment may not be a reasonable approach. The package supports backtesting portfolios given in individual files in a folder, such that each is executed in a clean environment without affecting the others. It suffices to write each portfolio function into an R script (with a unique filename) containing the portfolio function named exactly portfolio_fun() as well as any other auxiliary functions that it may require (needless to say, the required packages should be loaded in that script with library()). All these files should be placed in a folder, whose path is passed to the function portfolioBacktest() via the argument folder_path.
If an instructor wants to evaluate the students of a course on their portfolio design, this can be easily done by asking each student to submit an R script with a unique filename like STUDENTNUMBER.R. For example, suppose we have three files in the folder portfolio_files named 0001.R, 0002.R, and 0003.R. Then:
bt_all_students <- portfolioBacktest(folder_path = "portfolio_files",
                                     source_to_local = FALSE,
                                     dataset_list = dataset10)
#> Backtesting 3 portfolios over 10 datasets (periodicity = daily data)...
names(bt_all_students)
#> [1] "0001" "0002" "0003"
Note that if the package CVXR is used in some of the files, it may not work depending on the version. A temporary workaround is to set the argument source_to_local = FALSE in portfolioBacktest() (the side effect is that the objects from the file will be loaded in the global environment).
Now we can rank the different portfolios/students based on a weighted combination of the rank percentiles (termed scores) of the performance measures:
leaderboard <- backtestLeaderboard(bt_all_students,
                                   weights = list("Sharpe ratio"  = 7,
                                                  "max drawdown"  = 1,
                                                  "annual return" = 1,
                                                  "ROT (bps)"     = 1))
# show leaderboard
library(gridExtra)
grid.table(leaderboard$leaderboard_scores)
Consider the student with id number 666. Then the script file should be named 666.R and should contain the portfolio function called exactly portfolio_fun() as well as any other auxiliary functions that it may require (and any required package loading with library()):
library(CVXR)
auxiliary_function <- function(x) {
  # here whatever code
}

portfolio_fun <- function(data, ...) {
  X <- as.matrix(diff(log(data$adjusted))[-1])  # compute log returns
  mu <- colMeans(X)  # compute mean vector
  Sigma <- cov(X)  # compute the SCM
  # design mean-variance portfolio
  w <- Variable(nrow(Sigma))
  prob <- Problem(Maximize(t(mu) %*% w - 0.5*quad_form(w, Sigma)),
                  constraints = list(w >= 0, sum(w) == 1))
  result <- solve(prob)
  return(as.vector(result$getValue(w)))
}
The performance criteria currently considered by default in the package are those shown in the summary tables above: Sharpe ratio, max drawdown, annual return, annual volatility, Sortino ratio, downside deviation, Sterling ratio, Omega ratio, VaR (0.95), CVaR (0.95), rebalancing period, turnover, ROT (bps), cpu time, and failure rate.
One can easily add new performance measures with the function add_performance().
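As a rough sketch (the exact interface should be checked in ?add_performance), a new measure computed from the portfolio returns could be registered as follows; the measure name and the summarizing function here are illustrative assumptions:

# hypothetical example: add the standard deviation of the returns as a measure
bt <- add_performance(bt, name = "SD of returns",
                      fun = function(return, ...) sd(return))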