syn-data-code

The purpose of this vignette is to share the code used to provide a walkthrough of the BayesECM() function in the ezECM package, as well as provide code to reproduce a figure from a related future publication. This document outlines the skeleton of a Monte Carlo experiment and provides the relevant code. Some parameter values are changed to reduce the computation time, the original values are noted in the text. Synthetic data is generated to use for testing and training different implementations of the Event Categorization Matrix (ECM) model for comparision. The comparators are classical ECM (C-ECM), Bayesian ECM (B-ECM) only trained on events where all discriminants are available, B-ECM where the model is trained on all available data including partial observations (M-B-ECM), and M-B-ECM where the loss matrix is changed such that the false negative rate is reduced. All of these comparators focus on binary categorization, simply detecting if a new observation belongs to a pre-specified important category. The last comparator categorizes new events into each of the K event categories used for training with the B-ECM model (B-ECM Cat). The current form of C-ECM cannot utilize partial observations for training, and therefore we hypothesize that the performance of C-ECM will suffer in comparision to B-ECM models which can utilize observations with missing data for training.

Data Generation

We will define some functions used to generate the synthetic data, which are not part of the ezECM package. These functions randomly generate a mean and covariance for each class. These random mean and covariance is then used to generate random data sets. Additionally, functions are used for randomly deleting data to generate partial observations. Details are provided in the comments, as well as the publication. The argument p specifies the number of discriminants, K changes the number of event categories, Ntest is the size of the testing set, Ntrain is the size of the training set which does not have any missing data, Ntrain_missing is the size of the training set where events have at least one missing discriminant, tst_missing controls the fraction of missing entries in the testing set, and accordingly trn_missing controls the fraction of the training set which is missing. In the following experiment we will only vary p.

data_gen <- function(p = NULL, K = 3, Ntest = NULL, Ntrain = NULL, 
                     Ntrain_missing = NULL, tst_missing = NULL, trn_missing = NULL){
  
    mu_use <- matrix(rnorm(n = p*K, sd = 0.5), ncol = K, nrow = p)
    Y <- list()
    S <- list()
    
    ## random number in each class
    
    N <- LaplacesDemon::rcat(n = Ntest + Ntrain + Ntrain_missing, 
                             p = rep(1/3, times = K))
    N <- as.vector(table(N))
    
    ## random very important category
    
    vic <- sample(1:K, size = 1)
    
    for(k in 1:K){
      
      ## random number of blocks in kth category
      
      nblocks <- sample(1:2, size = 1)
      
      ## random covariance matrix for kth category
      
      S[[k]] <- LaplacesDemon::rinvwishart(nu = p + 4, S = diag(p))
      
      
      ## If relevant, delete entries to form block covariance matrices of random sizes
      
      if(nblocks == 2){
        nblock1 <- sample(1:(p-1), size = 1)
        block1_members <- sample(1:p, size = nblock1, replace = FALSE)
        block2_members <- (1:p)[-block1_members]
        
        zero_elements <- expand.grid(block1_members, block2_members)
        
        S[[k]][zero_elements$Var1, zero_elements$Var2] <- 0
        S[[k]][zero_elements$Var2, zero_elements$Var1] <- 0
        
      }
      
      ## data for kth class is drawn from a MVN, given the mean and covariance
      Y[[k]] <- as.data.frame(LaplacesDemon::rmvn(n = N[k], mu = mu_use[,k], Sigma= S[[k]]))
      
      ## data is transformed to (0,1) using the logistic function to run a check
      Ytemp<- 1/(1+ exp(-Y[[k]]))
      
      ## if machine precision causes the output of the logistic function to round to 1, the experiment is stopped
      ## this has never happened
      if(max(apply(Ytemp,2,function(X){range(X)})) == 1){
        stop()
      }else{
        Y[[k]]<- 1/(1+ exp(-Y[[k]]))
      }
      
      ## a column for the event class is appended to the data.frame
      Y[[k]] <- cbind(Y[[k]], as.character(k))
      names(Y[[k]]) <- c(paste("p",as.character(1:p),  sep = ""), "event")
      
    }
    
    Y <- do.call("rbind", Y)
    
    ## A random sample of the data is taken to be the testing set
    test_index <- sample(1:nrow(Y), size = Ntest)
    testing <- Y[test_index,]
    
    ## The remainder is slated for training
    training <- Y[-test_index, ]
    
    ## A random sample of the training set is set aside to have missing entries,
    ## the remainder is set aside to be the fully observed training set
    train_full_index <- sample(1:nrow(training), size = Ntrain)
    train_missing <- training[-train_full_index, ]
    train_full <- training[train_full_index, ]
    
    ## If all event categories are not represented in the training set of full observations,
    ## the sampling scheme repeats
    if(any(table(train_full$event) <= 1) | length(table(train_full$event)) <= (K-1)){
      while(any(table(train_full$event) <= 1 | length(table(train_full$event)) <= (K-1))){
        test_index <- sample(1:nrow(Y), size = Ntest)
        
        testing <- Y[test_index,]
        training <- Y[-test_index, ]
        train_full_index <- sample(1:nrow(training), size = Ntrain)
        train_missing <- training[-train_full_index, ]
        train_full <- training[train_full_index, ]
      }
    }
    
    ## the true event category for the testing set is saved as a seperate variable,
    ## and deleted from the data.frame containing the data
    test_truth <- testing$event
    testing$event <- NULL
    
    ## Entries are randomly selected for deletion from the testing set.
    ## This scheme ensures a single discriminant for each observation
    ## not deleted in order to reduce computation time.
    
    abs_present <- sample(size = Ntest, 1:p, replace = TRUE)
    missing_pool <- matrix(1:p, ncol = p, nrow = Ntest, byrow = TRUE)
    
    missing_pool <- t(apply(cbind(missing_pool, abs_present), 1, function(X,pp){
      X[-c(X[pp + 1], pp +1)]
    }, pp = p))
    
    missing_pool_save <- missing_pool
    frac_missing <- (p*tst_missing)/(p-1)
    
    # sample which of the remaining elements will be missing
    missing_sample <- sample(1:(nrow(missing_pool)*ncol(missing_pool)), 
                             size = floor(nrow(missing_pool)*ncol(missing_pool)*(frac_missing)), 
                             replace = FALSE)
    
    missing_pool_save[missing_sample] <- NA
    
    saved_data <- apply(cbind(missing_pool_save, unname(abs_present)), 1, function(X){
      X[-which(is.na(X))]
    })
    
    for(j in 1:nrow(testing)){
      testing[j,-c(saved_data[[j]])] <- NA
    }
    
    ## Entries are randomly selected for deletion from the training set.
    ## This scheme ensures a single discriminant for each observation
    ## not deleted in order to reduce computation time.
    
    abs_present <- sample(size = Ntrain_missing, 1:p, replace = TRUE)
    abs_missing <- matrix(1:p, ncol = p, nrow = Ntrain_missing, byrow = TRUE)
    
    abs_missing <- t(apply(cbind(abs_missing, abs_present), 1, function(X,pp){
      X[-c(X[pp + 1], pp +1)]
    }, pp = p))
    
    abs_missing <- apply(abs_missing, 1, function(X){
      sample(X, size = 1)
    })
    
    missing_pool <- matrix(1:p, ncol = p, nrow = Ntrain_missing, byrow = TRUE)
    
    missing_pool <- t(apply(cbind(missing_pool, abs_present, abs_missing), 1, function(X,pp){
      X[-c(X[pp + 1], X[pp + 2], pp +1, pp + 2)]
    }, pp = p))
    
    missing_pool_save <- missing_pool
    frac_missing <- (p*trn_missing - 1)/(p-2)
    
    # sample which of the remaining elements will be missing
    missing_sample <- sample(1:(nrow(missing_pool)*ncol(missing_pool)), size = floor(nrow(missing_pool)*ncol(missing_pool)*(frac_missing)), replace = FALSE)
    
    
    missing_pool_save[missing_sample] <- NA
    
    saved_data <- apply(cbind(missing_pool_save, unname(abs_present)), 1, function(X){
      X[-which(is.na(X))]
    })
    
    for(j in 1:nrow(train_missing)){
      train_missing[j,-c(saved_data[[j]], p+1)] <- NA
    }
    
    return(list(Y = list(train_full = train_full, train_missing = train_missing, 
                         testing = testing, test_truth = test_truth), 
                params = list(mu = mu_use, Sig = S, vic = vic)))
    
}

The function generates data for K categories, each with a different mean and covariance structure for a single event observation, drawn from a multivariate normal distribution. The covariance of the K categories each has a random chance of being a block-covariance matrix with blocks of random sizes, or a full covariance matrix. The simulated values are then transformed from real values to $(0,1)$ using the logistic function. The random two block covariance is in effort to represent the fact that some events will have correlated observations from space and ground modalities, and others will have uncorrelated observations between the modalities with random sets of discriminants exhibiting correlations.

The arguments of data_gen are specified for the experiment.


## Data Parameters
tst_missing <- 0.5
trn_missing <- 0.5

Ntrain <- 25
Ntest <- 100
Ntrain_missing <- 5 * Ntrain

K <- 3

P is a vector of the values of p we want to examine. For the experiment in a forthcoming publication, P = c(4,6,8,10). In order to reduce the computation time of this vignette, only values of 4 and 6 will be used.

P <- c(4,6)
#P <- c(4,6,8,10)

The generated data will be used to train and test different implementations of the Event Categorization Matrix (ECM) model. The comparators are classical (C-ECM), Bayesian ECM (B-ECM) only trained on events where all discriminants are available, B-ECM where the model is trained on all available data including partial observations (M-B-ECM), M-B-ECM where the loss matrix is changed such that the false negative rate is reduced. All of these comparators focus on binary categorization, simply detecting if a new observation belongs to a pre-specified important category. The last comparator categorizes new events into each of the K event categories used for training with the B-ECM model (B-ECM Cat).

Model and Decision Specifications

All methods use typicality indices as part of the decision framework. For all methods, the significance level alphatilde is set to 0.05.

alphatilde <- 0.05

We want to specify that for the B-ECM models, the weights of the components in the mixture model are informed by the data. Alternatively, one could change mixture_weights <- "equal" for all components of the mixture model to have equal weight, possibly if the frequency of the events in the training data is unrelated to what is expected in practice.

mixture_weights <- "training"

Three separate loss matrices need to be specified for the experiment. You may be here from the becm_decision() function. The structure of a full K category loss matrix is

\[ \begin{array}{cc@{}cc} \phantom{XXX} & \phantom{XXX} & \phantom{XXX} & \mathrm{Action}\\ \phantom{XXX} & \phantom{XXX} & \phantom{XXX} & \begin{array}{cccc} a_1 \phantom{X} & a_2 \phantom{X} & \dots & a_K \end{array}\\ C = & \begin{array}{c} \mathrm{True}\\ \mathrm{Category} \end{array} \hspace{-1.5em}& \begin{array}{c} \tilde{z}_1 \\ \tilde{z}_2 \\ \vdots \\ \tilde{z}_K \end{array} \hspace{-2em}& \left[\begin{array}{c|c|c|c} C_{1,1} & C_{1,2} & \dots & C_{1, K}\\ \hline C_{2,1} & C_{2,2} & \dots & C_{2,K}\\ \hline \vdots & \vdots & \ddots & \vdots \\ \hline C_{K,1} & C_{K,2} & \dots & C_{K, K} \end{array}\right] \end{array}. \] Where the losses associated with the action of categorizing a $\tilde{y}_{\tilde{p}}$ into one of the $K$ training categories is specified in the columns, and the value of the latent categorization variable $\tilde{z}_k$ for arbitrary category index $k$ is specified in the rows. If $\tilde{Z}^{\top} = [\tilde{z}_1 \dots \tilde{z}_K]$ is known, a $1 \times K$ row vector of losses associated with each possible categorization action could be calculated as the matrix vector product $L_{1 \times K} = \tilde{Z}^{\top} C$. However, the elements of $\tilde{Z}^{\top}$ are unknown and modeled as random variables in the B-ECM framework. We can take the expectation of the loss for each action as $\mathbb{E}[L]_{1 \times K} = \mathbb{E}[\tilde{Z}^{\top}]C$. The action that provides the minimum expected loss is probably the best bet for categorization.

The above structure for the loss matrix $C$ is equivalent to what must be later provided to the becm_decision() function for full $K$ training categorization. The indices of the rows and columns of $C$ are the same order as the indices of the categories listed as names(bayes_pred$BayesECMfit$Y) for the bayes_pred argument provided to becm_decision(). The structure of $C$ differs for binary categorization differs in a slight, but important-to-note, way. For binary categorization, we want to detect if $\tilde{y}_{\tilde{p}}$ belongs to a specific category stipulated using the vic (Very Important Category) argument, and therefore the indexing of $C$ for full $K$ categorization does not port well to applications for binary categorization. In this case, the first row and column of $C$ always correspond to the category chosen as vic. The structure of $C$, for vic indexed as $k$, is therefore as is below.

\[ \begin{array}{cc@{}cc} \phantom{XX} & \phantom{XX} & \phantom{XX} & \mathrm{Action}\\ \phantom{XX} & \phantom{XX} & \phantom{XX} & \begin{array}{cc} \mathrm{a}_k & \mathrm{a}_{k^{-}} \end{array}\\ C = & \begin{array}{c} \mathrm{True}\\ \mathrm{Category} \end{array} \hspace{-0.5em}& \begin{array}{c} \tilde{z}_k \\ \sum_{\substack{i = 1 \\ i \neq k}}^K \tilde{z}_i \end{array} \hspace{-1em}& \left[\begin{array}{c|c} C_{1,1} & C_{1,2}\\ \hline C_{2,1} & C_{2,2} \end{array}\right] \end{array} \]

With the necessary structure of the loss matrices in mind, first we specify the loss matrices for B-ECM and M-B-ECM, which will utilize 0-1 loss for binary categorization. This loss structure does not reward correct categorizations, but punnishes mis-categorizations with a loss of 1. Specifying the loss matrix accordingly as the C01 variable:

C01 <- matrix(c(0,1,1, 0), ncol = 2, nrow = 2)
C01
#>      [,1] [,2]
#> [1,]    0    1
#> [2,]    1    0

To test M-B-ECM with a higher loss for false negatives, the Cfneg matrix variable is created. For loss matrix $C$ specified for binary categorization, the entry $C_{1,2}$ corresponds to the loss for choosing to categorize $\tilde{y}_{\tilde{p}}$ into the group of categories not specified by vic when the truth is $\tilde{y}_{\tilde{p}}$ is in the vic category. Such a situation is the definition of a false negative, so changing $C$ to reduce the false negative rate is straightforward. Similarly, $C_{2,1}$ could be increased relative to $C_{1,2}$ to reduce the false positive rate, but we will stick with the goal of reducing false negatives for this experiment. The loss for false negatives is increased by setting Cfneg[1,2] <- 2.

Cfneg <- C01
Cfneg[1,2] <- 2
Cfneg
#>      [,1] [,2]
#> [1,]    0    2
#> [2,]    1    0

Because K = 3, a $3 \times 3$ loss matrix needs to be specified for M-B-ECM Cat. The loss for any mis-categorization is specified to be equal for each possibility. With all non-zero values equal, using a value of 1 is sufficient.

Ccat <- matrix(1, ncol = 3, nrow = 3) - diag(3)
Ccat
#>      [,1] [,2] [,3]
#> [1,]    0    1    1
#> [2,]    1    0    1
#> [3,]    1    1    0

Monte Carlo Parameters

The experiment is a Monte Carlo experiment. Random data sets are repeatedly generated. We are interested in examining the typical behavior of the models, as well as the variation in behavior. Each model is fit to each data set and tested using a seperate testing set. The accuracy, false negative rate, and false positive rate for all model implementations are recorded for each data set generated within each Monte Carlo iteration. To reduce the computation time of this vignette, only 3 Monte Carlo iterations are generated for each total discriminant size specified in P. To replicate the figure and table generated for a forthcoming publication, instead set iters <- 250.

iters <- 3#250

Each model that can handle missing data utilizes Markov chain Monte Carlo (MCMC) to impute possible values of the missing entries within the training data. Then these values are integrated out when evaluating the expected loss for each action. MCMC occurs multiple times within each Monte Carlo iteration of the experiment, both concepts are not intertwined here. Three parameter values need to be set for MCMC. The two element vector BT first specifies the number of Burn-in random samples of the missing data values that are discgarded under the assumption that the Markov chain has not converged within the first BT[1] draws. BT[2] is the total number of MCMC iterations. After training models that can handle missing data, the total number of draws from the distribution of missing data entries will be BT[1] - BT[2] for each missing element. To reduce computation time, the values of BT have been reduced. The values BT < c(500,50500) were used to compare the models in a forthcoming publication.

## MCMC parameters
BT <- c(150, 2150)
#BT <- c(500, 50500)

The predict.BayesECM() function can use all of the draws of the missing data values obtained, or thin the samples. Specifying thinning <- 5 means every fifth sample will be used, which reduces the computation time for prediction as well as the autocorrelation between draws. The default, thinning = 1, utilizes all of the samples.

thinning <- 5

The Experiment

We will specify a few more variables. If you are running a version of this experiment that is not fast to compute, we suggest setting verb <- TRUE. To make this document, we set

## Experimental Parameters

verb <- FALSE
#verb <- TRUE

To save the performance metrics for each Monte Carlo iteration, a list exp_out is defined which saves the number of accurate categorizations, the false positive rate, and the false negative rate at each iteration for each value of p. General data, important for making calculations later, is also saved to exp_out. The structure of exp_out is built as the experiment moves through the different values in P. If verb == TRUE the time at the start of the experiment is saved.


## Data structures for saving progress

cECM_recfp <- cECM_recfn <- bayes_rec <- cECM_rec <- matrix(NA, ncol = length(P), nrow = iters)
Nvic <- rep(0, times = length(P))

exp_out <- list()
method_template <- data.frame(matrix(NA, ncol = 3, nrow = iters))
names(method_template) <- c("accurate", "FN", "FP")
data_template <- data.frame(matrix(NA, ncol = 2, nrow = iters))
names(data_template) <- c("Ntest", "Nvic")
data_template$Ntest <- Ntest

p_template <- list(cECM = method_template, becm = method_template, 
                   mbecm = method_template, mbecm_Cfn = method_template, 
                   mbecm_cat = method_template, data = data_template)

bayes_rec <- cECM_rec <- matrix(NA, ncol = length(P), nrow = iters)

if(verb){
toc <- Sys.time()
}

Then, the experiment iterates over the values of P and then the number of Monte Carlo iterations for each setting of p. Because the experiment is in a for loop, detailed descriptions are in the comments.

## Iterates over the differing numbers of total discriminants set for the experiment.
for(p in P){
  
  ## Builds the list for saving the results using a template list
  exp_out[[p]] <- p_template
  
  ## Runs each model for `iters` number of independent data sets.
  for(i in 1:iters){
    ## The i^{th} run for p discriminants
    if(verb){    
      ## set the experimental parameter verb <- TRUE to print progress
      print(paste0("i = ", i, ", p = ", p, ", ", round(Sys.time() - toc, digits = 2), " ", units(Sys.time() - toc), " elapsed"))
    }
    
    ## Generate random data set
    
      Ylist <- data_gen(p = p, K = K, Ntest = Ntest, Ntrain = Ntrain, 
                     Ntrain_missing = Ntrain_missing, tst_missing = tst_missing, 
                    trn_missing = trn_missing)
      
      ## Saves the random data information to the environment
      
      train_full<- Ylist$Y$train_full
      train_missing <- Ylist$Y$train_missing
      testing <- Ylist$Y$testing
      test_truth <- Ylist$Y$test_truth
      ## Which category is the important one this time?
      vic <- Ylist$params$vic
    
    
    ## Save the true total number of `vic` events in the testing data to be used
    ## later for analyzing performance.
    
    exp_out[[p]][["data"]]$Nvic[i] <- sum(test_truth == as.character(vic))
    
    ## Fit the classical ECM model, apply the decision framework with the
    ## `cECM_decision()` function, then save the results
    
    cECM <- cECM(x = train_full, transform = TRUE, newdata = testing)

    cECM_out <-  apply(cECM_decision(pval = cECM, alphatilde = alphatilde,
                                 vic = as.character(vic), 
                                 cat_truth = test_truth)$events[,1:3] ,2, 
                       sum, na.rm = TRUE)

    exp_out[[p]][["cECM"]][i,] <- unname(cECM_out)
    
    ## Fit the B-ECM model, using only full p observations
    
    bayes_fit <- BayesECM(Y = train_full)
    
    ## Run the predict function on the testing set.
    ## If there were multiple testing sets, the same model fit could be used on
    ## each one without having to rerun the `BayesECM()` function.  This
    ## functionality is more important when using training data with missing
    ## entries.
    
    bayes_pred <- predict(bayes_fit, Ytilde = testing, 
                          mixture_weights = mixture_weights)
    
    ## The "becm_desision()" function applies the decision theoretic framework
    ## to the training and testing data.  For one training and one testing set,
    ## where the user wants to try different values of `alphatilde` and `C`, it is
    ## not necessary to rerun the `BayesECM()` function or the `predict()`
    ## function.
    
    becm_out <- becm_decision(bayes_pred = bayes_pred, alphatilde = alphatilde,
                                vic = as.character(vic), cat_truth = test_truth, 
                              pn = TRUE, C = C01)
    
    ## Summarize and save the data.
    
    becm_out <- apply(becm_out$results,2, sum, na.rm = TRUE)
    
    exp_out[[p]][["becm"]][i,] <- unname(becm_out)
    
    
    ## Fit and save the B-ECM model that includes missing data
    
    bayes_fit_missing <- BayesECM(Y = rbind(train_full, train_missing), BT = BT, 
                                  verb = verb)
    
    bayes_pred_missing <- predict(bayes_fit_missing, Ytilde = testing, 
                                  thinning = thinning, 
                                  mixture_weights = mixture_weights)
    
    missing_out <- becm_decision(bayes_pred = bayes_pred_missing, alphatilde = alphatilde,
                             vic = as.character(vic), cat_truth = test_truth, 
                             pn = TRUE, C = C01)
    mbecm_out <- apply(missing_out$results,2, sum, na.rm = TRUE)
    
    
    exp_out[[p]][["mbecm"]][i,] <- unname(mbecm_out)
    
    ## The rest of the B-ECM variants are different through decision theory,
    ## not the model fit.  All use partial observations for training.
    ## Note that the rej argument is supplied to becm_decision to reduce computation time
    
    ## Record the decision when the loss matrix is adjusted to target
    ## false negatives.
    
    Cfn_out <- becm_decision(bayes_pred = bayes_pred_missing, alphatilde = alphatilde,
                         vic = as.character(vic), cat_truth = test_truth, 
                         pn = TRUE, C = Cfneg, rej = missing_out$rej)
    becm_Cfn_out <- apply(Cfn_out$results,2, sum, na.rm = TRUE)
    
    
    exp_out[[p]][["mbecm_Cfn"]][i,] <- unname(becm_Cfn_out)
    
    ## Record the decisions when full class (K = 3) categorization is used
    ## instead of binary categorization
    
    cat_out <-  becm_decision(bayes_pred = bayes_pred_missing, alphatilde = alphatilde,
                          vic = as.character(vic), cat_truth = test_truth, 
                          pn = TRUE, C = Ccat, rej = missing_out$rej)
    becm_cat_out <- apply(cat_out$results,2, sum, na.rm = TRUE)
    
    
    exp_out[[p]][["mbecm_cat"]][i,] <- unname(becm_cat_out)

    
  }
  
}

Plotting the Results

First a function for making the boxplot is defined. The output from the experiment is the first argument. The user can subset the number of discriminant compared with the P argument, and models compared with the models argument. A different color palette can be supplied if desired using cols. The legend_text can also be altered, and should be if the user does not want to plot all of the models compared.


ECM_boxplot <- function(exp_out, P = P, models = c("cECM", "becm", 
                                                   "mbecm", 
                                                   "mbecm_Cfn",
                                                   "mbecm_cat"),
                        cols = NULL,
                        legend_text =  c("C-ECM", "B-ECM", "M-B-ECM", 
                  bquote("M-B-ECM, " * C["1,2"] == 2), "M-B-ECM Cat"),
                  metric = "accurate"){
  
  if(metric == "accurate"){
    divisor <- function(exp_out, p){
      return(exp_out[[p]]$data$Ntest)
    }
    ylab <- "Model Accuracy"
  }else if(metric == "FN"){
    divisor <- function(exp_out,p){
      return(exp_out[[p]]$data$Nvic)
    }
    ylab <- "False Negative Rate"
  }else if(metric == "FP"){
    divisor <- function(exp_out, p){
      return(exp_out[[p]]$data$Ntest - exp_out[[p]]$data$Nvic)
    }
    ylab <- "False Positive Rate"
  }else{
    stop("Argument 'metric' must be one of the following case sensitive character strings: 'accurate', 'FN', or 'FP'.")
  }
   boxplotdf <- do.call("cbind", lapply(exp_out[[P[1]]][models], function(X, m = metric){
  X[[m]]
}))/divisor(exp_out, p = P[1])
  
   
  for(p in P[-1]){
  boxplotdf <- cbind(boxplotdf,do.call("cbind", lapply(exp_out[[p]][models], function(X, m = metric){
  X[[m]]
}))/divisor(exp_out, p = p))
  }
  
   boxplotdf <- boxplotdf
   
   if(max(boxplotdf) > 65){
     ylim <- c(min(boxplotdf), 1.1)
   }else{
    ylim <- range(boxplotdf) * c(0.9,1.2)
   }
   

if(is.null(cols)){
if(length(models) == 5){
pltcols <- hcl.colors(44, palette = "viridis", rev = TRUE)[c(3, 10, 20, 30, 37)]
}else if(length(models) > 10){
  warning("You should consider recoding this function with a different way to select the colors used for the plot.")
  pltcols <- hcl.colors(44, palette = "viridis", rev = TRUE)[seq(from = 2, to = 38, length.out = length(models))]
}else{
  pltcols <- hcl.colors(44, palette = "viridis", rev = TRUE)[seq(from = 2, to = 38, length.out = length(models))]
}
}else{
  if(length(cols) != length(models)){
    stop("If supplying colors, a vector the same length as the 'models' argument must be used.")
  }
}
   
opar <- par(no.readonly = TRUE)
on.exit(expr = suppressWarnings(par(opar)))
par(mar = c(4.25,3.85,1,0.5))
lmodels <- length(models)
graphics::boxplot(boxplotdf,  
                  at = (1:((lmodels + 1)*length(P)))[-seq(from = lmodels + 1, 
                                                          to = length(P)*(lmodels + 1), 
                                                          by = lmodels + 1)],
         xaxt = "n", yaxt = "n", ylim = ylim, 
        col = pltcols, xlab = "Number of Discriminants", ylab = ylab)
py <- pretty(boxplotdf)
graphics::axis(2, py)
graphics::axis(1, at =seq(from = 1, to = length(P)*(lmodels + 1), by = lmodels + 1) + 1, 
               labels =  paste0("p = ", P) )
graphics::legend("topleft", bty ="n", 
       legend = legend_text,  
       fill = pltcols, horiz = FALSE, ncol = 3, y.intersp = 1.25)
}

Then, using the function with all of the data on overall accuracy collected.

ECM_boxplot(exp_out = exp_out, P = P, metric = "accurate")

If it is desirable to instead plot false negatives or false positives, the argument metric can be set to "FN" and "FP" respectively.

ECM_boxplot(exp_out = exp_out, P = P, metric = "FN")

It is likely that without any changes to the code the plots above can look quite different. Using the commented out values for the variables P, iters, and BT will help, but the computation time for the full experiment is a few days. Alternatively, we have a hunch the settings iters <- 50, BT <- c(500, 20500), and thinning <- 2 to provide a good compromise between computation time and Monte Carlo variance. Additionally, larger values in the vector P have a longer computation time, so the larger values can be removed or added as seen fit.