Description
MachineShop is a meta-package for statistical and
machine learning with a unified interface for model fitting, prediction,
performance assessment, and presentation of results. Support is provided
for predictive modeling of numerical, categorical, and censored
time-to-event outcomes and for resample (bootstrap, cross-validation,
and split training-test sets) estimation of model performance. This
vignette introduces the package interface with a survival data analysis
example, followed by supported methods of variable specification;
applications to other response variable types; available performance
metrics, resampling techniques, and graphical and tabular summaries; and
modeling strategies.
Features
- Unified and concise interface for model fitting, prediction, and
performance assessment.
- Support for 53+ models from 28 R packages,
including model specifications from the parsnip
package.
- Dynamic model parameters.
- Ensemble modeling with stacked regression and super learners.
- Modeling of response variables types: binary factors, multi-class
nominal and ordinal factors, numeric vectors and matrices, and censored
time-to-event survival.
- Model specification with traditional formulas, design matrices, and
flexible pre-processing recipes.
- Resample estimation of predictive performance, including
cross-validation, bootstrap resampling, and split training-test set
validation.
- Parallel execution of resampling algorithms.
- Choices of performance metrics: accuracy, areas under ROC and
precision recall curves, Brier score, coefficient of determination
(R2), concordance index, cross entropy, F score, Gini
coefficient, unweighted and weighted Cohen’s kappa, mean absolute error,
mean squared error, mean squared log error, positive and negative
predictive values, precision and recall, and sensitivity and
specificity.
- Graphical and tabular performance summaries: calibration curves,
confusion matrices, partial dependence plots, performance curves, lift
curves, and model-specific and permutation-based variable
importance.
- Model tuning over automatically generated grids and with exhaustive
and random grid searches, Bayesian optimization, particle swarm
optimization, quasi-Newton BFGS optimization, simulated annealing, and
support for user-defined optimization functions.
- Model selection and comparisons for any combination of models and
model parameter values.
- Recursive feature elimination.
- User-definable models and performance metrics.
Getting Started
Installation
# Current release from CRAN
install.packages("MachineShop")
# Development version from GitHub
# install.packages("devtools")
devtools::install_github("brian-j-smith/MachineShop")
# Development version with vignettes
devtools::install_github("brian-j-smith/MachineShop", build_vignettes = TRUE)
Documentation
Once installed, the following R commands will load
the package and display its help system documentation. Online
documentation and examples are available at the MachineShop
website.
library(MachineShop)
# Package help summary
?MachineShop
# Vignette
RShowDoc("UserGuide", package = "MachineShop")