Getting started with the ebnm package
The empirical Bayes normal means problem
Given \(n\) observations \(x_i\) with known standard deviations \(s_i\), \(i = 1,
\dots, n\), the normal means model (Robbins 1951; Efron and Morris 1972; Stephens 2017;
Bhadra et al. 2019; Johnstone 2019; Sun 2020) is \[\begin{equation}
x_i \overset{\text{ind.}}{\sim} \mathcal{N}(\theta_i, s_i^2),
\end{equation}\] with the unknown (true) means \(\theta_i\) to be estimated. Here and
throughout, we use \(\mathcal{N}(\mu,
\sigma^2)\) to denote the normal distribution with mean \(\mu\) and variance \(\sigma^2\). The maximum-likelihood estimate
of \(\theta_i\) is, of course, \(x_i\). The empirical Bayes (EB) approach to
inferring \(\theta_i\) attempts to
improve upon the maximum-likelihood estimate by “borrowing information”
across observations, exploiting the fact that each observation contains
information not only about its respective mean, but also about how the
means are collectively distributed (Robbins 1956;
Morris 1983; Efron 2010; Stephens 2017). Specifically, the EB
approach assumes that \[\begin{equation}
\theta_i \overset{\text{ind.}}{\sim} g \in \mathcal{G},
\end{equation}\] where \(g\) is
a distribution to be estimated from the data, typically chosen from
among some family of distributions \(\mathcal{G}\) that is specified in advance.
(Note that although \(\mathcal{G}\)
must be specified in advance, it can be arbitrarily flexible.)
The empirical Bayes normal means model is therefore fit by first
using all of the observations to estimate \(g
\in \mathcal{G}\), then using the estimated distribution \(\hat{g}\) to compute posteriors for each
mean \(\theta_i\). Commonly, \(g\) is estimated via maximum-likelihood, in
which case the EB approach consists of the following:
1. Find \(\hat{g} := \mathrm{argmax}_{g \,\in\, \mathcal{G}} L(g)\), where \(L(g)\) denotes the marginal likelihood,
\[\begin{equation}
L(g) := p({\mathbf x} \mid g, {\mathbf s}) =
\prod_{i=1}^n \textstyle
\int p(x_i \mid \theta_i, s_i) \, g(\theta_i) \, d\theta_i,
\end{equation}\]
in which we define \({\mathbf x} := (x_1, \ldots, x_n)\) and \({\mathbf s} := (s_1, \ldots, s_n)\).

2. Compute the posterior distributions
\[\begin{equation}
p(\theta_i \mid x_i, s_i, \hat{g}) \propto
\hat{g}(\theta_i) \, p(x_i \mid \theta_i, s_i),
\end{equation}\]
and/or posterior summaries (means, variances, etc.).
We refer to this two-step process as “solving the EBNM problem.” The
ebnm
package provides a unified interface for efficiently
solving the EBNM problem using a wide variety of prior families \(\mathcal{G}\). For some prior families, it
leverages code from existing packages; for others, it implements new
model fitting algorithms with a mind toward speed and robustness. In
this vignette, we demonstrate usage of the ebnm
package in
an analysis of weighted on-base averages for Major League Baseball
players.
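As a quick preview of what these two steps look like in code, here is a minimal sketch on simulated data (the data, seed, and variable names are purely illustrative; ebnm() is called with its default settings, which are discussed below):
library(ebnm)
set.seed(1)
s_sim <- rep(1, 200)                        # known standard errors
theta_sim <- rnorm(200, sd = 2)             # unknown "true" means
x_sim <- theta_sim + rnorm(200, sd = s_sim) # observations
fit_sim <- ebnm(x_sim, s_sim)               # step 1: estimate g by maximum likelihood
head(fitted(fit_sim))                       # step 2: posterior summaries for each mean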
Application: Weighted on-base averages
A longstanding tradition in empirical Bayes research is to include an
analysis of batting averages using data from Major League Baseball (see, for example, Brown 2008; Jiang and Zhang 2010;
and Gu and Koenker 2017). Until recently, batting averages were
the most important measurement of a hitter’s performance, with the
prestigious yearly “batting title” going to the hitter with the highest
average. However, with the rise of baseball analytics, metrics that
better correlate to teams’ overall run production have become
increasingly preferred. One such metric is wOBA (“weighted on-base
average”), which is both an excellent measure of a hitter’s offensive
production and, unlike competing metrics such as MLB’s xwOBA (Sharpe 2019) or Baseball Prospectus’s DRC+
(Judge 2019), can be calculated using
publicly available data and methods.
Initially proposed by Tango, Lichtman, and
Dolphin (2006), wOBA assigns values (“weights”) to hitting
outcomes according to how much the outcome contributes on average to run
production. For example, while batting average treats singles
identically to home runs, wOBA gives a hitter more than twice as much
credit for a home run.
Given a vector of wOBA weights \(\mathbf{w}\), hitter \(i\)’s wOBA is the weighted average \[\begin{equation}
x_i := \mathbf{w}^T \mathbf{z}^{(i)} / n_i,
\end{equation}\] where \(\mathbf{z}^{(i)} = (z_1^{(i)}, \ldots,
z_7^{(i)})\) tallies outcomes (singles, doubles, triples, home
runs, walks, hit-by-pitches, and outs) over the hitter’s \(n_i\) plate appearances (PAs). Modeling
hitting outcomes as i.i.d. \[\begin{equation}
\mathbf{z}^{(i)} \sim \text{Multinomial}(n_i, {\boldsymbol\pi}^{(i)}),
\end{equation}\] where \({\boldsymbol\pi}^{(i)} = (\pi_1^{(i)}, \ldots,
\pi_7^{(i)})\) is the vector of “true” outcome probabilities for
hitter \(i\), we can regard \(x_i\) as a point estimate for the hitter’s
“true wOBA skill” \[\begin{equation}
\theta_i := \mathbf{w}^T {\boldsymbol\pi}^{(i)}.
\end{equation}\] Standard errors for the \(x_i\)’s can be estimated as \[\begin{equation}
s_i^2 = \mathbf{w}^T \hat{\mathbf\Sigma}^{(i)} \mathbf{w}/n_i,
\end{equation}\] where \(\hat{\mathbf\Sigma}^{(i)}\) is the estimate
of the multinomial covariance matrix obtained by setting \({\boldsymbol\pi}^{(i)} = \hat{\boldsymbol\pi}^{(i)}\),
where \[\begin{equation}
\hat{\boldsymbol\pi}^{(i)} = \mathbf{z}^{(i)}/n_i.
\end{equation}\] (To deal with small sample sizes, we
conservatively lower bound each standard error by the standard error
that would be obtained by plugging in league-average event probabilities
\(\hat{\boldsymbol\pi}_{\mathrm{lg}} =
\sum_{i=1}^N \mathbf{z}^{(i)}/ \sum_{i=1}^N n_i\), where \(N\) is the number of hitters in the
dataset.)
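To make these formulas concrete, the following sketch (not part of the ebnm package) computes a single hitter’s wOBA and its standard error from a vector of outcome tallies. The weights and tallies are illustrative, and the league-average lower bound described above is omitted:
# Outcomes: single, double, triple, home run, walk, hit-by-pitch, out.
w <- c(0.89, 1.27, 1.62, 2.10, 0.69, 0.72, 0)   # illustrative wOBA weights
z <- c(30, 10, 1, 8, 20, 2, 129)                # hypothetical tallies over 200 PAs
n <- sum(z)
pi_hat <- z / n                                 # estimated outcome probabilities
x_i <- sum(w * pi_hat)                          # point estimate: x_i = w'z / n
Sigma_hat <- diag(pi_hat) - tcrossprod(pi_hat)  # per-PA multinomial covariance estimate
s_i <- sqrt(drop(t(w) %*% Sigma_hat %*% w) / n) # standard error: s_i^2 = w' Sigma w / n
c(x = x_i, s = s_i)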
The relative complexity of wOBA makes it well suited for analysis via ebnm. With batting average, a common approach is to obtain
empirical Bayes estimates using a beta-binomial model (see, for example,
Robinson (2017)). With wOBA, one can
estimate hitting outcome probabilities by way of a Dirichlet-multinomial
model; alternatively, one can approximate the likelihood as normal and
fit an EBNM model directly to the observed wOBAs. In the following, we
take the latter approach.
The wOBA data set
We begin by loading and inspecting the wOBA
data set,
which consists of wOBAs and standard errors for the 2022 MLB regular
season:
library(ebnm)
data(wOBA)
nrow(wOBA)
#> [1] 688
head(wOBA)
#> FanGraphsID Name Team PA x s
#> 1 19952 Khalil Lee NYM 2 1.036 0.733
#> 2 16953 Chadwick Tromp ATL 4 0.852 0.258
#> 3 19608 Otto Lopez TOR 10 0.599 0.162
#> 4 24770 James Outman LAD 16 0.584 0.151
#> 5 8090 Matt Carpenter NYY 154 0.472 0.054
#> 6 15640 Aaron Judge NYY 696 0.458 0.024
Column “x” contains each player’s wOBA for the 2022 season, which we
regard as an estimate of the player’s “true” wOBA skill. Column “s”
provides standard errors.
Next, we visualize the overall distribution of wOBAs:
library(ggplot2)
ggplot(wOBA, aes(x = x)) +
geom_histogram(bins = 64, color = "white", fill = "black") +
theme_classic()
As the histogram shows, most players finished the season with a wOBA
between .200 and .400. A few had very high wOBAs (\(>\).500), while others had wOBAs at or
near zero. A casual inspection of the data suggests that players with
these very high (or very low) wOBAs were simply lucky (or unlucky). For
example, the 4 players with the highest wOBAs each had 16 PAs or fewer.
It is very unlikely that they would have sustained this high level of
production over a full season’s worth of PAs.
In contrast, Aaron Judge’s production — which included a
record-breaking number of home runs — appears to be “real,” since it was
sustained over nearly 700 PAs. Other cases are more ambiguous: how, for
example, are we to assess Matt Carpenter, who had several exceptional
seasons between 2013 and 2018 but whose output steeply declined in
2019–2021 before his surprising “comeback” in 2022? An empirical Bayes
analysis can help to answer this and other questions.
Main ebnm methods and the normal prior family
Function ebnm()
is the main interface for fitting the
empirical Bayes normal means model; it is a “Swiss army knife” that
allows for various choices of prior family \(\mathcal{G}\) and provides multiple options for fitting and tuning models. For example, we can fit a normal
means model with the prior family \(\mathcal{G}\) taken to be the family of
normal distributions:
x <- wOBA$x
s <- wOBA$s
names(x) <- wOBA$Name
names(s) <- wOBA$Name
fit_normal <- ebnm(x, s, prior_family = "normal", mode = "estimate")
(The default behavior is to fix the prior mode at zero. Since we certainly do not expect the distribution of true wOBA skill to have a mode at zero, we set mode = "estimate".)
We note in passing that the ebnm
package has a second
model-fitting interface, in which each prior family gets its own
function:
fit_normal <- ebnm_normal(x, s, mode = "estimate")
Textual and graphical overviews of results can be obtained using, respectively, the summary() and plot() methods. The summary() output looks like this:
summary(fit_normal)
#>
#> Call:
#> ebnm_normal(x = x, s = s, mode = "estimate")
#>
#> EBNM model was fitted to 688 observations with _heteroskedastic_ standard errors.
#>
#> The fitted prior belongs to the _normal_ prior family.
#>
#> 2 degrees of freedom were used to estimate the model.
#> The log likelihood is 989.64.
#>
#> Available posterior summaries: _mean_, _sd_.
#> Use method fitted() to access available summaries.
#>
#> A posterior sampler is _not_ available.
#> One can be added via function ebnm_add_sampler().
The plot()
method visualizes results, comparing the
“observed” values \(x_i\) (the initial
wOBA estimates) against the empirical Bayes posterior mean estimates
\(\hat{\theta}_i\):
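# Plot the EB posterior mean estimates against the observed wOBAs.
plot(fit_normal)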
The dashed line shows the diagonal \(x =
y\), which makes shrinkage effects clearly visible. In
particular, the most extreme wOBAs on either end of the spectrum are
strongly shrunk towards the league average (around .300).
Since plot()
returns a “ggplot” object (Wickham 2016), the plot can conveniently be
customized using ggplot2
syntax. For example, one can vary
the color of the points by the number of plate appearances:
plot(fit_normal) +
geom_point(aes(color = sqrt(wOBA$PA))) +
labs(x = "wOBA", y = "EB estimate of true wOBA skill",
color = expression(sqrt(PA))) +
scale_color_gradient(low = "blue", high = "red")
By varying the color of points, we see that the wOBA estimates with
higher standard errors or fewer plate appearances (blue points) tend to
be shrunk toward the league average much more strongly than wOBAs from
hitters with many plate appearances (red points).
Above, we used head()
to view data for the first 6
hitters in the dataset. Let’s now see what the EBNM analysis suggests
might be their “true” wOBA skill. To examine the results more closely,
we use the fitted()
method, which returns a posterior
summary for each hitter:
print(head(fitted(fit_normal)), digits = 3)
#> mean sd
#> Khalil Lee 0.303 0.0287
#> Chadwick Tromp 0.308 0.0286
#> Otto Lopez 0.310 0.0283
#> James Outman 0.311 0.0282
#> Matt Carpenter 0.339 0.0254
#> Aaron Judge 0.394 0.0184
The wOBA estimates of the first four ballplayers are shrunk strongly
toward the league average, reflecting the fact that these players had
very few plate appearances (and indeed, we were not swayed by their very
high initial wOBA estimates).
Carpenter had many more plate appearances (154) than these other four
players, but according to this model we should remain skeptical about
his strong performance; after factoring in the prior, we judge his
“true” performance to be much closer to the league average, downgrading
an initial estimate of .472 to the final posterior mean estimate of
.339.
Comparing normal and unimodal prior families
Judge’s “true” wOBA is also estimated to be much lower (.394) than the initial estimate (.458), even though he sustained a high level of production over a full season (696 PAs). For this reason, one might ask
whether a prior that is more flexible than the normal prior—that is, a
prior that can better adapt to “outliers” like Judge—might produce a
different result. The ebnm
package is very well suited to
answering this question. For example, to obtain results using the family
of all unimodal distributions rather than the family of normal
distributions, we only need to change prior_family from "normal" to "unimodal":
fit_unimodal <- ebnm(x, s, prior_family = "unimodal", mode = "estimate")
It is straightforward to produce a side-by-side visualization of the
fitted models simply by including both models as arguments to the
plot()
method (we also use the subset
argument
to focus on the results for Judge and other players with the most plate
appearances):
top50 <- order(wOBA$PA, decreasing = TRUE)
top50 <- top50[1:50]
plot(fit_normal, fit_unimodal, subset = top50)
This plot illustrates the ability of the unimodal prior to better
adapt to the data: wOBA estimates for players with a lot of plate
appearances are not adjusted quite so strongly toward the league
average. To compare in more detail, we see for example that Judge’s wOBA
estimate from the model with the unimodal prior (the
"mean2"
column) remains much closer to the original wOBA
estimate:
dat <- cbind(wOBA[, c("PA","x")],
fitted(fit_normal),
fitted(fit_unimodal))
names(dat) <- c("PA", "x", "mean1", "sd1", "mean2", "sd2")
print(head(dat), digits = 3)
#> PA x mean1 sd1 mean2 sd2
#> Khalil Lee 2 1.036 0.303 0.0287 0.302 0.0277
#> Chadwick Tromp 4 0.852 0.308 0.0286 0.307 0.0306
#> Otto Lopez 10 0.599 0.310 0.0283 0.310 0.0315
#> James Outman 16 0.584 0.311 0.0282 0.311 0.0318
#> Matt Carpenter 154 0.472 0.339 0.0254 0.355 0.0430
#> Aaron Judge 696 0.458 0.394 0.0184 0.439 0.0155
Carpenter’s wOBA estimate is also higher under the more flexible
unimodal prior, but is still adjusted much more than Judge’s in light of
Carpenter’s smaller sample size. It is also interesting that the
unimodal prior assigns greater uncertainty (the "sd2"
column) to this estimate compared to the normal prior.
Recall that the two normal means models differ only in the priors
used, so we can understand the differences in the shrinkage behavior of
these models by inspecting the priors. Calling plot()
with
incl_cdf = TRUE
shows the cumulative distribution functions
(CDFs) of the fitted priors \(\hat{g}\). Since we are particularly
interested in understanding the differences in shrinkage behavior for
the largest wOBAs such as Judge’s, we create a second plot that zooms in
on wOBAs over .350:
library(cowplot)
p1 <- plot(fit_normal, fit_unimodal, incl_cdf = TRUE, incl_pm = FALSE) +
xlim(c(.250, .350)) +
guides(color = "none")
p2 <- plot(fit_normal, fit_unimodal, incl_cdf = TRUE, incl_pm = FALSE) +
lims(x = c(.350, .450), y = c(0.95, 1))
plot_grid(p1, p2, nrow = 1, ncol = 2, rel_widths = c(3,5))
The plot on the right shows that the fitted normal prior has almost
no mass on wOBAs above .400, explaining why Judge’s wOBA estimate is
shrunk so strongly toward the league average, whereas the unimodal prior
is flexible enough to permit larger posterior estimates above .400.
The posterior means and standard deviations returned from the ebnm() call cannot by themselves be used to obtain credible intervals (except in the special case of the normal prior). Therefore, we provide the additional methods confint() and quantile(), which return, respectively, credible intervals (or, more precisely, highest posterior density intervals; Box and Tiao 1965; Chen and Shao 1999) and posterior quantiles for each observation. These are
implemented using Monte Carlo techniques, which can be slow for large
data sets, so credible intervals are not computed by default. The
following code computes 80% highest posterior density (HPD) intervals
for the EBNM model with unimodal prior. (We add a Monte Carlo sampler using the function ebnm_add_sampler(); alternatively, we could have added a sampler in our initial call to ebnm() by specifying output = ebnm_output_all().) We set a seed for reproducibility:
fit_unimodal <- ebnm_add_sampler(fit_unimodal)
set.seed(1)
print(head(confint(fit_unimodal, level = 0.8)), digits = 3)
#> CI.lower CI.upper
#> Khalil Lee 0.277 0.328
#> Chadwick Tromp 0.277 0.334
#> Otto Lopez 0.277 0.336
#> James Outman 0.277 0.335
#> Matt Carpenter 0.277 0.389
#> Aaron Judge 0.428 0.458
Interestingly, the 80% credible interval for Carpenter is very wide,
and shares the same lower bound as the first four ballplayers with very
few plate appearances.
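As an aside, the sampler could also have been requested when the model was first fit, by supplying the output argument; a sketch of this alternative (using the ebnm_output_all() helper that also appears later in this vignette):
fit_unimodal <- ebnm(x, s, prior_family = "unimodal", mode = "estimate",
                     output = ebnm_output_all())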
Nonparametric prior families and advanced usage
Above, we demonstrated how the ebnm package makes it easy to perform EBNM analyses with different types of priors, then compared results across two different choices of prior family. Each of
these families makes different assumptions about the data which, a
priori, may be more or less plausible. An alternative to prior
families that make specific assumptions about the data is to use the
prior family that contains all distributions \(\mathcal{G}_{\mathrm{npmle}}\), which is in
a sense “assumption free.” Here we re-analyze the wOBA data set to
illustrate the use of this prior family. Note that although
nonparametric priors require specialized computational techniques,
switching to a nonparametric prior is seamless in ebnm, as these implementation details are hidden. As above, we need only
make a single change to the prior_family
argument:
fit_npmle <- ebnm(x, s, prior_family = "npmle")
(Note that because the family \(\mathcal{G}_{\mathrm{npmle}}\) is not
unimodal, the mode = "estimate"
option is not relevant
here.)
Although the implementation details are hidden by default, it can
sometimes be helpful to see what is going on “behind the scenes,”
particularly for flagging or diagnosing issues. By default,
ebnm
uses the mixsqp
package (Kim et al. 2020) to fit the NPMLE \(\hat{g} \in \mathcal{G}_{\mathrm{npmle}}\).
We can monitor convergence of the mix-SQP optimization algorithm by
setting the verbose control argument to TRUE:
fit_npmle <- ebnm(x, s, prior_family = "npmle",
control = list(verbose = TRUE))
#> Running mix-SQP algorithm 0.3-48 on 688 x 95 matrix
#> convergence tol. (SQP): 1.0e-08
#> conv. tol. (active-set): 1.0e-10
#> zero threshold (solution): 1.0e-08
#> zero thresh. (search dir.): 1.0e-14
#> l.s. sufficient decrease: 1.0e-02
#> step size reduction factor: 7.5e-01
#> minimum step size: 1.0e-08
#> max. iter (SQP): 1000
#> max. iter (active-set): 20
#> number of EM iterations: 10
#> Computing SVD of 688 x 95 matrix.
#> Matrix is not low-rank; falling back to full matrix.
#> iter objective max(rdual) nnz stepsize max.diff nqp nls
#> 1 +9.583407733e-01 -- EM -- 95 1.00e+00 6.08e-02 -- --
#> 2 +8.298700300e-01 -- EM -- 95 1.00e+00 2.87e-02 -- --
#> 3 +7.955308369e-01 -- EM -- 95 1.00e+00 1.60e-02 -- --
#> 4 +7.819858634e-01 -- EM -- 68 1.00e+00 1.05e-02 -- --
#> 5 +7.753787534e-01 -- EM -- 53 1.00e+00 7.57e-03 -- --
#> 6 +7.717040208e-01 -- EM -- 49 1.00e+00 5.73e-03 -- --
#> 7 +7.694760705e-01 -- EM -- 47 1.00e+00 4.48e-03 -- --
#> 8 +7.680398878e-01 -- EM -- 47 1.00e+00 3.58e-03 -- --
#> 9 +7.670690681e-01 -- EM -- 44 1.00e+00 2.91e-03 -- --
#> 10 +7.663865515e-01 -- EM -- 42 1.00e+00 2.40e-03 -- --
#> 1 +7.658902386e-01 +6.493e-02 39 ------ ------ -- --
#> 2 +7.655151741e-01 +5.328e-02 19 1.00e+00 1.29e-01 20 1
#> 3 +7.627792563e-01 +6.432e-03 7 1.00e+00 1.74e-01 20 1
#> 4 +7.626270812e-01 +5.897e-05 7 1.00e+00 3.23e-01 7 1
#> 5 +7.626270755e-01 -1.335e-08 7 1.00e+00 4.67e-04 2 1
#> Optimization took 0.03 seconds.
#> Convergence criteria met---optimal solution found.
This output shows no issues with convergence of the optimization algorithm: after ten initial EM iterations, the mix-SQP algorithm converged to the solution (up to numerical rounding error) in only five further iterations. In some cases, convergence issues can arise when fitting nonparametric models to large or complex data sets, and revealing the details of the optimization can help to pinpoint these issues.
Next, we visually compare the three fits obtained so far:
plot(fit_normal, fit_unimodal, fit_npmle, incl_cdf = TRUE, subset = top50)
As before, estimates largely agree, differing primarily at the tails.
Both the unimodal prior family and the NPMLE are sufficiently flexible
to avoid the strong shrinkage behavior of the normal prior family.
Fits can be compared quantitatively using the logLik()
method, which, in addition to the log likelihood for each model,
usefully reports the number of free parameters (i.e., degrees of
freedom):
logLik(fit_unimodal)
#> 'log Lik.' 992.6578 (df=40)
logLik(fit_npmle)
#> 'log Lik.' 994.193 (df=94)
A nonparametric prior family \(\mathcal{G}\) is approximated by a family of mixtures of \(K\) components on a fixed grid, with the mixture proportions to be estimated. We can infer from the above output that \(\mathcal{G}_{\mathrm{npmle}}\) has been approximated as the family of mixtures over a grid of \(K = 95\) point masses spanning the range of the data. (The number of degrees of freedom is one fewer than \(K\) because the mixture proportions must always sum to 1, which removes one degree of freedom from their estimation.)
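The estimated grid can be inspected directly. The sketch below assumes that, as in recent versions of ebnm, the fitted prior is stored as fit_npmle$fitted_g, a mixture object with components pi (mixture proportions) and mean (grid locations):
g_npmle <- fit_npmle$fitted_g
length(g_npmle$pi)      # number of grid points, K
range(g_npmle$mean)     # the grid spans the range of the data
sum(g_npmle$pi > 0)     # grid points receiving nonzero weight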
The default behavior for nonparametric prior families is to choose \(K\) such that the log likelihood obtained using the estimate \(\hat{g}\) is expected to be (on average) within one log-likelihood unit of the optimum over the entire nonparametric family \(\mathcal{G}\) (see Willwerscheid 2021). Thus, a finer approximating grid should not
yield a large improvement in the log-likelihood. We can check this by
using ebnm_scale_npmle()
to create a finer grid:
scale_npmle <- ebnm_scale_npmle(x, s, KLdiv_target = 0.001/length(x),
max_K = 1000)
fit_npmle_finer <- ebnm_npmle(x, s, scale = scale_npmle)
logLik(fit_npmle)
#> 'log Lik.' 994.193 (df=94)
logLik(fit_npmle_finer)
#> 'log Lik.' 994.2703 (df=528)
As the theory predicts, a much finer grid, with \(K = 529\), results in only a modest
improvement in the log-likelihood. ebnm
provides similar
functions to customize grids for unimodal and normal scale mixture prior
families.
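For instance, assuming the analogously named helpers ebnm_scale_normalmix() and ebnm_scale_unimix() (see the package documentation for their exact arguments), custom grids for these families could be constructed in the same way:
scale_smn <- ebnm_scale_normalmix(x, s)  # grid for scale mixtures of normals
scale_uni <- ebnm_scale_unimix(x, s)     # grid for the general unimodal family
fit_smn <- ebnm_normal_scale_mixture(x, s, scale = scale_smn)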
One potential issue with the NPMLE is that, since it is discrete (as the above CDF plot makes apparent), each observation is shrunk towards one of a relatively small number of support points, which can result in poor interval estimates. For illustration, we calculate 10% and 90% posterior quantiles:
fit_npmle <- ebnm_add_sampler(fit_npmle)
print(head(quantile(fit_npmle, probs = c(0.1, 0.9))), digits = 3)
#> 10% 90%
#> Khalil Lee 0.276 0.342
#> Chadwick Tromp 0.276 0.342
#> Otto Lopez 0.276 0.342
#> James Outman 0.276 0.342
#> Matt Carpenter 0.309 0.430
#> Aaron Judge 0.419 0.430
Each credible interval bound is constrained to lie at one of the
support points of the NPMLE \(\hat{g}\). The interval estimate for Judge
strikes us as far too narrow. Indeed, the NPMLE can sometimes yield
degenerate interval estimates:
confint(fit_npmle, level = 0.8, parm = "Aaron Judge")
#> CI.lower CI.upper
#> Aaron Judge 0.4298298 0.4298298
To address this issue, the deconvolveR
package (Narasimhan and Efron 2020) uses a penalized
likelihood that encourages “smooth” priors \(\hat{g}\); that is, priors \(\hat{g}\) for which few of the mixture
proportions are zero:
fit_deconv <- ebnm_deconvolver(x / s, output = ebnm_output_all())
plot(fit_deconv, incl_cdf = TRUE, incl_pm = FALSE)
Note, however, that the “true” means \(\theta\) being estimated are \(z\)-scores rather than raw wOBA skill, as
package deconvolveR
fits a model to \(z\)-scores rather than observations and
associated standard errors. While this is reasonable in many settings,
it does not seem appropriate for the present analysis:
print(head(quantile(fit_deconv, probs = c(0.1, 0.9)) * s), digits = 3)
#> 10% 90%
#> Khalil Lee 0.400 2.000
#> Chadwick Tromp 0.563 1.127
#> Otto Lopez 0.354 0.796
#> James Outman 0.412 0.742
#> Matt Carpenter 0.413 0.531
#> Aaron Judge 0.406 0.459
These interval estimates do not match our basic intuitions; for
example, a wOBA over .600 has never been sustained over a full
season.
References
Bhadra, Anindya, Jyotishka Datta, Nicholas G. Polson, and Brandon
Willard. 2019. “Lasso Meets Horseshoe: A Survey.”
Statistical Science 34 (3): 405–27.
Box, George E. P., and George C. Tiao. 1965. “Multiparameter Problems from a Bayesian Point of View.” Annals of Mathematical Statistics 36 (5): 1468–82.
Brown, Lawrence D. 2008. “In-Season Prediction of Batting Averages: A Field Test of Empirical Bayes and Bayes Methodologies.” Annals of Applied Statistics 2 (1): 113–52.
Chen, Ming-Hui, and Qi-Man Shao. 1999. “Monte Carlo Estimation of Bayesian Credible and HPD Intervals.” Journal of Computational and Graphical Statistics 8 (1): 69–92.
Efron, Bradley. 2010. Large-Scale Inference: Empirical
Bayes Methods for Estimation, Testing, and Prediction.
Vol. 1. Institute of Mathematical Statistics Monographs. Cambridge, UK:
Cambridge University Press.
Efron, Bradley, and Carl Morris. 1972. “Limiting the Risk of
Bayes and Empirical Bayes
Estimators—Part II: The Empirical
Bayes Case.” Journal of the American Statistical
Association 67 (337): 130–39.
Gu, Jiaying, and Roger Koenker. 2017. “Empirical
Bayesball Remixed: Empirical Bayes Methods for
Longitudinal Data.” Journal of Applied Econometrics 32
(3): 575–99.
Jiang, Wenhua, and Cun-Hui Zhang. 2010. “Empirical
Bayes In-Season Prediction of Baseball Batting
Averages.” In Borrowing Strength: Theory Powering
Applications—a Festschrift for Lawrence
D. Brown, 6:263–73. Institute of
Mathematical Statistics Collections. Beachwood, OH: Institute of
Mathematical Statistics.
Johnstone, Iain. 2019.
“Gaussian Estimation: Sequence and Wavelet
Models.” https://imjohnstone.su.domains/.
Kim, Youngseok, Peter Carbonetto, Matthew Stephens, and Mihai Anitescu.
2020. “A Fast Algorithm for Maximum Likelihood Estimation of
Mixture Proportions Using Sequential Quadratic Programming.”
Journal of Computational and Graphical Statistics 29 (2):
261–73.
Morris, Carl N. 1983. “Parametric Empirical Bayes
Inference: Theory and Applications.” Journal of the American
Statistical Association 78 (381): 47–55.
Narasimhan, Balasubramanian, and Bradley Efron. 2020.
“deconvolveR: A g-Modeling Program for Deconvolution and Empirical
Bayes Estimation.” Journal of Statistical
Software 94 (11): 1–20.
Robbins, Herbert. 1951. “Asymptotically Subminimax Solutions of
Compound Statistical Decision Problems.” In Proceedings of
the Second Berkeley Symposium on
Mathematical Statistics and
Probability, 1951, Vol. II, 131–49.
University of California Press, Berkeley; Los Angeles, CA.
———. 1956. “An Empirical Bayes Approach to
Statistics.” In Proceedings of the Third
Berkeley Symposium on
Mathematical Statistics and
Probability, 1956, Vol. I, 157–63.
University of California Press, Berkeley; Los Angeles, CA.
Robinson, David. 2017.
“Introduction to Empirical
Bayes: Examples from Baseball Statistics.” https://github.com/dgrtwo/empirical-bayes-book.
Sharpe, Sam. 2019.
“An Introduction to Expected Weighted
On-Base Average (xwOBA).” MLB Technology
Blog.
https://technology.mlblogs.com/an-introduction-to-expected-weighted-on-base-average-xwoba-29d6070ba52b.
Stephens, Matthew. 2017. “False Discovery Rates: A New
Deal.” Biostatistics 18 (2): 275–94.
Sun, Lei. 2020. “Topics on Empirical Bayes Normal Means.”
PhD thesis, Chicago, IL: University of Chicago.
Tango, T. M., M. G. Lichtman, and A. E. Dolphin. 2006. The Book:
Playing the Percentages in Baseball. TMA Press.
Wickham, Hadley. 2016. ggplot2: Elegant
Graphics for Data Analysis. New York, NY: Springer-Verlag.
Willwerscheid, Jason. 2021. “Empirical Bayes Matrix
Factorization: Methods and Applications.” PhD thesis, Chicago,
IL: University of Chicago.