This vignette showcases the gghalves extension by going through the
individual _half_ geoms and explaining their usage and function
arguments.
The general idea of gghalves stems from this StackOverflow question on
how to plot a hybrid boxplot. This led me to develop the ggpol extension
for ggplot2. However, ggpol has become a sort of aggregation for all
kinds of geoms over time, and that, together with the observation that
many things can be cut in half, ultimately led to this library.
The idea is that many geoms that aggregate data, such as geom_boxplot,
geom_violin and geom_dotplot, are (near) symmetric. Given that the space
to display information is limited, we can make better use of it by
cutting the geoms in half and displaying additional geoms that, e.g.,
give information about the sample size.
GeomHalfPoint, perhaps counterintuitively, does not display a literal
half-circle. Rather, it plots the data points so that they occupy only
one side of the space allotted to each group, leaving the other side
free for another _half_ geom. Further, by default geom_half_point
jitters the points horizontally and vertically.
ggplot(iris, aes(x = Species, y = Sepal.Width)) +
geom_half_point()
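The way this works is that transformation = PositionJitter is passed to
the geom. We could adjust the default values of this transformation by
passing along a transformation_params argument, although the warning
printed below shows that the gghalves version used to render this
example ignores that argument: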
ggplot(iris, aes(x = Species, y = Sepal.Width)) +
geom_half_point(transformation_params = list(height = 0, width = 0.001, seed = 1))
#> Warning in geom_half_point(transformation_params = list(height = 0, width =
#> 0.001, : Ignoring unknown parameters: `transformation_params`
or we could change the transformation argument itself:
ggplot(iris, aes(x = Species, y = Sepal.Width)) +
geom_half_point(transformation = PositionIdentity)
Making the transformation work with custom Positions from ggplot2
extensions is something that will hopefully be included in future
updates of this package.
Sometimes we want to color points within the aes() groupings. In that
case, we can make use of geom_half_point_panel().
ggplot(iris, aes(y = Sepal.Width)) +
geom_half_boxplot() +
geom_half_point_panel(aes(x = 0.5, color = Species), range_scale = .5)
Like all _half_ geoms, geom_half_point also takes a side argument, with
"l" for left and "r" for right.
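To make the side argument concrete, here is a minimal sketch that places the points on the left and a half-boxplot on the right of each species. It only uses arguments that appear elsewhere in this vignette; the object name p_sides is purely illustrative.

```r
library(ggplot2)
library(gghalves)

# Points go to the left half, the boxplot to the right half of each slot.
p_sides <- ggplot(iris, aes(x = Species, y = Sepal.Width)) +
  geom_half_point(side = "l") +
  geom_half_boxplot(side = "r")

p_sides
```

Swapping the two side values mirrors the layout.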
GeomHalfBoxplot displays a boxplot that is cut in half and plotted
either on the left or right side of the space allotted to the specific
factor on the x-axis.
ggplot(iris, aes(x = Species, y = Sepal.Width)) +
geom_half_boxplot()
In addition to the standard side argument, you can also center the
half-boxplot and decide whether an errorbar is drawn.
ggplot(iris, aes(x = Species, y = Sepal.Width)) +
geom_half_boxplot(side = "r", center = TRUE, errorbar.draw = FALSE)
GeomHalfViolin draws a half-violin plot. Besides the side argument, it
supports all the arguments that can be passed to the standard
GeomViolin.
ggplot(iris, aes(x = Species, y = Sepal.Width)) +
geom_half_violin()
Furthermore, if we have a binary grouping variable (such as
control/treatment), we can plot side-by-side comparisons with the
optional split aesthetic:
ggplot() +
geom_half_violin(
data = ToothGrowth,
aes(x = as.factor(dose), y = len, split = supp, fill = supp),
position = "identity"
)
GeomHalfDotplot is slightly different from the other _half_ geoms in
that it does not support a side argument, since this is already
inherently built into the standard GeomDotplot via stackdir:
ggplot(iris, aes(x = Species, y = Sepal.Width)) +
geom_half_violin() +
geom_dotplot(binaxis = "y", method="histodot", stackdir="up")
#> Bin width defaults to 1/30 of the range of the data. Pick better value with
#> `binwidth`.
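The mirrored layout follows the same pattern. As a rough sketch, stackdir = "down" stacks the dots towards the left, so the half-violin is moved to the right with side = "r". The binwidth of 0.1 is an arbitrary value chosen here to silence the bin-width message, and p_down is an illustrative name.

```r
library(ggplot2)
library(gghalves)

# Dots stack towards the left ("down"), the half-violin sits on the right.
p_down <- ggplot(iris, aes(x = Species, y = Sepal.Width)) +
  geom_half_violin(side = "r") +
  geom_dotplot(
    binaxis = "y",
    method = "histodot",
    stackdir = "down",
    binwidth = 0.1  # arbitrary value, assumed for illustration
  )

p_down
```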
So, given that geom_dotplot can be used as a _half_ geom, why the need
for geom_half_dotplot? The reason is that geom_dotplot does not support
dodging when there are multiple factors in play. Let’s consider the
following example:
df <- data.frame(score = rgamma(150, 4, 1),
                 gender = sample(c("M", "F"), 150, replace = TRUE),
                 genotype = factor(sample(1:3, 150, replace = TRUE)))
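Because the data are drawn at random, the group sizes change from run to run. The following sketch builds a reproducible copy under an arbitrarily chosen seed and cross-tabulates the two factors, which shows how many points each dodged half will contain. The seed and the names df_seeded and group_sizes are assumptions made for illustration; they are not part of the original example.

```r
set.seed(42)  # arbitrary seed, assumed only to make the draw reproducible

# Same structure as the simulated data above, stored under a separate name.
df_seeded <- data.frame(
  score = rgamma(150, 4, 1),
  gender = sample(c("M", "F"), 150, replace = TRUE),
  genotype = factor(sample(1:3, 150, replace = TRUE))
)

# Count the observations in every genotype-by-gender combination.
group_sizes <- table(genotype = df_seeded$genotype, gender = df_seeded$gender)
group_sizes
```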
Given this data, we want to group by genotype, but also separate the
plots by gender. This does not quite work using the standard geom:
ggplot(df, aes(x = genotype, y = score, fill = gender)) +
geom_half_violin() +
geom_dotplot(binaxis = "y", method="histodot", stackdir="up", position = PositionDodge)
#> Bin width defaults to 1/30 of the range of the data. Pick better value with
#> `binwidth`.
Using geom_half_dotplot, however, we can make this work:
ggplot(df, aes(x = genotype, y = score, fill = gender)) +
geom_half_violin() +
geom_half_dotplot(method="histodot", stackdir="up")
#> Bin width defaults to 1/30 of the range of the data. Pick better value with
#> `binwidth`.
As mentioned in the package description, gghalves can work well in
combination with certain ggplot2 extensions. One of them is
geom_beeswarm of the ggbeeswarm package. Note that, currently, you will
need to install the latest version from GitHub to support the passing of
beeswarmArgs.
ggplot(iris, aes(x = Species, y = Sepal.Width)) +
geom_half_boxplot() +
geom_beeswarm(beeswarmArgs = list(side = 1))
Lastly, let us remake the plot displayed in the GitHub Readme. It is
for display purposes only, and thus uses a lot of filtering and a lot of
geoms…
ggplot() +
geom_half_boxplot(
data = iris %>% filter(Species=="setosa"),
aes(x = Species, y = Sepal.Length, fill = Species), outlier.color = NA) +
ggbeeswarm::geom_beeswarm(
data = iris %>% filter(Species=="setosa"),
aes(x = Species, y = Sepal.Length, fill = Species, color = Species), beeswarmArgs=list(side=+1)
) +
geom_half_violin(
data = iris %>% filter(Species=="versicolor"),
aes(x = Species, y = Sepal.Length, fill = Species), side="r") +
geom_half_dotplot(
data = iris %>% filter(Species=="versicolor"),
aes(x = Species, y = Sepal.Length, fill = Species), method="histodot", stackdir="down") +
geom_half_boxplot(
data = iris %>% filter(Species=="virginica"),
aes(x = Species, y = Sepal.Length, fill = Species), side = "r", errorbar.draw = TRUE,
outlier.color = NA) +
geom_half_point(
data = iris %>% filter(Species=="virginica"),
aes(x = Species, y = Sepal.Length, fill = Species, color = Species), side = "l") +
scale_fill_manual(values = c("setosa" = "#cba1d2", "versicolor"="#7067CF","virginica"="#B7C0EE")) +
scale_color_manual(values = c("setosa" = "#cba1d2", "versicolor"="#7067CF","virginica"="#B7C0EE")) +
theme(legend.position = "none")