bestLogConstant, which uses the same machinery to pick the best value of a constant to use when logging a variable, e.g. the one that makes the distribution look the most normal. This is especially useful for non-positive or zero-inflated data. Currently experimental.
step_orderNorm() to work with parallel processing.
step_best_normalize() to work with parallel processing.
boxcox, in response to issue 10; thank you to Krzysztof Dyba (kadyb) for the suggestions.
yeojohnson, thanks to Emil Hvitfeldt (EmilHvitfeldt) for his work on this problem for the recipes package.
tidy method to work more generally and provide easy access to chosen transformations (responding to issue 9)
usethis, in response to issue 7
n_logit_fit argument, with a default of 10000. This should substantially decrease memory use of orderNorm while only minimally affecting the out-of-domain approximations.
step_bestNormalize to step_best_normalize, responding to issue 8
LambertW transformation types (thank you to Georg M. Goerg, the author of LambertW, for pointing this out).
center_scale transform as default when standardize == TRUE
T and F to TRUE and FALSE
scales and ggplot2 to visualize all transformations.
butcher and axe functionality in order to improve scalability of step_* functions
tidy functionality with bestNormalize and step_best_normalize
bestNormalize
standardize option from no_transform so x.t always matches the input vector.
step_bestNormalize and step_orderNorm functions for implementation within recipes.
warn = FALSE when calling bestNormalize. If a transformation doesn’t work, warnings will no longer be shown by default unless warn is set to TRUE.
plot.bestNormalize, which was improperly labeling transformations
exp_x having trouble with the standardize option, so added option allow_exp_x to bestNormalize to allow a workaround, and changed it so that if any infinite values are produced during the transformation, exp_x will not work (that way, bestNormalize will not include it in its results).
quiet is FALSE and length(x) > 2000
loo for leave-one-out cross-validation
bestNormalize function via allow_lambert_h argument.
Added feature to estimate out-of-sample normality statistics in bestNormalize instead of in-sample ones via repeated cross-validation
out_of_sample = FALSE to maintain backward-compatibility with prior versions and set allow_orderNorm = FALSE as well so that it isn’t automatically selected
Improved extrapolation of the ORQ (orderNorm) method
Added plotting feature for transformation objects
Cleared up some documentation
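
Several of the entries above concern arguments to bestNormalize (out_of_sample, allow_orderNorm, warn). The following is a minimal sketch of how those arguments might be combined to approximate the older in-sample behavior. It assumes the bestNormalize package is installed; the simulated data and the variable names (x, bn, x_t, x_back) are illustrative only and do not come from the changelog.

```r
# Minimal sketch, assuming the bestNormalize package is installed.
library(bestNormalize)

set.seed(100)
x <- rgamma(250, shape = 1, rate = 1)  # illustrative right-skewed data

# In-sample normality statistics, with orderNorm excluded from selection,
# as the backward-compatibility entry above describes.
bn <- bestNormalize(
  x,
  out_of_sample = FALSE,    # use in-sample statistics instead of cross-validation
  allow_orderNorm = FALSE,  # keep orderNorm from being selected automatically
  warn = TRUE,              # show warnings when a transformation fails
  quiet = TRUE
)

x_t <- predict(bn)                                    # transformed values
x_back <- predict(bn, newdata = x_t, inverse = TRUE)  # back-transform

# Inspect which transformation was chosen
print(class(bn$chosen_transform)[1])
```

Which transformation is chosen depends on the data, so no particular output should be expected from the final line.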