HDNRA

License: GPL-3.0

The R package HDNRA includes the latest methods based on the normal-reference approach for testing the equality of the mean vectors of high-dimensional samples with possibly different covariance matrices. HDNRA also demonstrates how these tests are implemented, covering not only the two-sample problem but also the general linear hypothesis testing (GLHT) problem, and it provides easy, user-friendly access to them. The tests for both problems are coded in C++ via Rcpp to keep execution time reasonable. Besides Rcpp, the package has no strict dependencies, so it offers a stable, self-contained toolbox that invites re-use.

There are:

Two real data sets in HDNRA

Seven normal-reference tests for the two-sample problem in HDNRA

Five normal-reference tests for the GLHT problem in HDNRA

Four existing tests for the two-sample problem in HDNRA

Five existing tests for the GLHT problem in HDNRA

Installation

You can install and load the most recent development version of HDNRA from GitHub with:

# Installing from GitHub requires the devtools or remotes package, so install one of them first
install.packages("devtools")
# Or
install.packages("remotes")

# install the most recent development version from GitHub
devtools::install_github("nie23wp8738/HDNRA")
# Or
remotes::install_github("nie23wp8738/HDNRA")
# load the package once it is installed
library(HDNRA)
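
For scripts that may run on machines where HDNRA is already present, it can be convenient to check for the package before installing. The following is a minimal sketch using base R's requireNamespace(); the variable name has_hdnra is only illustrative, and the install step is left as a printed hint so that nothing is downloaded implicitly.

```r
# Check whether HDNRA is already installed, without attaching it
has_hdnra <- requireNamespace("HDNRA", quietly = TRUE)

if (has_hdnra) {
  # Package is available: attach it as usual
  library(HDNRA)
} else {
  # Installation is left to the user; this only prints the command shown above
  message('HDNRA not found; run remotes::install_github("nie23wp8738/HDNRA")')
}
```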

Usage

Load the package

library(HDNRA)

Example data

Package HDNRA comes with two real data sets:

# A COVID19 data set from NCBI with ID GSE152641 for the two-sample problem.
?COVID19
data(COVID19)
dim(COVID19)
#> [1]    87 20460
group1 <- as.matrix(COVID19[c(2:19, 82:87), ]) ## healthy group
dim(group1)
#> [1]    24 20460
group2 <- as.matrix(COVID19[-c(1:19, 82:87), ]) ## COVID-19 patients
dim(group2)
#> [1]    62 20460

# A corneal data set acquired during a keratoconus study for the GLHT problem.
?corneal
data(corneal)
dim(corneal)
#> [1]  150 2000
group1 <- as.matrix(corneal[1:43, ]) ## normal group
dim(group1)
#> [1]   43 2000
group2 <- as.matrix(corneal[44:57, ]) ## unilateral suspect group
dim(group2)
#> [1]   14 2000
group3 <- as.matrix(corneal[58:78, ]) ## suspect map group
dim(group3)
#> [1]   21 2000
group4 <- as.matrix(corneal[79:150, ]) ## clinical keratoconus group
dim(group4)
#> [1]   72 2000
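
The four corneal groups above are consecutive row blocks, so the split can also be written from a vector of group sizes. The sketch below illustrates this on a placeholder matrix of random numbers with 150 rows (not the real corneal data, and with only 5 columns to keep it small); the names x, labels, groups and sizes are illustrative only.

```r
# Placeholder matrix standing in for corneal: 150 rows, random values
set.seed(1)
x <- matrix(rnorm(150 * 5), nrow = 150)

# Group names and sizes taken from the row ranges 1:43, 44:57, 58:78, 79:150
group_names <- c("normal", "unilateral suspect", "suspect map", "clinical keratoconus")
labels <- factor(rep(group_names, times = c(43, 14, 21, 72)), levels = group_names)

# Split the row indices by label, then extract one matrix per group
groups <- lapply(split(seq_len(nrow(x)), labels),
                 function(idx) x[idx, , drop = FALSE])

# Sanity check: the groups should cover every row exactly once
sizes <- vapply(groups, nrow, integer(1))
stopifnot(sum(sizes) == nrow(x))
```

A check like the final stopifnot() can catch an off-by-one error in the row ranges before any test is run.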

Example for two-sample problem

A simple example of how to use one of the normal-reference tests, ZWZ2023.TSBF.2cNRT, on the COVID19 data set:

data("COVID19")
group1 <- as.matrix(COVID19[c(2:19, 82:87), ]) # healthy group1
group2 <- as.matrix(COVID19[-c(1:19, 82:87), ]) # patients group2
# Each data matrix is passed here as n by p (rows are observations); transpose the data first if it is stored as p by n
ZWZ2023.TSBF.2cNRT(group1, group2)
#> 
#> Results of Hypothesis Test
#> --------------------------
#> 
#> Test name:                       Zhu et al. (2023)'s test
#> 
#> Null Hypothesis:                 Difference between two mean vectors is 0
#> 
#> Alternative Hypothesis:          Difference between two mean vectors is not 0
#> 
#> Data:                            group1 and group2
#> 
#> Sample Sizes:                    n1 = 24
#>                                  n2 = 62
#> 
#> Sample Dimension:                20460
#> 
#> Test Statistic:                  T[ZWZ] = 4.1877
#> 
#> Approximation method to the      2-c matched chi^2-approximation
#> null distribution of T[ZWZ]: 
#> 
#> Approximation parameter(s):      df1 =   2.7324
#>                                  df2 = 171.7596
#> 
#> P-value:                         0.008672887
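
The reported P-value can be related to the printed approximation parameters. The sketch below assumes that df1 and df2 are the degrees of freedom of an F-type reference distribution for T[ZWZ]; that reading is an assumption, not something stated in the output above. If it holds, the upper-tail probability computed here should be close to the reported P-value, up to rounding of the printed values.

```r
# Values copied from the printed output above
stat <- 4.1877
df1 <- 2.7324
df2 <- 171.7596

# Upper-tail probability under an F(df1, df2) reference distribution
# (assumption: the test statistic is compared against an F distribution)
p_value <- pf(stat, df1, df2, lower.tail = FALSE)
print(p_value)
```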

Example for GLHT problem

A simple example of how to use one of the normal-reference tests, ZZG2022.GLHTBF.2cNRT, on the corneal data set:

data("corneal")
dim(corneal)
#> [1]  150 2000
group1 <- as.matrix(corneal[1:43, ]) ## normal group
group2 <- as.matrix(corneal[44:57, ]) ## unilateral suspect group
group3 <- as.matrix(corneal[58:78, ]) ## suspect map group
group4 <- as.matrix(corneal[79:150, ]) ## clinical keratoconus group
p <- ncol(corneal) # dimension of each observation
k <- 4 # number of groups
Y <- list(group1, group2, group3, group4)
n <- sapply(Y, nrow) # sample size of each group
# Contrast matrix [I, -1]: compares each of the first k-1 group means with the last one
G <- cbind(diag(k - 1), rep(-1, k - 1))
ZZG2022.GLHTBF.2cNRT(Y, G, n, p)
#> 
#> Results of Hypothesis Test
#> --------------------------
#> 
#> Test name:                       Zhang et al. (2022)'s test
#> 
#> Null Hypothesis:                 The general linear hypothesis is true
#> 
#> Alternative Hypothesis:          The general linear hypothesis is not true
#> 
#> Data:                            Y
#> 
#> Sample Sizes:                    n1 = 43
#>                                  n2 = 14
#>                                  n3 = 21
#>                                  n4 = 72
#> 
#> Sample Dimension:                2000
#> 
#> Test Statistic:                  T[ZZG] = 159.7325
#> 
#> Approximation method to the      2-c matched chi^2-approximation
#> null distribution of T[ZZG]: 
#> 
#> Approximation parameter(s):      df   = 6.1652
#>                                  beta = 6.1464
#> 
#> P-value:                         0.0002577084
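
The matrix G used above has the form [I, -1], so each of its rows subtracts the last group's mean vector from one of the other groups' mean vectors. Multiplying G by a matrix whose rows are the group means therefore gives all zeros exactly when the mean vectors are equal. The sketch below checks this on small, made-up mean matrices; M_equal and M_diff are hypothetical and unrelated to the corneal data.

```r
k <- 4
G <- cbind(diag(k - 1), rep(-1, k - 1))

# Hypothetical k x 3 matrix of group means in which every row is (1, 2, 3)
M_equal <- matrix(rep(c(1, 2, 3), each = k), nrow = k)

# Same matrix, but with the second group's mean shifted by 0.5
M_diff <- M_equal
M_diff[2, ] <- M_diff[2, ] + 0.5

# Row i of G %*% M is (mean of group i) - (mean of group k)
null_holds <- all(G %*% M_equal == 0)  # all group means equal
null_fails <- any(G %*% M_diff != 0)   # group 2 differs from group 4
print(c(null_holds, null_fails))
```

Other choices of G encode other linear hypotheses about the group means; this particular form corresponds to testing that all k mean vectors coincide.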

Code of Conduct

Please note that the HDNRA project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.