In this vignette, we demonstrate the unsegmented block bootstrap functionality implemented in nullranges. “Unsegmented” means that this implementation does not consider a segmentation of the genome when sampling blocks; see the segmented block bootstrap vignette for the alternative implementation.

Timing on DHS peaks

First, we use DNase hypersensitivity (DHS) peaks in A549 cells, downloaded from AnnotationHub and pre-processed as described in the nullrangesData package.

library(nullrangesData)
dhs <- DHSA549Hg38()
library(nullranges)

The following chunk of code times four block bootstrap/permutation schemes: permutation and bootstrap within chromosome, then permutation and bootstrap across chromosomes. The defaults are type="bootstrap" and withinChrom=FALSE, that is, bootstrapping with blocks allowed to move across chromosomes.

set.seed(5) # reproducibility
library(microbenchmark)
blockLength <- 5e5
microbenchmark(
  list=alist(
    p_within=bootRanges(dhs, blockLength=blockLength,
                        type="permute", withinChrom=TRUE),
    b_within=bootRanges(dhs, blockLength=blockLength,
                        type="bootstrap", withinChrom=TRUE),
    p_across=bootRanges(dhs, blockLength=blockLength,
                        type="permute", withinChrom=FALSE),
    b_across=bootRanges(dhs, blockLength=blockLength,
                        type="bootstrap", withinChrom=FALSE)
  ), times=10)
## Unit: milliseconds
##      expr       min        lq      mean    median        uq       max neval cld
##  p_within 1611.0315 1643.5174 1771.0713 1659.9201 1703.3682 2736.9732    10   c
##  b_within 1451.0458 1484.8843 1523.6632 1532.7313 1562.1934 1606.0634    10  b 
##  p_across  358.9839  385.4329  406.2684  416.1468  421.4092  442.6818    10 a  
##  b_across  406.2425  433.9784  452.2765  452.4915  459.0252  518.9751    10 a
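
To make the difference between the two type options concrete, here is a minimal base-R sketch of permuting versus bootstrapping a set of blocks, in the usual sense of those terms. It is a toy illustration only: block indices stand in for genomic blocks, the variable names are made up for this example, and the details of how nullranges places blocks may differ.

```r
# Toy illustration only: integers 1..6 stand in for genomic blocks.
set.seed(1)
n_blocks <- 6
blocks <- seq_len(n_blocks)

# Permutation: every block is used exactly once, in shuffled order
permuted <- sample(blocks)

# Bootstrap: blocks are drawn with replacement, so some blocks may
# appear more than once and others not at all
bootstrapped <- sample(blocks, size = n_blocks, replace = TRUE)

permuted
bootstrapped
```

A permutation therefore preserves the total set of blocks, while a bootstrap sample only preserves their number.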

Visualize on synthetic data

We create some synthetic ranges in order to visualize the different options of the unsegmented bootstrap implemented in nullranges.

library(GenomicRanges)
seq_nms <- rep(c("chr1","chr2","chr3"),c(4,5,2))
gr <- GRanges(seqnames=seq_nms,
              IRanges(start=c(1,101,121,201,
                              101,201,216,231,401,
                              1,101),
                      width=c(20, 5, 5, 30,
                              20, 5, 5, 5, 30,
                              80, 40)),
              seqlengths=c(chr1=300,chr2=450,chr3=200),
              chr=factor(seq_nms))
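
As a quick sanity check on these coordinates, the base-R sketch below recomputes each range's end and the 100 bp tile that its start falls in (100 is the block length used in the within-chromosome examples). The data frame toy, the vector seq_len_bp, and the column tile are illustrative names for this sketch, not part of nullranges or GenomicRanges.

```r
# Same coordinates as `gr`, held in a plain data frame for illustration
toy <- data.frame(
  chr   = rep(c("chr1", "chr2", "chr3"), c(4, 5, 2)),
  start = c(1, 101, 121, 201,
            101, 201, 216, 231, 401,
            1, 101),
  width = c(20, 5, 5, 30,
            20, 5, 5, 5, 30,
            80, 40)
)
seq_len_bp <- c(chr1 = 300, chr2 = 450, chr3 = 200)

# Inclusive end coordinate: end = start + width - 1
toy$end <- toy$start + toy$width - 1

# 1-based index of the 100 bp tile containing each range's start
toy$tile <- (toy$start - 1) %/% 100 + 1

# Every range should fit inside its chromosome
stopifnot(all(toy$end <= seq_len_bp[toy$chr]))
toy
```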

The following helper function uses plotgardener to plot the ranges. Note that the chr metadata column is used to color each range by its chromosome of origin.

suppressPackageStartupMessages(library(plotgardener))
plotGRanges <- function(gr) {
  pageCreate(width = 5, height = 2, xgrid = 0,
             ygrid = 0, showGuides = FALSE)
  for (i in seq_along(seqlevels(gr))) {
    chrom <- seqlevels(gr)[i]
    chromend <- seqlengths(gr)[[chrom]]
    suppressMessages({
      p <- pgParams(chromstart = 0, chromend = chromend,
                    x = 0.5, width = 4*chromend/500, height = 0.5,
                    at = seq(0, chromend, 50),
                    fill = colorby("chr", palette=palette.colors))
      prngs <- plotRanges(data = gr, params = p,
                          chrom = chrom,
                          y = 0.25 + (i-1)*.7,
                          just = c("left", "bottom"))
      annoGenomeLabel(plot = prngs, params = p, y = 0.30 + (i-1)*.7)
    })
  }
}
plotGRanges(gr)

Within chromosome

Visualizing two permutations of blocks within chromosome:

for (i in 1:2) {
  gr_prime <- bootRanges(gr, blockLength=100, type="permute", withinChrom=TRUE)
  plotGRanges(gr_prime)
}

Visualizing two bootstraps within chromosome:

for (i in 1:2) {
  gr_prime <- bootRanges(gr, blockLength=100, withinChrom=TRUE)
  plotGRanges(gr_prime)
}
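
The idea behind the within-chromosome plots above can be sketched in a few lines of base R. This is a simplified illustration under stated assumptions (fixed, non-overlapping tiles; each range assigned to the tile containing its start; widths ignored; a chromosome length that is a multiple of the block length), not the algorithm nullranges uses internally. The helper shuffle_blocks() and its arguments are hypothetical.

```r
# Hypothetical helper: rearrange the tiles of one chromosome and carry
# the range starts along with them.
#   starts:    range start positions (1-based) on a single chromosome
#   L_c:       chromosome length
#   block_len: tile width
#   replace:   FALSE = permute tiles, TRUE = bootstrap tiles
shuffle_blocks <- function(starts, L_c, block_len, replace = FALSE) {
  n_tiles <- ceiling(L_c / block_len)
  tile_of <- (starts - 1) %/% block_len + 1   # source tile of each range
  # src[slot] is the source tile copied into destination slot `slot`
  src <- sample(n_tiles, n_tiles, replace = replace)
  out <- numeric(0)
  for (slot in seq_len(n_tiles)) {
    hit <- which(tile_of == src[slot])
    # Shift ranges from their source tile into the destination slot
    out <- c(out, starts[hit] + (slot - src[slot]) * block_len)
  }
  sort(out)
}

set.seed(1)
chr1_starts <- c(1, 101, 121, 201)   # chr1 starts from the synthetic data
shuffle_blocks(chr1_starts, L_c = 300, block_len = 100)                  # permutation
shuffle_blocks(chr1_starts, L_c = 300, block_len = 100, replace = TRUE)  # bootstrap
```

With replace = FALSE the number of ranges is preserved and only their tile changes; with replace = TRUE the number of ranges can go up or down, because a tile may be drawn several times or not at all.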

Across chromosome

Visualizing two permutations of blocks across chromosomes. Here we use larger blocks than before (200 rather than 100).

for (i in 1:2) {
  gr_prime <- bootRanges(gr, blockLength=200, type="permute", withinChrom=FALSE)
  plotGRanges(gr_prime)
}

Visualizing two bootstraps across chromosome:

for (i in 1:2) {
  gr_prime <- bootRanges(gr, blockLength=200, withinChrom=FALSE)
  plotGRanges(gr_prime)
}
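
For the across-chromosome case, the same idea applies with the blocks pooled over all chromosomes, so a block can land on a chromosome other than the one it came from. The base-R sketch below only builds the source-to-destination mapping of tiles for the toy genome; it is a simplified illustration with hypothetical names (tiles, src, mapping) and fixed tile boundaries, not a description of how nullranges draws blocks internally.

```r
# Fixed tiles over the whole toy genome, block length 200
seq_len_bp <- c(chr1 = 300, chr2 = 450, chr3 = 200)
block_len <- 200
tiles <- do.call(rbind, lapply(names(seq_len_bp), function(chr) {
  data.frame(chr = chr,
             start = seq(1, seq_len_bp[[chr]], by = block_len))
}))

set.seed(1)
# Bootstrap: for each destination tile, draw a source tile from anywhere
# in the genome, with replacement
src <- sample(nrow(tiles), nrow(tiles), replace = TRUE)
mapping <- data.frame(dest_chr   = tiles$chr,
                      dest_start = tiles$start,
                      src_chr    = tiles$chr[src],
                      src_start  = tiles$start[src])
mapping
```

Rows where src_chr differs from dest_chr correspond to blocks that have moved between chromosomes, which is what the coloring by chromosome of origin in the plots above makes visible.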

Session information

sessionInfo()
## R version 4.2.1 (2022-06-23)
## Platform: aarch64-apple-darwin20 (64-bit)
## Running under: macOS Ventura 13.0
## 
## Matrix products: default
## BLAS:   /Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/lib/libRlapack.dylib
## 
## locale:
## [1] C/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## attached base packages:
## [1] grid      stats4    stats     graphics  grDevices utils     datasets 
## [8] methods   base     
## 
## other attached packages:
##  [1] microbenchmark_1.4.9        purrr_0.3.4                
##  [3] ggridges_0.5.3              tidyr_1.2.0                
##  [5] EnsDb.Hsapiens.v86_2.99.0   ensembldb_2.22.0           
##  [7] AnnotationFilter_1.22.0     GenomicFeatures_1.50.2     
##  [9] AnnotationDbi_1.60.0        patchwork_1.1.1            
## [11] plyranges_1.18.0            nullrangesData_1.3.0       
## [13] ExperimentHub_2.6.0         AnnotationHub_3.6.0        
## [15] BiocFileCache_2.6.0         dbplyr_2.2.1               
## [17] ggplot2_3.3.6               plotgardener_1.4.1         
## [19] nullranges_1.4.0            InteractionSet_1.26.0      
## [21] SummarizedExperiment_1.28.0 Biobase_2.58.0             
## [23] MatrixGenerics_1.10.0       matrixStats_0.62.0         
## [25] GenomicRanges_1.50.1        GenomeInfoDb_1.34.2        
## [27] IRanges_2.32.0              S4Vectors_0.36.0           
## [29] BiocGenerics_0.44.0        
## 
## loaded via a namespace (and not attached):
##   [1] plyr_1.8.7                    RcppHMM_1.2.2                
##   [3] lazyeval_0.2.2                splines_4.2.1                
##   [5] BiocParallel_1.32.1           TH.data_1.1-1                
##   [7] digest_0.6.29                 yulab.utils_0.0.5            
##   [9] htmltools_0.5.2               fansi_1.0.3                  
##  [11] magrittr_2.0.3                memoise_2.0.1                
##  [13] ks_1.13.5                     Biostrings_2.66.0            
##  [15] sandwich_3.0-2                prettyunits_1.1.1            
##  [17] jpeg_0.1-9                    colorspace_2.0-3             
##  [19] blob_1.2.3                    rappdirs_0.3.3               
##  [21] xfun_0.31                     dplyr_1.0.9                  
##  [23] crayon_1.5.1                  RCurl_1.98-1.7               
##  [25] jsonlite_1.8.0                survival_3.3-1               
##  [27] zoo_1.8-10                    glue_1.6.2                   
##  [29] gtable_0.3.0                  zlibbioc_1.44.0              
##  [31] XVector_0.38.0                strawr_0.0.9                 
##  [33] DelayedArray_0.24.0           scales_1.2.0                 
##  [35] mvtnorm_1.1-3                 DBI_1.1.3                    
##  [37] Rcpp_1.0.9                    xtable_1.8-4                 
##  [39] progress_1.2.2                gridGraphics_0.5-1           
##  [41] bit_4.0.4                     mclust_5.4.10                
##  [43] httr_1.4.3                    RColorBrewer_1.1-3           
##  [45] speedglm_0.3-4                ellipsis_0.3.2               
##  [47] pkgconfig_2.0.3               XML_3.99-0.10                
##  [49] farver_2.1.1                  sass_0.4.1                   
##  [51] utf8_1.2.2                    DNAcopy_1.72.0               
##  [53] ggplotify_0.1.0               tidyselect_1.1.2             
##  [55] labeling_0.4.2                rlang_1.0.4                  
##  [57] later_1.3.0                   munsell_0.5.0                
##  [59] BiocVersion_3.16.0            tools_4.2.1                  
##  [61] cachem_1.0.6                  cli_3.3.0                    
##  [63] generics_0.1.3                RSQLite_2.2.14               
##  [65] evaluate_0.15                 stringr_1.4.0                
##  [67] fastmap_1.1.0                 yaml_2.3.5                   
##  [69] knitr_1.39                    bit64_4.0.5                  
##  [71] KEGGREST_1.38.0               mime_0.12                    
##  [73] pracma_2.3.8                  xml2_1.3.3                   
##  [75] biomaRt_2.54.0                compiler_4.2.1               
##  [77] filelock_1.0.2                curl_4.3.2                   
##  [79] png_0.1-7                     interactiveDisplayBase_1.36.0
##  [81] tibble_3.1.7                  bslib_0.3.1                  
##  [83] stringi_1.7.8                 highr_0.9                    
##  [85] lattice_0.20-45               ProtGenerics_1.30.0          
##  [87] Matrix_1.4-1                  vctrs_0.4.1                  
##  [89] pillar_1.7.0                  lifecycle_1.0.1              
##  [91] BiocManager_1.30.18           jquerylib_0.1.4              
##  [93] data.table_1.14.2             bitops_1.0-7                 
##  [95] httpuv_1.6.5                  rtracklayer_1.58.0           
##  [97] R6_2.5.1                      BiocIO_1.8.0                 
##  [99] promises_1.2.0.1              KernSmooth_2.23-20           
## [101] codetools_0.2-18              MASS_7.3-58                  
## [103] assertthat_0.2.1              rjson_0.2.21                 
## [105] withr_2.5.0                   GenomicAlignments_1.34.0     
## [107] Rsamtools_2.14.0              multcomp_1.4-19              
## [109] GenomeInfoDbData_1.2.8        parallel_4.2.1               
## [111] hms_1.1.1                     rmarkdown_2.14               
## [113] shiny_1.7.1                   restfulr_0.0.15