First, let's load some packages including HTSSIP
.
library(dplyr)
library(ggplot2)
library(HTSSIP)
Also let's get an overview of the phyloseq object that we're going to use.
physeq_S2D2
## phyloseq-class experiment-level object
## otu_table() OTU Table: [ 1072 taxa and 139 samples ]
## sample_data() Sample Data: [ 139 samples by 17 sample variables ]
## tax_table() Taxonomy Table: [ 1072 taxa by 8 taxonomic ranks ]
## phy_tree() Phylogenetic Tree: [ 1072 tips and 1071 internal nodes ]
See HTSSIP introduction vignette for a description on why dataset parsing is needed.
Let's the parameters for parsing the dataset into individual treatment-control comparisons.
params = get_treatment_params(physeq_S2D2, c('Substrate', 'Day'))
params
## Substrate Day
## 1 12C-Con 14
## 2 13C-Cel 3
## 3 13C-Cel 14
## 4 13C-Glu 14
## 5 12C-Con 3
## 6 13C-Glu 3
We just need the parameters for each treatment, so let's filter out the controls. In this case, the controls are all '12C-Con'.
params = dplyr::filter(params, Substrate!='12C-Con')
params
## Substrate Day
## 1 13C-Cel 3
## 2 13C-Cel 14
## 3 13C-Glu 14
## 4 13C-Glu 3
Now, we will use an expression that will subset the phyloseq object into the comparisons that we want to make.
ex
is the expression that will be used for pruning the phyloseq object
ex = "(Substrate=='12C-Con' & Day=='${Day}') | (Substrate=='${Substrate}' & Day == '${Day}')"
physeq_S2D2_l = phyloseq_subset(physeq_S2D2, params, ex)
physeq_S2D2_l
## $`(Substrate=='12C-Con' & Day=='3') | (Substrate=='13C-Cel' & Day == '3')`
## phyloseq-class experiment-level object
## otu_table() OTU Table: [ 1072 taxa and 46 samples ]
## sample_data() Sample Data: [ 46 samples by 17 sample variables ]
## tax_table() Taxonomy Table: [ 1072 taxa by 8 taxonomic ranks ]
## phy_tree() Phylogenetic Tree: [ 1072 tips and 1071 internal nodes ]
##
## $`(Substrate=='12C-Con' & Day=='14') | (Substrate=='13C-Cel' & Day == '14')`
## phyloseq-class experiment-level object
## otu_table() OTU Table: [ 1072 taxa and 46 samples ]
## sample_data() Sample Data: [ 46 samples by 17 sample variables ]
## tax_table() Taxonomy Table: [ 1072 taxa by 8 taxonomic ranks ]
## phy_tree() Phylogenetic Tree: [ 1072 tips and 1071 internal nodes ]
##
## $`(Substrate=='12C-Con' & Day=='14') | (Substrate=='13C-Glu' & Day == '14')`
## phyloseq-class experiment-level object
## otu_table() OTU Table: [ 1072 taxa and 47 samples ]
## sample_data() Sample Data: [ 47 samples by 17 sample variables ]
## tax_table() Taxonomy Table: [ 1072 taxa by 8 taxonomic ranks ]
## phy_tree() Phylogenetic Tree: [ 1072 tips and 1071 internal nodes ]
##
## $`(Substrate=='12C-Con' & Day=='3') | (Substrate=='13C-Glu' & Day == '3')`
## phyloseq-class experiment-level object
## otu_table() OTU Table: [ 1072 taxa and 46 samples ]
## sample_data() Sample Data: [ 46 samples by 17 sample variables ]
## tax_table() Taxonomy Table: [ 1072 taxa by 8 taxonomic ranks ]
## phy_tree() Phylogenetic Tree: [ 1072 tips and 1071 internal nodes ]
Now, let's actually make ordinations of each treatment compared to the control. This will return a data.frame
object for plotting.
# running in parallel
doParallel::registerDoParallel(2)
physeq_S2D2_l_df = SIP_betaDiv_ord(physeq_S2D2_l, parallel=TRUE)
## Run 0 stress 0.06000133
## Run 1 stress 0.06000079
## ... New best solution
## ... Procrustes: rmse 4.76634e-05 max resid 0.0001183287
## ... Similar to previous best
## Run 2 stress 0.05999878
## ... New best solution
## ... Procrustes: rmse 0.0007673419 max resid 0.002174231
## ... Similar to previous best
## Run 3 stress 0.06000025
## ... Procrustes: rmse 0.0002583214 max resid 0.0007257033
## ... Similar to previous best
## Run 4 stress 0.06000132
## ... Procrustes: rmse 0.0008298318 max resid 0.002359842
## ... Similar to previous best
## Run 5 stress 0.05999855
## ... New best solution
## ... Procrustes: rmse 0.0002404095 max resid 0.0006772252
## ... Similar to previous best
## Run 6 stress 0.06000119
## ... Procrustes: rmse 0.0005954045 max resid 0.001672996
## ... Similar to previous best
## Run 7 stress 0.06000162
## ... Procrustes: rmse 0.0007033992 max resid 0.001983281
## ... Similar to previous best
## Run 8 stress 0.05999905
## ... Procrustes: rmse 0.0002689876 max resid 0.0007601238
## ... Similar to previous best
## Run 9 stress 0.06000217
## ... Procrustes: rmse 0.000758332 max resid 0.002137149
## ... Similar to previous best
## Run 10 stress 0.06000222
## ... Procrustes: rmse 0.000682585 max resid 0.00194309
## ... Similar to previous best
## Run 11 stress 0.06000121
## ... Procrustes: rmse 0.0006338837 max resid 0.001786944
## ... Similar to previous best
## Run 12 stress 0.06000015
## ... Procrustes: rmse 0.0005002517 max resid 0.001413552
## ... Similar to previous best
## Run 13 stress 0.06000224
## ... Procrustes: rmse 0.0007031324 max resid 0.00196629
## ... Similar to previous best
## Run 14 stress 0.05999863
## ... Procrustes: rmse 0.0001768579 max resid 0.0005147077
## ... Similar to previous best
## Run 15 stress 0.06000198
## ... Procrustes: rmse 0.0007180126 max resid 0.00202838
## ... Similar to previous best
## Run 16 stress 0.06000269
## ... Procrustes: rmse 0.0007187226 max resid 0.002040055
## ... Similar to previous best
## Run 17 stress 0.06000053
## ... Procrustes: rmse 0.0005480856 max resid 0.001536443
## ... Similar to previous best
## Run 18 stress 0.06000346
## ... Procrustes: rmse 0.0008048583 max resid 0.002260405
## ... Similar to previous best
## Run 19 stress 0.2366197
## Run 20 stress 0.05999882
## ... Procrustes: rmse 0.0002499372 max resid 0.0007488858
## ... Similar to previous best
## *** Solution reached
## Run 0 stress 0.07196424
## Run 1 stress 0.1319375
## Run 2 stress 0.129617
## Run 3 stress 0.1010428
## Run 4 stress 0.07206155
## ... Procrustes: rmse 0.005243278 max resid 0.02280419
## Run 5 stress 0.07196366
## ... New best solution
## ... Procrustes: rmse 0.0001232941 max resid 0.0002892083
## ... Similar to previous best
## Run 6 stress 0.1283992
## Run 7 stress 0.1329558
## Run 8 stress 0.1374138
## Run 9 stress 0.1318414
## Run 10 stress 0.1361472
## Run 11 stress 0.1199625
## Run 12 stress 0.130733
## Run 13 stress 0.07393864
## Run 14 stress 0.1172225
## Run 15 stress 0.07196433
## ... Procrustes: rmse 0.0001492686 max resid 0.0003513592
## ... Similar to previous best
## Run 16 stress 0.07196383
## ... Procrustes: rmse 5.555946e-05 max resid 0.0001380408
## ... Similar to previous best
## Run 17 stress 0.07457581
## Run 18 stress 0.07404662
## Run 19 stress 0.1186507
## Run 20 stress 0.07196427
## ... Procrustes: rmse 0.000372391 max resid 0.001150812
## ... Similar to previous best
## *** Solution reached
## Run 0 stress 0.07255501
## Run 1 stress 0.07255526
## ... Procrustes: rmse 0.0001154799 max resid 0.0004017811
## ... Similar to previous best
## Run 2 stress 0.07255521
## ... Procrustes: rmse 9.374883e-05 max resid 0.0003199492
## ... Similar to previous best
## Run 3 stress 0.07255501
## ... Procrustes: rmse 8.016806e-06 max resid 2.850354e-05
## ... Similar to previous best
## Run 4 stress 0.07255505
## ... Procrustes: rmse 3.009653e-05 max resid 0.0001016472
## ... Similar to previous best
## Run 5 stress 0.07255502
## ... Procrustes: rmse 2.06275e-05 max resid 7.843908e-05
## ... Similar to previous best
## Run 6 stress 0.07255504
## ... Procrustes: rmse 4.134275e-05 max resid 0.0001448403
## ... Similar to previous best
## Run 7 stress 0.07255501
## ... Procrustes: rmse 7.638422e-06 max resid 4.579753e-05
## ... Similar to previous best
## Run 8 stress 0.07255533
## ... Procrustes: rmse 0.0001229787 max resid 0.0004466971
## ... Similar to previous best
## Run 9 stress 0.07255507
## ... Procrustes: rmse 1.723695e-05 max resid 4.729386e-05
## ... Similar to previous best
## Run 10 stress 0.07255507
## ... Procrustes: rmse 5.510571e-05 max resid 0.0001901477
## ... Similar to previous best
## Run 11 stress 0.07255501
## ... Procrustes: rmse 4.899759e-06 max resid 1.788823e-05
## ... Similar to previous best
## Run 12 stress 0.07255504
## ... Procrustes: rmse 3.884768e-05 max resid 0.0001316485
## ... Similar to previous best
## Run 13 stress 0.07255501
## ... Procrustes: rmse 3.915428e-06 max resid 1.688773e-05
## ... Similar to previous best
## Run 14 stress 0.07255534
## ... Procrustes: rmse 0.0001334698 max resid 0.000458489
## ... Similar to previous best
## Run 15 stress 0.07255505
## ... Procrustes: rmse 2.318445e-05 max resid 7.457831e-05
## ... Similar to previous best
## Run 16 stress 0.07255504
## ... Procrustes: rmse 3.816817e-05 max resid 0.0001420784
## ... Similar to previous best
## Run 17 stress 0.07255501
## ... Procrustes: rmse 1.240037e-05 max resid 4.174521e-05
## ... Similar to previous best
## Run 18 stress 0.07255507
## ... Procrustes: rmse 4.521679e-05 max resid 0.0001653047
## ... Similar to previous best
## Run 19 stress 0.07255509
## ... Procrustes: rmse 6.734897e-05 max resid 0.0002079885
## ... Similar to previous best
## Run 20 stress 0.07255515
## ... Procrustes: rmse 8.612619e-05 max resid 0.0003172871
## ... Similar to previous best
## *** Solution reached
## Run 0 stress 0.1080816
## Run 1 stress 0.1013612
## ... New best solution
## ... Procrustes: rmse 0.03458739 max resid 0.1340111
## Run 2 stress 0.1080834
## Run 3 stress 0.1013618
## ... Procrustes: rmse 0.001106331 max resid 0.004059074
## ... Similar to previous best
## Run 4 stress 0.1080811
## Run 5 stress 0.1080837
## Run 6 stress 0.101361
## ... New best solution
## ... Procrustes: rmse 0.0009579509 max resid 0.003513939
## ... Similar to previous best
## Run 7 stress 0.1013639
## ... Procrustes: rmse 0.001302598 max resid 0.004787408
## ... Similar to previous best
## Run 8 stress 0.1080829
## Run 9 stress 0.1013646
## ... Procrustes: rmse 0.001364516 max resid 0.005018659
## ... Similar to previous best
## Run 10 stress 0.1080837
## Run 11 stress 0.108083
## Run 12 stress 0.1080817
## Run 13 stress 0.1013601
## ... New best solution
## ... Procrustes: rmse 0.0007068489 max resid 0.002555171
## ... Similar to previous best
## Run 14 stress 0.1080826
## Run 15 stress 0.1080859
## Run 16 stress 0.1080818
## Run 17 stress 0.1013634
## ... Procrustes: rmse 0.0005336088 max resid 0.001998047
## ... Similar to previous best
## Run 18 stress 0.1080832
## Run 19 stress 0.1450325
## Run 20 stress 0.1013601
## ... New best solution
## ... Procrustes: rmse 2.790567e-05 max resid 0.0001200085
## ... Similar to previous best
## *** Solution reached
physeq_S2D2_l_df %>% head(n=3)
## phyloseq_subset
## 1 (Substrate=='12C-Con' & Day=='3') | (Substrate=='13C-Cel' & Day == '3')
## 2 (Substrate=='12C-Con' & Day=='3') | (Substrate=='13C-Cel' & Day == '3')
## 3 (Substrate=='12C-Con' & Day=='3') | (Substrate=='13C-Cel' & Day == '3')
## NMDS1 NMDS2 X.Sample primer_number fwd_barcode
## 1 -0.15456305 -0.035393733 13C-Cel.D3.R1_F21 7 GGATATCT
## 2 -0.20505799 -0.101701050 13C-Cel.D3.R1_F20 6 CGTGAGTG
## 3 -0.02321259 -0.002126514 13C-Cel.D3.R1_F15 1 ATCGTACG
## rev_barcode Substrate Day Microcosm_replicate Fraction Buoyant_density
## 1 CGAGAGTT 13C-Cel 3 1 21 1.705640
## 2 CGAGAGTT 13C-Cel 3 1 20 1.706186
## 3 CGAGAGTT 13C-Cel 3 1 15 1.725310
## Sample_type
## 1 unknown
## 2 unknown
## 3 unknown
## library
## 1 /var/seq_data/fullCyc/MiSeq_16SrRNA/515f-806r/lib1-7rs/../150721_lib4/run1/metadata.txt
## 2 /var/seq_data/fullCyc/MiSeq_16SrRNA/515f-806r/lib1-7rs/../150721_lib4/run1/metadata.txt
## 3 /var/seq_data/fullCyc/MiSeq_16SrRNA/515f-806r/lib1-7rs/../150721_lib4/run1/metadata.txt
## Exp_type Sample_location Sample_date Sample_treatment
## 1 SIP NA NA NA
## 2 SIP NA NA NA
## 3 SIP NA NA NA
## Sample_subtreatment core_dataset
## 1 NA TRUE
## 2 NA TRUE
## 3 NA TRUE
Each specific phyloseq subset (treatment-control comparison) is delimited with the “phyloseq_subset” column.
physeq_S2D2_l_df %>% .$phyloseq_subset %>% unique
## [1] (Substrate=='12C-Con' & Day=='3') | (Substrate=='13C-Cel' & Day == '3')
## [2] (Substrate=='12C-Con' & Day=='14') | (Substrate=='13C-Cel' & Day == '14')
## [3] (Substrate=='12C-Con' & Day=='14') | (Substrate=='13C-Glu' & Day == '14')
## [4] (Substrate=='12C-Con' & Day=='3') | (Substrate=='13C-Glu' & Day == '3')
## 4 Levels: (Substrate=='12C-Con' & Day=='3') | (Substrate=='13C-Cel' & Day == '3') ...
For clarity, I'm going edit these long strings to make them more readable.
physeq_S2D2_l_df = physeq_S2D2_l_df %>%
dplyr::mutate(phyloseq_subset = gsub(' \\| ', '\n', phyloseq_subset),
phyloseq_subset = gsub('\'3\'', '\'03\'', phyloseq_subset))
physeq_S2D2_l_df %>% .$phyloseq_subset %>% unique
## [1] "(Substrate=='12C-Con' & Day=='03')\n(Substrate=='13C-Cel' & Day == '03')"
## [2] "(Substrate=='12C-Con' & Day=='14')\n(Substrate=='13C-Cel' & Day == '14')"
## [3] "(Substrate=='12C-Con' & Day=='14')\n(Substrate=='13C-Glu' & Day == '14')"
## [4] "(Substrate=='12C-Con' & Day=='03')\n(Substrate=='13C-Glu' & Day == '03')"
OK, let's plot the data!
phyloseq_ord_plot(physeq_S2D2_l_df)
As you can see, the 'heavy' gradient fraction 'communities' for the labeled-treatments tend to diverge from the unlabeled gradient fraction communities, but the amount of divergence in dependent on substrate and time point.
sessionInfo()
## R version 3.4.3 (2017-11-30)
## Platform: x86_64-apple-darwin15.6.0 (64-bit)
## Running under: macOS Sierra 10.12.6
##
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib
##
## locale:
## [1] C/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] phyloseq_1.22.3 HTSSIP_1.4.1 ggplot2_3.2.0 tidyr_0.8.3
## [5] dplyr_0.8.0.1
##
## loaded via a namespace (and not attached):
## [1] nlme_3.1-131 bitops_1.0-6
## [3] matrixStats_0.54.0 bit64_0.9-7
## [5] doParallel_1.0.14 RColorBrewer_1.1-2
## [7] GenomeInfoDb_1.14.0 tools_3.4.3
## [9] backports_1.1.4 utf8_1.1.4
## [11] R6_2.4.0 coenocliner_0.2-2
## [13] vegan_2.5-1 rpart_4.1-15
## [15] Hmisc_4.2-0 DBI_1.0.0
## [17] lazyeval_0.2.2 BiocGenerics_0.24.0
## [19] mgcv_1.8-28 colorspace_1.4-1
## [21] permute_0.9-5 ade4_1.7-13
## [23] nnet_7.3-12 withr_2.1.2
## [25] tidyselect_0.2.5 gridExtra_2.3
## [27] DESeq2_1.18.1 bit_1.1-14
## [29] compiler_3.4.3 cli_1.1.0
## [31] Biobase_2.38.0 htmlTable_1.13.1
## [33] DelayedArray_0.4.1 labeling_0.3
## [35] scales_1.0.0 checkmate_1.9.3
## [37] genefilter_1.60.0 stringr_1.4.0
## [39] digest_0.6.19 foreign_0.8-71
## [41] XVector_0.18.0 base64enc_0.1-3
## [43] pkgconfig_2.0.2 htmltools_0.3.6
## [45] highr_0.8 htmlwidgets_1.3
## [47] rlang_0.4.0 RSQLite_2.1.1
## [49] rstudioapi_0.10 jsonlite_1.6
## [51] BiocParallel_1.12.0 acepack_1.4.1
## [53] RCurl_1.95-4.12 magrittr_1.5
## [55] GenomeInfoDbData_1.0.0 Formula_1.2-3
## [57] biomformat_1.6.0 Matrix_1.2-17
## [59] fansi_0.4.0 Rcpp_1.0.1
## [61] munsell_0.5.0 S4Vectors_0.16.0
## [63] ape_5.3 stringi_1.4.3
## [65] MASS_7.3-51.4 SummarizedExperiment_1.8.1
## [67] zlibbioc_1.24.0 rhdf5_2.22.0
## [69] plyr_1.8.4 blob_1.1.1
## [71] grid_3.4.3 parallel_3.4.3
## [73] crayon_1.3.4 lattice_0.20-38
## [75] Biostrings_2.46.0 splines_3.4.3
## [77] multtest_2.34.0 annotate_1.56.2
## [79] locfit_1.5-9.1 zeallot_0.1.0
## [81] knitr_1.18 pillar_1.4.1
## [83] igraph_1.2.4 GenomicRanges_1.30.3
## [85] markdown_0.9 geneplotter_1.56.0
## [87] reshape2_1.4.3 codetools_0.2-16
## [89] stats4_3.4.3 XML_3.98-1.19
## [91] glue_1.3.1 evaluate_0.14
## [93] latticeExtra_0.6-28 data.table_1.10.4-3
## [95] vctrs_0.1.0 foreach_1.4.4
## [97] gtable_0.3.0 purrr_0.3.2
## [99] assertthat_0.2.1 mime_0.6
## [101] xtable_1.8-4 survival_2.44-1.1
## [103] tibble_2.1.1 iterators_1.0.10
## [105] memoise_1.1.0 AnnotationDbi_1.40.0
## [107] IRanges_2.12.0 cluster_2.0.6