Introduction
This vignette explores the CTX Hazard API.
NOTE: Please see the introductory vignette for an overview of the ccdR package and initial setup instructions, including API key storage.
Data for the Hazard API come from the Toxicity Value Database (ToxValDB). ToxValDB includes data on thousands of chemicals from tens of thousands of records, with an emphasis on quantitative estimates of relevant points-of-departure from in vivo toxicology studies, such as no- and low-observable adverse effect levels, screening levels, reference doses, tolerable daily intake, etc.
The Aggregated Computational Toxicology Resource (ACToR) is currently being integrated into ToxValDB. ACToR, as described in Judson et al. (2008), was designed to serve as a central location for information on chemical structure, in vitro bioassays, and in vivo toxicology assays used in various Computational Toxicology efforts at US EPA.
More information on ToxValDB can be found at https://www.epa.gov/comptox-tools/downloadable-computational-toxicology-data#AT. Additional resources under “ToxVal” subtopic: New Approach Methods training.
Functions
Several ccdR functions are used to access the CTX Hazard API data.
Hazard Resource
Get Hazard Data by DTXSID
get_hazard_by_dtxsid() retrieves all hazard data, both human and EcoTox data.
Get Human Hazard Data by DTXSID
get_human_hazard_by_dtxsid() retrieves only human hazard data.
Skin Eye Resource
get_skin_eye_hazard() retrieves hazard data specific to skin and eye hazard.
Cancer Resource
get_cancer_hazard() retrieves cancer hazard data.
Genetox Resource
get_genetox_summary() retrieves summary-level genotoxicity data associated with a chemical.
get_genetox_detail() retrieves more detailed genotoxicity data for a chemical than the summary level provides.
Example Use Case: Comparing Hazard Data Across Chemical Lists
The fourth Drinking Water Contaminant Candidate List (CCL4) is a set of chemicals that “…are not subject to any proposed or promulgated national primary drinking water regulations, but are known or anticipated to occur in public water systems….” Moreover, this list “…was announced on November 17, 2016. The CCL 4 includes 97 chemicals or chemical groups and 12 microbial contaminants….” The National-Scale Air Toxics Assessments (NATA) is “… EPA’s ongoing comprehensive evaluation of air toxics in the United States… a state-of-the-science screening tool for State/Local/Tribal agencies to prioritize pollutants, emission sources and locations of interest for further study in order to gain a better understanding of risks… use general information about sources to develop estimates of risks which are more likely to overestimate impacts than underestimate them….”
These lists can be found in the CCD at CCL4 with additional information at CCL4 information and NATADB with additional information at NATA information. The quotes from the previous paragraph were excerpted from list detail descriptions found using the CCD links.
In this example use case, hazard data will be compared between a water contaminant priority list and an air toxics list.
Obtain Lists of Chemicals
First, confirm the chemical list to query.
options(width = 100)
ccl4_information <- get_public_chemical_list_by_name('CCL4')
print(ccl4_information, trunc.cols = TRUE)
#> id type label visibility
#> 1 443 federal WATER|EPA: Chemical Contaminants - CCL 4 PUBLIC
#> longDescription
#> 1 The Contaminant Candidate List (CCL) is a list of contaminants that, at the time of publication, are not subject to any proposed or promulgated national primary drinking water regulations, but are known or anticipated to occur in public water systems. Contaminants listed on the CCL may require future regulation under the Safe Drinking Water Act (SDWA). EPA announced the <a href='https://www.epa.gov/ccl/contaminant-candidate-list-4-ccl-4-0' target='_blank'>fourth Drinking Water Contaminant Candidate List (CCL 4)</a> on November 17, 2016. The CCL 4 includes 97 chemicals or chemical groups and 12 microbial contaminants. The group of cyanotoxins on CCL 4 includes, but is not limited to: anatoxin-a, cylindrospermopsin, microcystins, and saxitoxin. The CCL Chemical Candidate Lists are versioned iteratively and this description navigates between the various versions of the lists. The list of substances displayed below represents only the chemical CCL 4 contaminants. For the versioned lists, please use the hyperlinked lists below.<br/><br/> \r\n\r\n<a href='https://comptox.epa.gov/dashboard/chemical_lists/CCL5' target='_blank'>CCL5 - November 2022</a> <br/><br/>\r\n<a href='https://comptox.epa.gov/dashboard/chemical_lists/CCL4' target='_blank'>CCL4 - November 2016</a> \r\n This list<br/><br/>\r\n<a href='https://comptox.epa.gov/dashboard/chemical_lists/CCL3' target='_blank'>CCL3 - October 2009</a> <br/><br/>\r\n<a href='https://comptox.epa.gov/dashboard/chemical_lists/CCL2' target='_blank'>CCL2 - February 2005</a><br/><br/>\r\n<a href='https://comptox.epa.gov/dashboard/chemical_lists/CCL1' target='_blank'>CCL1 - March 1998</a><br/><br/>
#> updatedAt listName chemicalCount createdAt
#> 1 2022-10-26T21:14:27Z CCL4 100 2017-12-28T17:58:36Z
#> shortDescription
#> 1 The Contaminant Candidate List (CCL) is a list of contaminants that are known or anticipated to occur in public water systems. Version 4 is known as CCL 4.
natadb_information <- get_public_chemical_list_by_name('NATADB')
print(natadb_information, trunc.cols = TRUE)
#> id type label visibility
#> 1 454 federal EPA: National-Scale Air Toxics Assessment (NATA) PUBLIC
#> longDescription
#> 1 The National-Scale Air Toxics Assessment (NATA) is EPA's ongoing comprehensive evaluation of air toxics in the United States. EPA developed the NATA as a state-of-the-science screening tool for State/Local/Tribal Agencies to prioritize pollutants, emission sources and locations of interest for further study in order to gain a better understanding of risks. NATA assessments do not incorporate refined information about emission sources but, rather, use general information about sources to develop estimates of risks which are more likely to overestimate impacts than underestimate them.\r\n\r\nNATA provides estimates of the risk of cancer and other serious health effects from breathing (inhaling) air toxics in order to inform both national and more localized efforts to identify and prioritize air toxics, emission source types and locations which are of greatest potential concern in terms of contributing to population risk. This in turn helps air pollution experts focus limited analytical resources on areas and or populations where the potential for health risks are highest. Assessments include estimates of cancer and non-cancer health effects based on chronic exposure from outdoor sources, including assessments of non-cancer health effects for Diesel Particulate Matter (PM). Assessments provide a snapshot of the outdoor air quality and the risks to human health that would result if air toxic emissions levels remained unchanged.
#> updatedAt listName chemicalCount createdAt
#> 1 2018-11-16T21:42:01Z NATADB 163 2018-02-21T12:04:16Z
#> shortDescription
#> 1 The National-Scale Air Toxics Assessment (NATA) is EPA's ongoing comprehensive evaluation of air toxics in the United States.
Next, retrieve the list of chemicals associated with each list.
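The retrieval code for this step does not appear here. A minimal sketch follows, assuming that the ccdR function get_chemicals_in_list() is used to pull each list and that the result is converted to a data.table, since the later examples apply data.table syntax to objects named ccl4 and natadb. The helper fetch_list_as_dt() is hypothetical and not part of ccdR; its fetch argument exists only so the sketch can be exercised without an API key.

```r
library(data.table)

# Hypothetical helper, not part of ccdR: fetch the chemicals in a named list
# and coerce the result to a data.table. The default `fetch` is assumed to be
# ccdR::get_chemicals_in_list(); it is only evaluated when the helper is
# called without a replacement.
fetch_list_as_dt <- function(list_name, fetch = ccdR::get_chemicals_in_list) {
  as.data.table(fetch(list_name))
}

# Intended use (requires ccdR and a stored API key):
# ccl4   <- fetch_list_as_dt("CCL4")
# natadb <- fetch_list_as_dt("NATADB")

# Offline check with a stand-in fetcher that returns placeholder identifiers
demo <- fetch_list_as_dt(
  "CCL4",
  fetch = function(list_name) data.frame(dtxsid = c("ID_A", "ID_B"))
)
print(demo)
```

Converting to a data.table up front keeps the later filtering and grouping expressions, such as ccl4[!(dtxsid %in% ...)], valid.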
Review Genotoxicity Data for a Single Chemical
Using the standard CompTox Chemicals Dashboard approach to access genotoxicity hazard data, one would navigate to the individual chemical page as shown below.
Figure 2 shows the genotoxicity section of the hazard tab for Bisphenol A. This page provides a summary of available genotoxicity data as well as individual reports and samples of such data.
The CTX APIs streamline the process of retrieving this information in a programmatic fashion. Figure 3 shows the particular set of genotoxicity resources available in the Hazard endpoints of the CTX APIs. There are both summary and detail resources, reflecting the information one can find on the CompTox Chemicals Dashboard Genotoxicity page for a given chemical.
Review Genotoxicity Data for Chemical Lists
The function get_genetox_summary() is used to access summary genotoxicity information per chemical. To query a list of chemicals, rather than searching individually for each chemical, the batch search version of the function, get_genetox_summary_batch(), can be used to access these details.
First, pull the data.
ccl4_genotox <- get_genetox_summary_batch(DTXSID = ccl4$dtxsid)
natadb_genetox <- get_genetox_summary_batch(DTXSID = natadb$dtxsid)
Next, it may be helpful to examine the dimensions and column names of the output.
dim(ccl4_genotox)
#> [1] 71 7
dim(natadb_genetox)
#> [1] 153 7
colnames(ccl4_genotox)
#> [1] "id" "dtxsid" "reportsPositive" "reportsNegative" "reportsOther"
#> [6] "ames" "micronucleus"
head(ccl4_genotox)
#> id dtxsid reportsPositive reportsNegative reportsOther ames micronucleus
#> <int> <char> <int> <int> <int> <char> <char>
#> 1: 16410 DTXSID0020153 26 5 1 - positive
#> 2: 20381 DTXSID0020446 0 12 0 - negative
#> 3: 21212 DTXSID0020573 3 14 0 - negative
#> 4: 23984 DTXSID0020600 20 0 1 - positive
#> 5: 23751 DTXSID0020814 1 0 0 - -
#> 6: 18199 DTXSID0021464 9 11 0 - positive
The information returned is of the first variety highlighted in Figure 2, that is, summary data on the available genotoxicity data for each chemical. Observe that genotoxicity data was returned for 71 chemicals from the CCL4 chemical list and 153 from the NATA chemical list. The chemicals missing genotoxicity information are listed next.
ccl4[!(dtxsid %in% ccl4_genotox$dtxsid),
.(dtxsid, casrn, preferredName, molFormula)]
#> dtxsid casrn preferredName molFormula
#> <char> <char> <char> <char>
#> 1: DTXSID001024118 77238-39-2 Microcystin <NA>
#> 2: DTXSID0024052 55290-64-7 Dimethipin C6H10O4S2
#> 3: DTXSID0032578 59669-26-0 Thiodicarb C10H18N4O4S3
#> 4: DTXSID1037484 194992-44-4 Acetochlor OA C14H19NO4
#> 5: DTXSID1037486 171262-17-2 2-[(2,6-Diethylphenyl)(me C14H19NO4
#> 6: DTXSID1037567 171118-09-5 Metolachlor ESA C15H23NO5S
#> 7: DTXSID2022333 135-98-8 sec-Butylbenzene C10H14
#> 8: DTXSID2031083 143545-90-8 Cylindrospermopsin C15H21N5O7S
#> 9: DTXSID2037506 16655-82-6 3-Hydroxycarbofuran C12H15NO4
#> 10: DTXSID2052156 517-09-9 Equilenin C18H18O2
#> 11: DTXSID3021857 25154-52-3 n-Nonylphenol C15H24O
#> 12: DTXSID3034458 99129-21-2 Clethodim C17H26ClNO3S
#> 13: DTXSID3042219 103-65-1 Propylbenzene C9H12
#> 14: DTXSID3073137 14866-68-3 Chlorate ClO3
#> 15: DTXSID3074313 35523-89-8 Saxitoxin C10H17N7O4
#> 16: DTXSID4022448 51218-45-2 Metolachlor C15H22ClNO2
#> 17: DTXSID4032611 13194-48-4 Ethoprop C8H19O2PS2
#> 18: DTXSID4034948 112410-23-8 Tebufenozide C22H28N2O2
#> 19: DTXSID50867064 64285-06-9 Anatoxin a C10H15NO
#> 20: DTXSID6024177 10265-92-6 Methamidophos C2H8NO2PS
#> 21: DTXSID6037483 187022-11-3 Acetochlor ESA C14H21NO5S
#> 22: DTXSID6037485 142363-53-9 Alachlor ESA C14H21NO5S
#> 23: DTXSID6037568 152019-73-3 Metolachlor OA C15H21NO4
#> 24: DTXSID7024241 42874-03-3 Oxyfluorfen C15H11ClF3NO4
#> 25: DTXSID7047433 474-86-2 Equilin C18H20O2
#> 26: DTXSID8022377 57-91-0 17alpha-Estradiol C18H24O2
#> 27: DTXSID8052483 7440-56-4 Germanium Ge
#> 28: DTXSID9032113 107534-96-3 Tebuconazole C16H22ClN3O
#> 29: DTXSID9032329 741-58-2 Bensulide C14H24NO4PS3
#> dtxsid casrn preferredName molFormula
natadb[!(dtxsid %in% natadb_genetox$dtxsid),
.(dtxsid, casrn, preferredName, molFormula)]
#> dtxsid casrn preferredName molFormula
#> <char> <char> <char> <char>
#> 1: DTXSID00872421 NOCAS_872421 Lead & Lead Compounds <NA>
#> 2: DTXSID1020273 7782-50-5 Chlorine Cl2
#> 3: DTXSID10872417 NOCAS_872417 Cadmium & Cadmium Compoun <NA>
#> 4: DTXSID30872414 NOCAS_872414 Antimony & Antimony Compo <NA>
#> 5: DTXSID30872419 NOCAS_872419 Cobalt & Cobalt Compounds <NA>
#> 6: DTXSID40872425 NOCAS_872425 Nickel & Nickel Compounds <NA>
#> 7: DTXSID5024267 1336-36-3 Polychlorinated biphenyls <NA>
#> 8: DTXSID7020687 608-73-1 1,2,3,4,5,6-Hexachlorocyc C6H6Cl6
#> 9: DTXSID7023984 NOCAS_23984 Coke oven emissions <NA>
#> 10: DTXSID90872415 NOCAS_872415 Arsenic & Arsenic Compoun <NA>
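The coverage implied by these results can be worked out directly. The sketch below uses only counts reported above (chemicalCount for each list and the number of summary rows returned) and assumes one summary row per chemical; the object and column names are illustrative.

```r
# Counts taken from the outputs above: chemicalCount for each list and the
# number of rows returned by get_genetox_summary_batch()
coverage <- data.frame(
  list_name    = c("CCL4", "NATADB"),
  total        = c(100L, 163L),
  with_summary = c(71L, 153L)
)

# Chemicals without a genotoxicity summary, and percent of the list covered
coverage$missing <- coverage$total - coverage$with_summary
coverage$pct_with_summary <- round(100 * coverage$with_summary / coverage$total, 1)

print(coverage)
```

The missing counts (29 and 10) match the number of rows in the two tables of chemicals lacking genotoxicity information.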
Now, genotoxicity details of the chemicals in each chemical list are returned using the function get_genetox_details_batch().
ccl4_genetox_details <- get_genetox_details_batch(DTXSID = ccl4$dtxsid)
natadb_genetox_details <- get_genetox_details_batch(DTXSID = natadb$dtxsid)
Inspect the first chemical in each set of results, DTXSID0020153. The information is identical in both cases because it is specific to the chemical, not to the chemical list.
identical(ccl4_genetox_details[dtxsid %in% 'DTXSID0020153', ],
natadb_genetox_details[dtxsid %in% 'DTXSID0020153', ])
#> [1] TRUE
The assays present for the chemicals in each list can be explored. First, determine the unique values of the assayCategory column; then group by these values and count the unique assays for each assayCategory value.
ccl4_genetox_details[, unique(assayCategory)]
#> [1] "in vitro" "ND" "in vivo"
natadb_genetox_details[, unique(assayCategory)]
#> [1] "in vitro" "ND" "in vivo"
ccl4_genetox_details[, unique(assayType)]
#> [1] "InVivoMN"
#> [2] "Overall"
#> [3] "bacterial reverse mutation assay"
#> [4] "micronucleus assay"
#> [5] "Ames"
#> [6] "InVitroCA"
#> [7] "InVitroMLA"
#> [8] "InVitroMN"
#> [9] "Cell transformation, clonal assay"
#> [10] "Forward and reverse gene mutation, host-mediated assay"
#> [11] "Histidine reverse gene mutation, Ames assay"
#> [12] "Micronucleus test, chromosome aberrations"
#> [13] "Mitotic recombination or gene conversion"
#> [14] "Rec-assay, DNA effects (bacterial DNA repair)"
#> [15] "Rec-assay, spot test, DNA effects (bacterial DNA repair)"
#> [16] "Sister-chromatid exchange (SCE) in vitro"
#> [17] "Unscheduled DNA synthesis (UDS) in vitro, DNA effects"
#> [18] "In vivo carcinogenicity studies"
#> [19] "in vitro mammalian chromosome aberration test"
#> [20] "mammalian cell gene mutation assay"
#> [21] "DNA damage and repair assay, unscheduled DNA synthesis in mammalian cells in vitro"
#> [22] "in vivo micronucleus (mouse)"
#> [23] "in vivo micronucleus (rat)"
#> [24] "Sperm morphology"
#> [25] "InVivoCA"
#> [26] "InVivoUDS"
#> [27] "transgenic"
#> [28] "Chromosome aberrations"
#> [29] "Forward gene mutation at the HPRT locus"
#> [30] "Heritable translocation test, chromosome aberrations"
#> [31] "Reverse gene mutation"
#> [32] "Sex-linked recessive lethal gene mutation"
#> [33] "Sister-chromatid exchange (SCE) in vivo"
#> [34] "Dominant lethal test"
#> [35] "Unscheduled DNA synthesis (UDS) in vivo; DNA effects"
#> [36] "chromosome aberration assay"
#> [37] "mammalian germ cell cytogenetic assay"
#> [38] "bacterial forward mutation assay"
#> [39] "sister chromatid exchange assay in mammalian cells"
#> [40] "DNA Binding"
#> [41] "rodent dominant lethal assay"
#> [42] "unscheduled DNA synthesis"
#> [43] "Bacterial Mutagenesis"
#> [44] "Cytogenetics Other"
#> [45] "Cytotoxicity"
#> [46] "In Vitro Micronucleus"
#> [47] "bacterial gene mutation assay"
#> [48] "in vitro mammalian cell micronucleus test"
#> [49] "Aneuploidy, chromosome aberrations"
#> [50] "Chromosome aberrations in vivo"
#> [51] "sister chromatid exchange assay"
#> [52] "InVivoDNADamage"
#> [53] "Cell transformation, viral enhanced"
#> [54] "combined chromosome aberration and micronucleus assay"
#> [55] "Chromosome aberrations in vitro"
#> [56] "Forward gene mutation"
#> [57] "Forward gene mutation at the HPRT or ouabain locus"
#> [58] "Forward gene mutation at the thymidine kinase (TK) locus; chromosome aberrations"
#> [59] "Specific locus test, gene mutation"
#> [60] "Spot test, gene mutation"
#> [61] "In Vivo Non-mammalian Mutagenesis"
#> [62] "In Vivo Micronucleus"
#> [63] "mouse spot test"
#> [64] "transgenic rodent mutagenicity assay"
#> [65] "yeast cytogenetic assay"
#> [66] "Micronucleus and sister chromatid exchange"
#> [67] "in vivo comet (mouse)"
#> [68] "in vivo comet (rat)"
#> [69] "Gene mutation"
#> [70] "in vitro mammalian cell transformation assay"
#> [71] "Cell transformation"
#> [72] "Tryptophan reverse gene mutation"
#> [73] "Cell Transformation"
#> [74] "DNA Damage/Repair"
#> [75] "In Vitro Chromosome Aberration"
#> [76] "Mutation"
#> [77] "DNA Covalent Binding"
#> [78] "In Vivo Chromosome Aberration"
#> [79] "In Vivo Mammalian Mutagenesis"
#> [80] "in vitro chromosomal aberration study in mammalian cells"
#> [81] "Mutation Other"
#> [82] "Evaluation of metabolic activity of acute cytotoxicity"
#> [83] "In vitro mammalian chromosomal aberration test"
#> [84] "Forward and reverse gene mutation, body fluid assay"
#> [85] "Forward and reverse gene mutation, chromosome aberrations, mitotic recombination and gene conversion, DNA effects, host-mediated assay"
#> [86] "Chromosomal aberration assay"
#> [87] "Mitotic recombination"
#> [88] "Aneuploidy, sex chromosome gain, chromosome aberrations"
#> [89] "Aneuploidy, whole sex chromosome loss, chromosome aberrations"
#> [90] "fluctuation test"
natadb_genetox_details[, unique(assayType)]
#> [1] "InVivoMN"
#> [2] "Overall"
#> [3] "bacterial reverse mutation assay"
#> [4] "micronucleus assay"
#> [5] "Ames"
#> [6] "InVitroCA"
#> [7] "InVitroMLA"
#> [8] "InVitroMN"
#> [9] "Cell transformation, clonal assay"
#> [10] "Forward and reverse gene mutation, host-mediated assay"
#> [11] "Histidine reverse gene mutation, Ames assay"
#> [12] "Micronucleus test, chromosome aberrations"
#> [13] "Mitotic recombination or gene conversion"
#> [14] "Rec-assay, DNA effects (bacterial DNA repair)"
#> [15] "Rec-assay, spot test, DNA effects (bacterial DNA repair)"
#> [16] "Sister-chromatid exchange (SCE) in vitro"
#> [17] "Unscheduled DNA synthesis (UDS) in vitro, DNA effects"
#> [18] "In vivo carcinogenicity studies"
#> [19] "DNA damage and repair assay, unscheduled DNA synthesis in mammalian cells in vitro"
#> [20] "rodent dominant lethal assay"
#> [21] "Chromosome aberrations"
#> [22] "Gene mutation"
#> [23] "InVivoUDS"
#> [24] "InVivoCA"
#> [25] "transgenic"
#> [26] "Forward gene mutation at the HPRT locus"
#> [27] "Heritable translocation test, chromosome aberrations"
#> [28] "Reverse gene mutation"
#> [29] "Sex-linked recessive lethal gene mutation"
#> [30] "Sister-chromatid exchange (SCE) in vivo"
#> [31] "Dominant lethal test"
#> [32] "Unscheduled DNA synthesis (UDS) in vivo; DNA effects"
#> [33] "Bacterial Mutagenesis"
#> [34] "Cytogenetics Other"
#> [35] "Cytotoxicity"
#> [36] "DNA Damage/Repair"
#> [37] "In Vitro Chromosome Aberration"
#> [38] "In Vitro Micronucleus"
#> [39] "In Vivo Non-mammalian Mutagenesis"
#> [40] "Mutation"
#> [41] "In Vivo Chromosome Aberration"
#> [42] "In Vivo Mammalian Mutagenesis"
#> [43] "In Vivo Micronucleus"
#> [44] "in vitro mammalian chromosome aberration test"
#> [45] "InVivoDNADamage"
#> [46] "Cell transformation, viral enhanced"
#> [47] "mammalian cell gene mutation assay"
#> [48] "in vivo micronucleus (mouse)"
#> [49] "Sperm morphology"
#> [50] "Forward and reverse gene mutation, mitotic recombination and gene conversion, host-mediated assay"
#> [51] "Spot test, gene mutation"
#> [52] "bacterial forward mutation assay"
#> [53] "sister chromatid exchange assay in mammalian cells"
#> [54] "DNA Binding"
#> [55] "unscheduled DNA synthesis"
#> [56] "bacteriophage induction in E. coli, gene mutation, UDS in mammalian cells, sex-linked recessive lethal mutations in Drosophila"
#> [57] "DNA damage, gene mutation, reverse mutation, gene conversion, DNA repair, chromosomal aberration, chromatid exchange, UDS"
#> [58] "chromosome aberration study in mammalian cells"
#> [59] "in vitro mammalian cell transformation assay"
#> [60] "Forward gene mutation at the thymidine kinase (TK) locus; chromosome aberrations"
#> [61] "Cell transformation"
#> [62] "Forward and reverse gene mutation, body fluid assay"
#> [63] "Forward gene mutation at the HPRT or ouabain locus"
#> [64] "chromosome aberration assay"
#> [65] "Drosophila SLRL assay"
#> [66] "Salmonella and Escherichia strains: bacterial reverse mutation assay (e.g. Ames test) ; Bacillus strains: recombination assay"
#> [67] "Cytogenetic assay in bone marrow cells"
#> [68] "in vivo comet (mouse)"
#> [69] "Chromosome aberrations in vitro"
#> [70] "Forward gene mutation"
#> [71] "Chromosome aberrations in vivo"
#> [72] "in vitro mammalian cell gene mutation tests using the thymidine kinase gene"
#> [73] "in vivo comet (rat)"
#> [74] "in vivo micronucleus (rat)"
#> [75] "mouse spot test"
#> [76] "Aneuploidy, whole sex chromosome loss, chromosome aberrations"
#> [77] "sister chromatid exchange assay"
#> [78] "Mouse Lymphoma Forward Mutation Assay"
#> [79] "mammalian erythrocyte micronucleus test"
#> [80] "Tryptophan reverse gene mutation"
#> [81] "bacterial gene mutation assay"
#> [82] "yeast forward mutation and mitotic gene conversion assays in Schizosaccharomyces pombe (P1 strain) and Saccharomyces cerevisiae (D4 strain)"
#> [83] "Micronucleus test in vitro, chromosome aberrations"
#> [84] "heritable translocation assay"
#> [85] "mitotic recombination assay with Saccharomyces cerevisiae"
#> [86] "Aneuploidy, chromosome aberrations"
#> [87] "cell transformation"
#> [88] "in vitro mammalian cell micronucleus test"
#> [89] "somatic mutation and recombination test in Drosophila"
#> [90] "transgenic rodent mutagenicity assay"
#> [91] "yeast cytogenetic assay"
#> [92] "Micronucleus and sister chromatid exchange"
#> [93] "in vitro mammalian cell gene mutation test using the Hprt and xprt genes"
#> [94] "bone marrow chromosome aberration assay and mammalian germ cell cytogenetic assay"
#> [95] "bacterial mutation"
#> [96] "bacterial reverse mutation assay (Salmonella typhimurium and Escherichia coli)"
#> [97] "Aneuploidy, partial sex chromosome loss, chromosome aberrations "
#> [98] "Chromosome aberrations, in vivo"
#> [99] "in vitro chromosome aberration study"
#> [100] "Cell transformation, focus assay"
#> [101] "Forward and reverse gene mutation, mitotic recombination and gene conversion, DNA effects, host-mediated assay"
#> [102] "gene mutation assay in fungi"
#> [103] "DNA adduct formation"
#> [104] "Cell Transformation"
#> [105] "DNA Covalent Binding"
#> [106] "mammalian comet assay"
#> [107] "Aneuploidy, sex chromosome gain, chromosome aberrations"
#> [108] "mammalian germ cell cytogenetic assay"
#> [109] "Forward and reverse gene mutation, chromosome aberrations, mitotic recombination and gene conversion, DNA effects, host-mediated assay"
#> [110] "E. coli K-12 DNA repair host-mediated assay"
#> [111] "Chromosomal aberration assay"
#> [112] "forward mutation"
#> [113] "mammalian cell gene mutation test"
#> [114] "Mitotic recombination"
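Long printed vectors like these are hard to compare by eye. One possible approach, sketched here with base R set operations on a small hand-picked subset of the assay types printed above, is to compute the shared and list-specific values. The vectors ccl4_types and natadb_types are illustrative stand-ins, not the full results.

```r
# Illustrative subsets of the assayType values shown above (not the full output)
ccl4_types   <- c("Ames", "InVitroCA", "fluctuation test", "Mutation Other")
natadb_types <- c("Ames", "InVitroCA", "Drosophila SLRL assay", "mammalian comet assay")

shared      <- intersect(ccl4_types, natadb_types)  # present in both vectors
only_ccl4   <- setdiff(ccl4_types, natadb_types)    # present only in the first
only_natadb <- setdiff(natadb_types, ccl4_types)    # present only in the second

print(shared)
print(only_ccl4)
print(only_natadb)

# With the real objects, the inputs would be, for example:
# ccl4_types   <- ccl4_genetox_details[, unique(assayType)]
# natadb_types <- natadb_genetox_details[, unique(assayType)]
```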
Next, determine the number of unique assays per assayCategory value, count the assay results grouped by assayCategory, assayType, and assayResult, and compare the numbers of assayCategory and assayType values across both chemical lists.
ccl4_genetox_details[, .(Assays = length(unique(assayType))),
by = .(assayCategory)]
#> assayCategory Assays
#> <char> <int>
#> 1: in vitro 65
#> 2: ND 3
#> 3: in vivo 22
natadb_genetox_details[, .(Assays = length(unique(assayType))),
by = .(assayCategory)]
#> assayCategory Assays
#> <char> <int>
#> 1: in vitro 83
#> 2: ND 3
#> 3: in vivo 28
ccl4_genetox_details[, .N, by = .(assayCategory, assayType, assayResult)]
#> assayCategory assayType assayResult N
#> <char> <char> <char> <int>
#> 1: in vitro InVivoMN negative 10
#> 2: ND Overall positive 5
#> 3: in vitro bacterial reverse mutatio positive 39
#> 4: in vivo micronucleus assay negative 36
#> 5: in vivo micronucleus assay equivocal 1
#> ---
#> 149: in vitro Heritable translocation t negative 1
#> 150: in vitro Mitotic recombination positive 1
#> 151: in vitro Aneuploidy, sex chromosom negative 1
#> 152: in vitro Aneuploidy, whole sex chr positive 1
#> 153: in vitro fluctuation test negative 2
ccl4_genetox_details[, .N, by = .(assayCategory)]
#> assayCategory N
#> <char> <int>
#> 1: in vitro 815
#> 2: ND 38
#> 3: in vivo 188
ccl4_genetox_details[assayCategory == 'in vitro', .N, by = .(assayType)]
#> assayType N
#> <char> <int>
#> 1: InVivoMN 28
#> 2: bacterial reverse mutatio 165
#> 3: Ames 88
#> 4: InVitroCA 31
#> 5: InVitroMLA 24
#> 6: InVitroMN 6
#> 7: Cell transformation, clon 8
#> 8: Forward and reverse gene 7
#> 9: Histidine reverse gene mu 19
#> 10: Micronucleus test, chromo 8
#> 11: Mitotic recombination or 18
#> 12: Rec-assay, DNA effects (b 15
#> 13: Rec-assay, spot test, DNA 2
#> 14: Sister-chromatid exchange 41
#> 15: in vitro mammalian chromo 22
#> 16: mammalian cell gene mutat 42
#> 17: DNA damage and repair ass 21
#> 18: Chromosome aberrations 2
#> 19: Forward gene mutation at 5
#> 20: Heritable translocation t 5
#> 21: Reverse gene mutation 9
#> 22: Sex-linked recessive leth 9
#> 23: Sister-chromatid exchange 13
#> 24: chromosome aberration ass 15
#> 25: bacterial forward mutatio 1
#> 26: sister chromatid exchange 11
#> 27: Bacterial Mutagenesis 27
#> 28: Cytogenetics Other 26
#> 29: Cytotoxicity 21
#> 30: In Vitro Micronucleus 4
#> 31: bacterial gene mutation a 7
#> 32: in vitro mammalian cell m 5
#> 33: Aneuploidy, chromosome ab 5
#> 34: sister chromatid exchange 3
#> 35: Cell transformation, vira 12
#> 36: combined chromosome aberr 1
#> 37: Chromosome aberrations in 2
#> 38: Forward gene mutation 5
#> 39: Forward gene mutation at 6
#> 40: Forward gene mutation at 2
#> 41: Specific locus test, gene 1
#> 42: Spot test, gene mutation 1
#> 43: In Vivo Non-mammalian Mut 7
#> 44: mouse spot test 2
#> 45: transgenic rodent mutagen 1
#> 46: yeast cytogenetic assay 1
#> 47: Gene mutation 4
#> 48: in vitro mammalian cell t 2
#> 49: Cell transformation 5
#> 50: Tryptophan reverse gene m 8
#> 51: Cell Transformation 2
#> 52: DNA Damage/Repair 8
#> 53: In Vitro Chromosome Aberr 11
#> 54: Mutation 3
#> 55: in vitro chromosomal aber 1
#> 56: Mutation Other 4
#> 57: Evaluation of metabolic a 1
#> 58: In vitro mammalian chromo 2
#> 59: Forward and reverse gene 2
#> 60: Forward and reverse gene 1
#> 61: Chromosomal aberration as 2
#> 62: Mitotic recombination 1
#> 63: Aneuploidy, sex chromosom 1
#> 64: Aneuploidy, whole sex chr 1
#> 65: fluctuation test 2
#> assayType N
ccl4_genetox_details[assayCategory == 'ND', .N, by = .(assayType)]
#> assayType N
#> <char> <int>
#> 1: Overall 5
#> 2: In vivo carcinogenicity s 23
#> 3: transgenic 10
ccl4_genetox_details[assayCategory == 'in vivo', .N, by = .(assayType)]
#> assayType N
#> <char> <int>
#> 1: micronucleus assay 46
#> 2: Unscheduled DNA synthesis 9
#> 3: in vivo micronucleus (mou 19
#> 4: in vivo micronucleus (rat 9
#> 5: Sperm morphology 9
#> 6: InVivoCA 14
#> 7: InVivoUDS 11
#> 8: Dominant lethal test 5
#> 9: Unscheduled DNA synthesis 3
#> 10: mammalian germ cell cytog 2
#> 11: DNA Binding 1
#> 12: rodent dominant lethal as 15
#> 13: unscheduled DNA synthesis 6
#> 14: Chromosome aberrations in 2
#> 15: InVivoDNADamage 7
#> 16: In Vivo Micronucleus 1
#> 17: Micronucleus and sister c 2
#> 18: in vivo comet (mouse) 1
#> 19: in vivo comet (rat) 3
#> 20: DNA Covalent Binding 12
#> 21: In Vivo Chromosome Aberra 4
#> 22: In Vivo Mammalian Mutagen 7
#> assayType N
natadb_genetox_details[, .N, by = .(assayCategory, assayType, assayResult)]
#> assayCategory assayType assayResult N
#> <char> <char> <char> <int>
#> 1: in vitro InVivoMN negative 40
#> 2: ND Overall positive 16
#> 3: in vitro bacterial reverse mutatio positive 93
#> 4: in vivo micronucleus assay negative 76
#> 5: in vivo micronucleus assay equivocal 4
#> ---
#> 194: in vitro Heritable translocation t negative 2
#> 195: in vivo mammalian comet assay equivocal 1
#> 196: in vitro mammalian cell gene mutat positive 1
#> 197: in vitro in vitro mammalian cell t positive 1
#> 198: in vitro Mitotic recombination positive 1
natadb_genetox_details[, .N, by = .(assayCategory)]
#> assayCategory N
#> <char> <int>
#> 1: in vitro 2112
#> 2: ND 100
#> 3: in vivo 435
natadb_genetox_details[assayCategory == 'in vitro', .N, by = .(assayType)]
#> assayType N
#> <char> <int>
#> 1: InVivoMN 89
#> 2: bacterial reverse mutatio 362
#> 3: Ames 258
#> 4: InVitroCA 98
#> 5: InVitroMLA 85
#> 6: InVitroMN 20
#> 7: Cell transformation, clon 14
#> 8: Forward and reverse gene 17
#> 9: Histidine reverse gene mu 55
#> 10: Micronucleus test, chromo 33
#> 11: Mitotic recombination or 47
#> 12: Rec-assay, DNA effects (b 34
#> 13: Rec-assay, spot test, DNA 6
#> 14: Sister-chromatid exchange 98
#> 15: DNA damage and repair ass 50
#> 16: Chromosome aberrations 25
#> 17: Gene mutation 20
#> 18: Forward gene mutation at 12
#> 19: Heritable translocation t 10
#> 20: Reverse gene mutation 30
#> 21: Sex-linked recessive leth 26
#> 22: Sister-chromatid exchange 31
#> 23: Bacterial Mutagenesis 41
#> 24: Cytogenetics Other 41
#> 25: Cytotoxicity 20
#> 26: DNA Damage/Repair 27
#> 27: In Vitro Chromosome Aberr 6
#> 28: In Vitro Micronucleus 8
#> 29: In Vivo Non-mammalian Mut 7
#> 30: Mutation 6
#> 31: in vitro mammalian chromo 91
#> 32: Cell transformation, vira 46
#> 33: mammalian cell gene mutat 104
#> 34: Forward and reverse gene 4
#> 35: Spot test, gene mutation 4
#> 36: bacterial forward mutatio 4
#> 37: sister chromatid exchange 50
#> 38: bacteriophage induction i 1
#> 39: DNA damage, gene mutation 1
#> 40: chromosome aberration stu 1
#> 41: in vitro mammalian cell t 2
#> 42: Forward gene mutation at 6
#> 43: Cell transformation 11
#> 44: Forward and reverse gene 7
#> 45: Forward gene mutation at 10
#> 46: chromosome aberration ass 24
#> 47: Drosophila SLRL assay 20
#> 48: Salmonella and Escherichi 4
#> 49: Cytogenetic assay in bone 1
#> 50: Chromosome aberrations in 7
#> 51: Forward gene mutation 18
#> 52: in vitro mammalian cell g 2
#> 53: mouse spot test 7
#> 54: Aneuploidy, whole sex chr 4
#> 55: sister chromatid exchange 7
#> 56: Mouse Lymphoma Forward Mu 1
#> 57: Tryptophan reverse gene m 18
#> 58: bacterial gene mutation a 11
#> 59: yeast forward mutation an 4
#> 60: Micronucleus test in vitr 2
#> 61: mitotic recombination ass 6
#> 62: Aneuploidy, chromosome ab 8
#> 63: cell transformation 2
#> 64: in vitro mammalian cell m 13
#> 65: somatic mutation and reco 3
#> 66: transgenic rodent mutagen 2
#> 67: yeast cytogenetic assay 2
#> 68: in vitro mammalian cell g 2
#> 69: bacterial mutation 1
#> 70: bacterial reverse mutatio 5
#> 71: Aneuploidy, partial sex c 2
#> 72: in vitro chromosome aberr 1
#> 73: Cell transformation, focu 2
#> 74: Forward and reverse gene 1
#> 75: gene mutation assay in fu 5
#> 76: Cell Transformation 1
#> 77: Aneuploidy, sex chromosom 1
#> 78: Forward and reverse gene 1
#> 79: E. coli K-12 DNA repair h 1
#> 80: Chromosomal aberration as 2
#> 81: forward mutation 1
#> 82: mammalian cell gene mutat 1
#> 83: Mitotic recombination 1
#> assayType N
natadb_genetox_details[assayCategory == 'ND', .N, by = .(assayType)]
#> assayType N
#> <char> <int>
#> 1: Overall 16
#> 2: In vivo carcinogenicity s 66
#> 3: transgenic 18
natadb_genetox_details[assayCategory == 'in vivo', .N, by = .(assayType)]
#> assayType N
#> <char> <int>
#> 1: micronucleus assay 105
#> 2: Unscheduled DNA synthesis 27
#> 3: rodent dominant lethal as 31
#> 4: InVivoUDS 33
#> 5: InVivoCA 37
#> 6: Dominant lethal test 14
#> 7: Unscheduled DNA synthesis 5
#> 8: In Vivo Chromosome Aberra 5
#> 9: In Vivo Mammalian Mutagen 6
#> 10: In Vivo Micronucleus 11
#> 11: InVivoDNADamage 23
#> 12: in vivo micronucleus (mou 51
#> 13: Sperm morphology 25
#> 14: DNA Binding 1
#> 15: unscheduled DNA synthesis 18
#> 16: in vivo comet (mouse) 4
#> 17: Chromosome aberrations in 9
#> 18: in vivo comet (rat) 3
#> 19: in vivo micronucleus (rat 9
#> 20: mammalian erythrocyte mic 2
#> 21: heritable translocation a 2
#> 22: Micronucleus and sister c 2
#> 23: bone marrow chromosome ab 1
#> 24: Chromosome aberrations, i 2
#> 25: DNA adduct formation 1
#> 26: DNA Covalent Binding 1
#> 27: mammalian comet assay 6
#> 28: mammalian germ cell cytog 1
#> assayType N
Observe that there are 90 unique assays for CCL4 and 114 unique assays for NATADB. The assay categories are “in vitro”, “ND”, and “in vivo”, with 65 unique “in vitro” assays for CCL4 and 83 for NATADB, 3 unique “ND” assays for each list, and 22 unique “in vivo” assays for CCL4 and 28 for NATADB.
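These counts treat assayType values as exact strings, so labels that differ only in letter case are counted separately; the NATADB output, for instance, lists "Cell transformation", "cell transformation", and "Cell Transformation". The sketch below shows one way such labels could be collapsed before counting. Whether they denote the same assay is an assumption to confirm against the data source, and the vector labels is a small illustrative sample.

```r
# A few assayType labels as they appear in the output above
labels <- c("Cell transformation", "Cell Transformation", "cell transformation", "Ames")

# Exact-string count, as used by length(unique(assayType))
n_exact <- length(unique(labels))

# Count after trimming surrounding whitespace and lower-casing
normalized   <- tolower(trimws(labels))
n_normalized <- length(unique(normalized))

print(c(exact = n_exact, normalized = n_normalized))
```

Here the exact count is 4 while the normalized count is 2, which suggests the unique assay totals are best read as counts of distinct labels rather than of distinct assay methods.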
One may be interested in the number of chemicals for which an assay produced a positive or negative result. To assess this, group by assayResult and determine the number of unique dtxsid values associated with each assayResult value.
ccl4_genetox_details[, .(DTXSIDs = length(unique(dtxsid))), by = .(assayResult)]
#> assayResult DTXSIDs
#> <char> <int>
#> 1: negative 64
#> 2: positive 53
#> 3: equivocal 15
natadb_genetox_details[, .(DTXSIDs = length(unique(dtxsid))),
by = .(assayResult)]
#> assayResult DTXSIDs
#> <char> <int>
#> 1: negative 141
#> 2: positive 130
#> 3: equivocal 48
For CCL4, there are 64 unique chemicals that have a negative assay result, 53 that have a positive result, and 15 that have an equivocal result. For NATADB, there are 141 unique chemicals that have a negative assay result, 130 that have a positive result, and 48 that have an equivocal result. These counts sum to 132 for CCL4 and 319 for NATADB, whereas only 72 unique dtxsid values have assay results in CCL4 and 153 in NATADB, so many chemicals must appear under more than one assayResult value.
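To see directly which chemicals fall under more than one result category, one can count the distinct assayResult values per dtxsid. The following is a minimal sketch on invented data: toy_results and the identifiers CHEM_A, CHEM_B, and CHEM_C are hypothetical placeholders, not real DTXSIDs.

```r
library(data.table)

# Toy stand-in for a genetox details table; all values are invented.
toy_results <- data.table(
  dtxsid      = c("CHEM_A", "CHEM_A", "CHEM_A", "CHEM_B", "CHEM_C", "CHEM_C"),
  assayResult = c("positive", "negative", "positive",
                  "negative", "equivocal", "negative")
)

# Number of distinct result types recorded for each chemical
result_types <- toy_results[, .(n_result_types = uniqueN(assayResult)),
                            by = .(dtxsid)]

# Chemicals that appear under more than one assayResult value
multi_result <- result_types[n_result_types > 1, dtxsid]
print(multi_result)
```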
Next, determine which chemicals in each chemical list show evidence of genotoxic effects, taken here to mean at least one positive response in the assayResult column.
ccl4_genetox_details[, .(is_positive = any(assayResult == 'positive')),
by = .(dtxsid)][is_positive == TRUE, dtxsid]
#> [1] "DTXSID0020153" "DTXSID0020573" "DTXSID0020600" "DTXSID0020814" "DTXSID0021464" "DTXSID0021541"
#> [7] "DTXSID0024341" "DTXSID1021407" "DTXSID1021740" "DTXSID1021798" "DTXSID1024338" "DTXSID1026164"
#> [13] "DTXSID1031040" "DTXSID2021028" "DTXSID2021317" "DTXSID2021731" "DTXSID3020203" "DTXSID3020702"
#> [19] "DTXSID3020833" "DTXSID3024869" "DTXSID3031864" "DTXSID4020533" "DTXSID4021503" "DTXSID4022361"
#> [25] "DTXSID4022367" "DTXSID5020023" "DTXSID5020576" "DTXSID5020601" "DTXSID5021207" "DTXSID5024182"
#> [31] "DTXSID5039224" "DTXSID6020301" "DTXSID6021030" "DTXSID6021032" "DTXSID6022422" "DTXSID7020005"
#> [37] "DTXSID7020215" "DTXSID7020637" "DTXSID7021029" "DTXSID8020044" "DTXSID8020090" "DTXSID8020832"
#> [43] "DTXSID8021062" "DTXSID8023846" "DTXSID8023848" "DTXSID8025541" "DTXSID8031865" "DTXSID9020243"
#> [49] "DTXSID9021390" "DTXSID9021427" "DTXSID9022366" "DTXSID9023380" "DTXSID9023914"
natadb_genetox_details[, .(is_positive = any(assayResult == 'positive')),
by = .(dtxsid)][is_positive == TRUE, dtxsid]
#> [1] "DTXSID0020153" "DTXSID0020448" "DTXSID0020523" "DTXSID0020529" "DTXSID0020600"
#> [6] "DTXSID0020868" "DTXSID0021381" "DTXSID0021383" "DTXSID0021541" "DTXSID0021834"
#> [11] "DTXSID0021965" "DTXSID0024187" "DTXSID0039227" "DTXSID0039229" "DTXSID1020148"
#> [16] "DTXSID1020302" "DTXSID1020306" "DTXSID1020431" "DTXSID1020512" "DTXSID1020516"
#> [21] "DTXSID1020566" "DTXSID1021374" "DTXSID1021798" "DTXSID1021827" "DTXSID1022057"
#> [26] "DTXSID1023786" "DTXSID1024045" "DTXSID1026164" "DTXSID1049641" "DTXSID2020137"
#> [31] "DTXSID2020262" "DTXSID2020507" "DTXSID2020682" "DTXSID2020844" "DTXSID2021284"
#> [36] "DTXSID2021286" "DTXSID2021319" "DTXSID2021658" "DTXSID2021731" "DTXSID2021781"
#> [41] "DTXSID3020203" "DTXSID3020257" "DTXSID3020413" "DTXSID3020415" "DTXSID3020596"
#> [46] "DTXSID3020679" "DTXSID3020702" "DTXSID3020833" "DTXSID3021431" "DTXSID3025091"
#> [51] "DTXSID3039242" "DTXSID4020161" "DTXSID4020298" "DTXSID4020402" "DTXSID4020533"
#> [56] "DTXSID4020583" "DTXSID4020874" "DTXSID4020901" "DTXSID4021006" "DTXSID4021056"
#> [61] "DTXSID4021395" "DTXSID4039231" "DTXSID5020023" "DTXSID5020027" "DTXSID5020029"
#> [66] "DTXSID5020071" "DTXSID5020316" "DTXSID5020449" "DTXSID5020491" "DTXSID5020601"
#> [71] "DTXSID5020607" "DTXSID5020865" "DTXSID5021124" "DTXSID5021207" "DTXSID5021380"
#> [76] "DTXSID5021386" "DTXSID5024055" "DTXSID5024059" "DTXSID5039224" "DTXSID6020145"
#> [81] "DTXSID6020307" "DTXSID6020353" "DTXSID6020432" "DTXSID6020438" "DTXSID6020515"
#> [86] "DTXSID6020569" "DTXSID6020981" "DTXSID6021828" "DTXSID6022422" "DTXSID6023947"
#> [91] "DTXSID6023949" "DTXSID7020005" "DTXSID7020009" "DTXSID7020267" "DTXSID7020637"
#> [96] "DTXSID7020689" "DTXSID7020710" "DTXSID7020716" "DTXSID7021029" "DTXSID7021100"
#> [101] "DTXSID7021106" "DTXSID7021318" "DTXSID7021360" "DTXSID7021368" "DTXSID7021948"
#> [106] "DTXSID7024166" "DTXSID7024370" "DTXSID7024532" "DTXSID7025180" "DTXSID7026156"
#> [111] "DTXSID8020090" "DTXSID8020173" "DTXSID8020250" "DTXSID8020599" "DTXSID8020759"
#> [116] "DTXSID8020832" "DTXSID8021195" "DTXSID8021197" "DTXSID8021432" "DTXSID8021434"
#> [121] "DTXSID8021438" "DTXSID8024286" "DTXSID9020168" "DTXSID9020243" "DTXSID9020247"
#> [126] "DTXSID9020293" "DTXSID9020827" "DTXSID9021138" "DTXSID9021261" "DTXSID9041522"
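Because the aim is to compare the two lists, the vectors of positive chemicals can be compared with base R set operations. The sketch below uses only a few identifiers copied from the output above. The names ccl4_positive and natadb_positive are hypothetical; in practice they would hold the full results of the two calls.

```r
# Small subsets of the identifiers printed above, for illustration only
ccl4_positive   <- c("DTXSID0020153", "DTXSID0020573", "DTXSID0020600")
natadb_positive <- c("DTXSID0020153", "DTXSID0020448", "DTXSID0020600")

# Chemicals with a positive result in both lists
in_both <- intersect(ccl4_positive, natadb_positive)

# Chemicals with a positive result in only one of the lists
only_ccl4   <- setdiff(ccl4_positive, natadb_positive)
only_natadb <- setdiff(natadb_positive, ccl4_positive)

print(in_both)
print(only_ccl4)
print(only_natadb)
```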
Given the amount of genotoxicity data, consider a single chemical, DTXSID0020153, to get a sense of the assays run, the number of each type of result, and which assays gave “positive” results. First, group by assayResult and calculate .N for each group. Then restrict to the positive results and output a data.table with the number of positive results for each assayType, sorted in decreasing order.
ccl4_genetox_details[dtxsid == 'DTXSID0020153', .(Number = .N),
by = .(assayResult)]
#> assayResult Number
#> <char> <int>
#> 1: negative 5
#> 2: positive 22
#> 3: equivocal 1
ccl4_genetox_details[dtxsid == 'DTXSID0020153' & assayResult == 'positive',
.(Number_of_assays = .N), by = .(assayType)][order(-Number_of_assays),]
#> assayType Number_of_assays
#> <char> <int>
#> 1: bacterial reverse mutatio 3
#> 2: Ames 3
#> 3: InVitroCA 2
#> 4: InVitroMLA 2
#> 5: Rec-assay, DNA effects (b 2
#> 6: Sister-chromatid exchange 2
#> 7: Overall 1
#> 8: InVitroMN 1
#> 9: Cell transformation, clon 1
#> 10: Histidine reverse gene mu 1
#> 11: Mitotic recombination or 1
#> 12: Rec-assay, spot test, DNA 1
#> 13: Unscheduled DNA synthesis 1
#> 14: In vivo carcinogenicity s 1
There were five assays that produced a negative result, 22 that produced a positive result, and one that produced an equivocal result. Of the 22 positive assays, “bacterial reverse mutation assay” and “Ames” were the most numerous, with three each.
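To put these counts on a common scale, they can be converted to proportions. The sketch below re-enters the three counts reported above by hand. The names result_counts and Proportion are illustrative; with the real data the same step could be applied to the grouped result directly.

```r
library(data.table)

# Counts for DTXSID0020153 copied from the output above
result_counts <- data.table(
  assayResult = c("negative", "positive", "equivocal"),
  Number      = c(5L, 22L, 1L)
)

# Share of all assay records falling under each result type
result_counts[, Proportion := round(Number / sum(Number), 3)]
print(result_counts)
```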
Review Hazard Data for Chemical Lists
Hazard data associated with the chemicals in each chemical list can be retrieved. For each chemical, there are potentially hundreds of rows of hazard data, so the returned results will be much larger than in most other API endpoints. We model how one would structure such a query of the chemicals in CCL4 and NATADB, and leave it to the reader to explore the data in a similar fashion to the previous examples.
ccl4_hazard <- get_hazard_by_dtxsid_batch(DTXSID = ccl4$dtxsid)
natadb_hazard <- get_hazard_by_dtxsid_batch(DTXSID = natadb$dtxsid)
Next, it may be helpful to examine the dimensions and column names of the output.
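A minimal sketch of such an inspection follows. So that it runs without an API key, it uses a small placeholder table, toy_hazard, in place of ccl4_hazard. The identifiers and every column other than dtxsid are invented and do not reflect the actual structure of the API response.

```r
library(data.table)

# Placeholder standing in for ccl4_hazard; identifiers, values, and the
# 'value' and 'units' columns are invented for illustration.
toy_hazard <- data.table(
  dtxsid = c("CHEM_A", "CHEM_A", "CHEM_B"),
  value  = c(1.5, 20, 0.3),
  units  = c("mg/kg-day", "mg/kg-day", "mg/L")
)

hazard_dim  <- dim(toy_hazard)       # number of rows, number of columns
hazard_cols <- colnames(toy_hazard)  # column names
print(hazard_dim)
print(hazard_cols)

# Number of hazard rows returned for each chemical
rows_per_chemical <- toy_hazard[, .N, by = .(dtxsid)]
print(rows_per_chemical)
```

With the real results, replacing toy_hazard by ccl4_hazard or natadb_hazard should give the corresponding summaries, assuming the batch functions return a table with a dtxsid column.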
Conclusion
In this vignette, a variety of functions that access different types of data found in the Hazard endpoints of the CTX APIs were explored. While this exploration was not exhaustive, it provides a basic introduction to how one may access and work with the data. Additional endpoints and corresponding functions exist, and we encourage the user to explore these while keeping in mind the examples contained in this vignette.