Please send questions to wambaugh.john@epa.gov
from “Evaluation and calibration of high-throughput predictions of chemical distribution to tissues”
Robert G. Pearce, R. Woodrow Setzer, Jimena L. Davis,and John F. Wambaugh
Journal of Pharmacokinetics and Pharmacodynamics volume 44, pages549–565 (2017)
https://doi.org/10.1007/s10928-017-9548-7
Toxicokinetics (TK) provides critical information for integrating chemical toxicity and exposure assessments in order to determine potential chemical risk (i.e., the margin between toxic doses and plausible exposures). For thousands of chemicals that are present in our environment, in vivo TK data are lacking. The publicly available R package “httk” (version 1.8, named for “high throughput TK”) draws from a database of in vitro data and physico-chemical properties in order to run physiologically-based TK (PBTK) models for 553 compounds. The PBTK model parameters include tissue:plasma partition coefficients (Kp) which the httk software predicts using the model of Schmitt (Toxicol In Vitro 22 (2):457–467, 2008). In this paper we evaluated and modified httk predictions, and quantified confidence using in vivo literature data. We used 964 rat Kp measured by in vivo experiments for 143 compounds. Initially, predicted Kp were significantly larger than measured Kp for many lipophilic compounds (log10 octanol:water partition coefficient > 3). Hence the approach for predicting Kp was revised to account for possible deficiencies in the in vitro protein binding assay, and the method for predicting membrane affinity was revised. These changes yielded improvements ranging from a factor of 10 to nearly a factor of 10,000 for 83 Kp across 23 compounds with only 3 Kp worsening by more than a factor of 10. The vast majority (92%) of Kp were predicted within a factor of 10 of the measured value (overall root mean squared error of 0.59 on log10-transformed scale). After applying the adjustments, regressions were performed to calibrate and evaluate the predictions for 12 tissues. Predictions for some tissues (e.g., spleen, bone, gut, lung) were observed to be better than predictions for other tissues (e.g., skin, brain, fat), indicating that confidence in the application of in silico tools to predict chemical partitioning varies depending upon the tissues involved. Our calibrated model was then evaluated using a second data set of human in vivo measurements of volume of distribution (Vss) for 498 compounds reviewed by Obach et al. (Drug Metab Dispos 36(7):1385–1405, 2008). We found that calibration of the model improved performance: a regression of the measured values as a function of the predictions has a slope of 1.03, intercept of − 0.04, and R2 of 0.43. Through careful evaluation of predictive methods for chemical partitioning into tissues, we have improved and calibrated these methods and quantified confidence for TK predictions in humans and rats.
This vignette was created with httk v1.6. It was updated to httk v2.2.3 in April of 2023. Although we attempt to maintain backward compatibility, if you encounter issues with the latest release of httk and cannot easily address the changes, historical versions of httk are available from: https://cran.r-project.org/src/contrib/Archive/httk/
R package knitr generates html and PDF documents from this RMarkdown file, Each bit of code that follows is known as a “chunk”. We start by telling knitr how we want our chunks to look.
It is a bad idea to let variables and other information from previous R sessions float around, so we first remove everything in the R memory.
rm(list=ls())
If you are using the RMarkdown version of this vignette (extension, .RMD) you will be able to see that several chunks of code in this vignette have the statement “eval = execute.vignette”. The next chunk of code, by default, sets execute.vignette = FALSE. This means that the code is included (and necessary) but was not run when the vignette was built. We do this because some steps require extensive computing time and the checks on CRAN limit how long we can spend building the package. If you want this vignette to work, you must run all code, either by cutting and pasting it into R. Or, if viewing the .RMD file, you can either change execute.vignette to TRUE or press “play” (the green arrow) on each chunk in RStudio.
# Set whether or not the following chunks will be executed (run):
<- FALSE execute.vignette
We use the command ‘library()’ to load various R packages for our analysis. If you get the message “Error in library(X) : there is no package called ‘X’” then you will need to install that package:
From the R command prompt:
install.packages(“X”)
Or, if using RStudio, look for ‘Install Packages’ under ‘Tools’ tab.
library(httk)
library(gdata)
library(ggplot2)
library(viridis)
library(censReg)
library(gmodels)
library(gplots)
library(scales)
library(colorspace)
library(gridExtra)
<- function(x) {
scientific_10 <- gsub("1e", "10^", scientific_format()(x))
out <- gsub("\\+","",out)
out <- gsub("10\\^01","10",out)
out <- parse(text=gsub("10\\^00","1",out))
out }
We first filter the measured rat Kp data, pc.data. Then the old and new Kp predictions are made, along with error and improvement measures, and these are all consolidated into a table for analysis and plotting. Note that the final table contains log10-transformed values and error and improvements derived from subtracting these values. Only relevant rat values are used. Compounds with Funbound.plasma and partition coefficients of zero are removed as well as compounds with approximated Funbound.plasma values.
<- NULL
pc.table <- subset(pc.data,fu != 0 & Exp_PC != 0 & Tissue %in% c("Adipose","Bone","Brain","Gut",
pc.data "Heart","Kidney","Liver","Lung","Muscle","Skin","Spleen","Blood Cells") &
tolower(Species) == 'rat' & !CAS %in% c('10457-90-6','5786-21-0','17617-23-1','69-23-8','2898-12-6',
'57562-99-9','59-99-4','2955-38-6','155-97-5','41903-57-5','58-55-9','77-32-7','59-05-2','60-54-8'))
<- get_cheminfo(model='schmitt',species='rat',suppress.messages=TRUE)
cas.list <- cas.list[cas.list %in% pc.data[,'CAS']]
cas.list <- subset(chem.physical_and_invitro.data,!is.na(logMA))[,'CAS']
ma.data.list for(this.cas in cas.list){
<- parameterize_schmitt(
parameters chem.cas=this.cas,
species='rat',
suppress.messages=TRUE)
<- parameters
init.parameters <- calc_ionization(
charge chem.cas=this.cas,
pH=7.4)$fraction_charged
if(!this.cas %in% ma.data.list){
$MA <- 10^(0.999831 - 0.016578*38.7 + 0.881721 * log10(parameters$Pow))
init.parameters
}<- predict_partitioning_schmitt(
pcs parameters=parameters,
species='rat',
regression=FALSE,
suppress.messages=TRUE
)<- predict_partitioning_schmitt(
init.pcs parameters=init.parameters,
species='rat',
regression=FALSE,
suppress.messages=TRUE)
for(this.tissue in subset(pc.data,CAS==this.cas)[,'Tissue']){
if(this.tissue == 'Blood Cells') this.pc <- 'rbc'
else this.pc <- this.tissue
<- rbind(pc.table,
pc.table cbind(
as.data.frame(this.cas),
as.data.frame(this.tissue),
as.data.frame(log10(
which(substr(names(init.pcs),
init.pcs[[2,
nchar(names(init.pcs))-3) ==
tolower(this.pc))]] *
$Funbound.plasma)),
init.parametersas.data.frame(log10(
which(substr(names(pcs),
pcs[[2,
nchar(names(pcs))-3) ==
tolower(this.pc))]] *
$unadjusted.Funbound.plasma)),
parametersas.data.frame(log10(
which(substr(names(init.pcs),
init.pcs[[2,
nchar(names(init.pcs))-3) ==
tolower(this.pc))]] *
$unadjusted.Funbound.plasma)),
init.parametersas.data.frame(log10(
which(substr(names(pcs),
pcs[[2,
nchar(names(pcs))-3) == tolower(this.pc))]] *
$Funbound.plasma)),
parametersas.data.frame(log10(
subset(pc.data,
==this.cas & Tissue==this.tissue)[,'Exp_PC'])),
CASas.data.frame(subset(pc.data,
==this.cas & Tissue==this.tissue)[,'LogP']),
CASas.data.frame(charge),
as.data.frame(as.character(subset(pc.data,
== this.cas)[1,'A.B.N'])),
CAS as.data.frame(subset(pc.data,
== this.cas)[1,'fu'])))
CAS
}
}colnames(pc.table) <- c('CAS','Tissue',
'fup.correction',
'ma.correction',
'init.Predicted',
'Predicted',
'Experimental',
'logP',
'charge',
'type',
'fup')
<- pc.table[,'Experimental'] - pc.table[,'init.Predicted']
init.error <- pc.table[,'Experimental'] - pc.table[,'fup.correction']
fup.error <- pc.table[,'Experimental'] - pc.table[,'ma.correction']
ma.error <- pc.table[,'Experimental'] - pc.table[,'Predicted']
final.error <- abs(init.error) - abs(fup.error)
fup.improvement <- abs(init.error) - abs(ma.error)
ma.improvement <- abs(init.error) - abs(final.error)
final.improvement <- cbind(pc.table,fup.improvement,ma.improvement, final.improvement,
pc.table final.error,init.error,ma.error,fup.error)
<- ggplot() +
init.plot geom_point(data=pc.table,aes(10^(init.Predicted),10^(Experimental))) +
geom_abline() +
labs(y=expression(paste("Measured ",K[p])),
x=expression(paste("Predicted ",K[p]))) +
theme(axis.text=element_text(size=16),axis.title=element_text(size=16),
plot.title=element_text(size=18,hjust = 0.5)) +
scale_x_log10(label=scientific_10,limits=c(0.01,10^4.5)) +
scale_y_log10(label=scientific_10,limits=c(0.01,10^4.5)) +
ggtitle('(A)')
print(init.plot)
<- summary(lm(Experimental ~ init.Predicted,
init.stats data=pc.table))
<- ggplot() +
final.plot geom_point(data=pc.table,aes(10^(Predicted),10^(Experimental))) +
geom_abline() +
labs(y=expression(paste("Measured ",K[p])),
x=expression(paste("Predicted ",K[p]))) +
theme(axis.text=element_text(size=16),axis.title=element_text(size=16),
plot.title=element_text(size=18,hjust=0.5)) +
scale_x_log10(label=scientific_10,limits=c(0.01,10^4.5)) +
scale_y_log10(label=scientific_10,limits=c(0.01,10^4.5)) +
ggtitle('(B)')
print(final.plot)
<- summary(lm(Experimental ~ Predicted,
final.stats data=pc.table))
<- ggplot() +
fup.change.plot geom_point(data=pc.table[order(pc.table[,'fup.improvement'],decreasing=F),],
aes(10^(fup.correction),10^(Experimental),color=fup.improvement)) +
geom_abline() +
labs(y=expression(paste("Measured ",K[p])),
x=expression(paste("Predicted ",K[p])),color='Improvement') +
theme(axis.text=element_text(size=16),axis.title=element_text(size=16)) +
scale_x_log10(label=scientific_10,limits=c(0.01,10^4.5)) +
scale_y_log10(label=scientific_10,limits=c(0.01,10^4.5)) +
scale_color_viridis(direction=-1,option='inferno')
print(fup.change.plot)
<- summary(lm(Experimental ~ fup.correction,
fup.stats data=pc.table))
<- subset(pc.table,!CAS %in% ma.data.list)
ma.subset <- ggplot() +
ma.change.plot geom_point(data=ma.subset[order(ma.subset[,'ma.improvement']
decreasing=F),],
,aes(10^(ma.correction),10^(Experimental),color=ma.improvement)) +
geom_abline() +
labs(y=expression(paste("Measured ",K[p])),
x=expression(paste("Predicted ",K[p])),color='Improvement') +
theme(axis.text=element_text(size=16),axis.title=element_text(size=16)) +
scale_x_log10(label=scientific_10,limits=c(0.01,10^4.5)) +
scale_y_log10(label=scientific_10,limits=c(0.01,10^4.5)) +
scale_color_viridis(direction=-1,option='inferno')
print(ma.change.plot)
<- summary(lm(Experimental ~ ma.correction,
ma.stats data=ma.subset))
<- data.frame(Test=c("Initial Tissue PC Accuracy","Fup Lipid Correction","Membrane Affinity","Final"),
fup.table RSquared = signif(c(init.stats$adj.r.squared,
$adj.r.squared,
fup.stats$adj.r.squared,
ma.stats$adj.r.squared),3),
final.statsRMSLE = signif(c(mean(init.stats$residuals^2,na.rm=TRUE)^(1/2),
mean(fup.stats$residuals^2,na.rm=TRUE)^(1/2),
mean(ma.stats$residuals^2,na.rm=TRUE)^(1/2),
mean(final.stats$residuals^2,na.rm=TRUE)^(1/2)),3)
)::kable(fup.table) knitr
Now we calculate and plot the regressions for all tissues, together with their 95% confidence intervals.
<- NULL
regressions
for(tissue in as.character(unique(pc.table[,'Tissue']))){
<- lm(Experimental ~ Predicted ,data=subset(pc.table,Tissue==tissue))
fit <- summary(fit)
smry <- estimable(fit, cm=diag(2), beta0=c(0,1), joint.test=TRUE)
est <- rbind(regressions,cbind(tissue,as.data.frame(fit$coefficients[['(Intercept)']]),
regressions as.data.frame(fit$coefficients[['Predicted']]),
as.data.frame(smry$coefficients[['Predicted','Pr(>|t|)']]),
as.data.frame(smry$sigma),as.data.frame(smry$r.squared),
as.data.frame(smry[[11]][1,1]),as.data.frame(smry[[11]][2,2]),
as.data.frame(smry[[11]][1,2]),as.data.frame(smry$df[2]),as.data.frame(est[[3]])))
}colnames(regressions) <- c('Tissue','Intercept','Slope','P-value','SE','R-squared',
'Int Var','Slp Var','Cov','df','estimable')
for (this.col in 2:10) regressions[,this.col] <- signif(regressions[,this.col], 3)
<- regressions[order(regressions[,1]),]
regressions
::kable(regressions, caption = "Table 1: The regressions for each tissue, after fup and membrane affinity adjustments, of the log10-transformed measured Kp regressed on
knitrpredicted Kp")
write.table(regressions,
file=paste0("Pearce2017PCCalibration=",Sys.Date(),".txt"),
sep="/t",
row.names=FALSE)
<- seq(-2,3.5,.01)
x.cf for(tissue in as.character(unique(pc.table[,'Tissue'])))
{<- qt(0.975,df=subset(regressions,Tissue==tissue)[['df']]+1) *
conf subset(regressions,Tissue==tissue)[['SE']] *
sqrt(subset(regressions,Tissue==tissue)[['Int Var']] +
^2 * subset(regressions,Tissue==tissue)[['Slp Var']] +
x.cf2 * x.cf * subset(regressions,Tissue==tissue)[['Cov']] + 1)
<- subset(regressions,Tissue==tissue)[['Intercept']] +
line * subset(regressions,Tissue==tissue)[['Slope']]
x.cf <- line + conf
y.cf <- line - conf
y.ncf
<- cbind(as.data.frame(x.cf),as.data.frame(y.cf),as.data.frame(y.ncf))
cf if(tissue == 'Blood Cells'){
eval(parse(text= paste('Blood <- ggplot() +
geom_abline(linetype = "dashed") +
geom_point(data=subset(pc.table,Tissue == \'',tissue,'\'),aes(10^(Predicted),
10^(Experimental))) + theme(axis.text=element_text(size=14),
axis.title=element_text(size=14),plot.title=element_text(size=14)) +
scale_x_log10(label=scientific_10,limits=c(0.01,1000)) +
scale_y_log10(label=scientific_10,limits=c(0.01,1000)) +
ylab(ifelse(tissue=="Brain",expression(paste("Inferred ",K[p])),"")) +
xlab(ifelse(tissue=="Skin", expression(paste("Predicted ",K[p])),"")) +
geom_line(data=cf,aes(10^(x.cf),10^(y.cf))) +
geom_line(data=cf,aes(10^(x.cf),10^(y.ncf))) +
geom_abline(intercept=subset(regressions,Tissue==tissue)[[\'Intercept\']],
slope=subset(regressions,Tissue==tissue)[[\'Slope\']]) +
ggtitle(\'Red Blood Cells\')',sep='')))
else{
}eval(parse(text= paste(tissue,' <- ggplot() + labs(y=expression(paste("Measured ",K[p]))
,x=expression(paste("Predicted ",K[p]))) + geom_abline(linetype = "dashed") +
geom_point(data=subset(pc.table,Tissue == \'',tissue,'\'),
aes(10^(Predicted),10^(Experimental))) + theme(axis.text=element_text(size=14),
axis.title=element_text(size=14),plot.title=element_text(size=14)) +
scale_x_log10(label=scientific_10,limits=c(0.01,1000)) +
scale_y_log10(label=scientific_10,limits=c(0.01,1000)) +
ylab(ifelse(tissue=="Brain",expression(paste("Inferred ",K[p])),"")) +
xlab(ifelse(tissue=="Skin", expression(paste("Predicted ",K[p])),"")) +
geom_line(data=cf,aes(10^(x.cf),10^(y.cf))) +
geom_line(data=cf,aes(10^(x.cf),10^(y.ncf))) +
geom_abline(intercept=subset(regressions,Tissue==tissue)[[\'Intercept\']],
slope=subset(regressions,Tissue==tissue)[[\'Slope\']]) +
ggtitle(\'',tissue,'\')',sep='')))
}
}
grid.arrange(Adipose, Blood, Bone, Brain, Gut, Heart, Kidney, Liver, Lung, Muscle, Skin, Spleen, nrow=4)
In vivo volume of distribution data are compared with the predictions, and errors are calculated. Regressions and improvements are calculated and plotted.
<- subset(Obach2008,CAS %in% get_cheminfo(model='schmitt'))
obach <- NULL
vd.table
for(this.cas in obach[,'CAS']){
<- parameterize_schmitt(
parameters chem.cas=this.cas,
suppress.messages=TRUE)
<- parameters
init.parameters if(!this.cas %in% ma.data.list){
$MA <- 10^(0.999831 - 0.016578*37 + 0.881721 * log10(parameters$Pow))
init.parameters
}<- predict_partitioning_schmitt(
pcs parameters=parameters,
regression=FALSE,
suppress.messages=TRUE)
<- predict_partitioning_schmitt(
init.pcs parameters=init.parameters,
regression=FALSE,
suppress.messages=TRUE)
<- predict_partitioning_schmitt(
reg.pcs parameters=parameters,
regression=TRUE,
suppress.messages=TRUE)
<- calc_vdist(
vdist parameters=c(pcs,Funbound.plasma=parameters$Funbound.plasma),
suppress.messages=TRUE)
<- calc_vdist(
init.vdist parameters=c(
init.pcs,Funbound.plasma=parameters$unadjusted.Funbound.plasma),
suppress.messages = TRUE)
<- calc_vdist(
reg.vdist parameters=c(reg.pcs,Funbound.plasma=parameters$Funbound.plasma),
suppress.messages = TRUE)
<- rbind(
vd.table
vd.table,cbind(as.data.frame(this.cas),as.data.frame(log10(init.vdist)),
as.data.frame(log10(vdist)),as.data.frame(log10(reg.vdist)),
as.data.frame(log10(subset(obach,CAS==this.cas)[['VDss (L/kg)']]))))
}colnames(vd.table) <- c(
'CAS',
'init.vdist',
'corrected.vdist',
'calibrated.vdist',
'Experimental')
<- vd.table[,'Experimental'] - vd.table[,'init.vdist']
init.error <- vd.table[,'Experimental'] - vd.table[,'corrected.vdist']
correction.error <- vd.table[,'Experimental'] - vd.table[,'calibrated.vdist']
calibration.error <- abs(init.error) - abs(correction.error)
correction.improvement <- abs(correction.error) - abs(calibration.error)
calibration.improvement <- cbind(vd.table,correction.improvement,calibration.improvement,
vd.table init.error,correction.error,calibration.error)
<- lm(Experimental ~ calibrated.vdist,data=vd.table)
fit <- summary(fit)
smry <- cbind(as.data.frame(fit$coefficients['(Intercept)']),
calibrated.reg as.data.frame(fit$coefficients['calibrated.vdist']),
as.data.frame(smry$coefficients['calibrated.vdist','Pr(>|t|)']),
as.data.frame(smry$sigma),as.data.frame(smry$r.squared))
<- lm(Experimental ~ init.vdist,data=vd.table)
fit <- summary(fit)
smry <- cbind(as.data.frame(fit$coefficients['(Intercept)']),
init.reg as.data.frame(fit$coefficients['init.vdist']),
as.data.frame(smry$coefficients['init.vdist','Pr(>|t|)']),
as.data.frame(smry$sigma),as.data.frame(smry$r.squared))
<- lm(Experimental ~ corrected.vdist,data=vd.table)
fit <- summary(fit)
smry <- cbind(as.data.frame(fit$coefficients['(Intercept)']),
corrected.reg as.data.frame(fit$coefficients['corrected.vdist']),
as.data.frame(smry$coefficients['corrected.vdist','Pr(>|t|)']),
as.data.frame(smry$sigma),as.data.frame(smry$r.squared))
colnames(init.reg) <- colnames(corrected.reg) <-
colnames(calibrated.reg) <- c('Intercept','Slope','P-value','Std Err','R-squared')
<- ggplot(vd.table,aes(10^(init.vdist),10^(Experimental))) + geom_point() +
init.vd.plot geom_abline(intercept = init.reg[['Intercept']], slope = init.reg[["Slope"]]) +
geom_abline(linetype = "dashed") + xlab("Predicted Volume of Distribution") +
ylab("Measured Volume of Distribution") + theme(axis.text=element_text(size=16),
axis.title=element_text(size=16),plot.title=element_text(size=18,hjust = 0.5)) +
scale_x_log10(label=scientific_10,limits=c(10^(-1.5),10^(8.5))) +
scale_y_log10(label=scientific_10,limits=c(10^(-1.5),10^(8.5))) +
ggtitle('(A)')
print(init.vd.plot)
<- ggplot() +
correction.plot geom_point(data=vd.table[order(vd.table[,'correction.improvement'],decreasing=F),],
aes(10^(corrected.vdist),10^(Experimental),color=correction.improvement)) +
geom_abline(intercept = corrected.reg[['Intercept']],slope = corrected.reg[["Slope"]]) +
geom_abline(linetype = "dashed") + xlab("Predicted Volume of Distribution") +
ylab("Measured Volume of Distribution") + theme(legend.position = c(.95, .95),
legend.justification = c("right", "top"),legend.box.just = "right",
legend.margin = margin(6, 6, 6, 6),axis.text=element_text(size=16),
axis.title=element_text(size=16),plot.title=element_text(size=18,hjust = 0.5)) +
scale_x_log10(limits=c(10^(-1.5),10^(3))) + scale_y_log10(limits=c(10^(-1.5),10^(3))) +
ggtitle('(B)') + scale_color_viridis(direction=-1,option='inferno')
print(correction.plot)
<- ggplot() +
calibration.plot geom_point(data=vd.table[order(vd.table[,'calibration.improvement'],decreasing=F),],
aes(10^(calibrated.vdist),10^(Experimental),color=calibration.improvement)) +
geom_abline(intercept = calibrated.reg[['Intercept']],slope = calibrated.reg[["Slope"]]) +
geom_abline(linetype = "dashed") + xlab("Predicted Volume of Distribution") +
ylab("Measured Volume of Distribution") + theme(legend.position = c(.95, .95),
legend.justification = c("right", "top"),legend.box.just = "right",
legend.margin = margin(6, 6, 6, 6),axis.text=element_text(size=16),
axis.title=element_text(size=16),plot.title=element_text(size=18,hjust = 0.5)) +
scale_x_log10(limits=c(10^(-1.5),10^(3))) + scale_y_log10(limits=c(10^(-1.5),10^(3))) +
ggtitle('(C)') + scale_color_viridis(direction=-1,option='inferno')
print(calibration.plot)
Now we pull in the in vivo blood to plasma ratios, use these to calculate the inferred red blood cell to plasma ratios, and then make predictions for these values. A censored regressions is performed, and predictions are plotted against errors.
<- subset(chem.physical_and_invitro.data,!is.na(Human.Rblood2plasma))
rb2p.data <- NULL
measured.rb2p <- NULL
measured.krbc <- NULL
predicted.rb2p <- NULL
predicted.krbc <- NULL
cas <- NULL
charge <- NULL
fup <- NULL
logP <- NULL
pka_donor <- NULL
pka_accept for(this.cas in rb2p.data[rb2p.data[,'CAS'] %in%
get_cheminfo(model='schmitt', suppress.messages = TRUE),'CAS'])
{<- get_rblood2plasma(chem.cas=this.cas)
rb2p <- (rb2p + .44 - 1) / 0.44
krbc <- c(measured.rb2p,rb2p)
measured.rb2p <- c(measured.krbc,krbc)
measured.krbc <- parameterize_schmitt(
parameters chem.cas=this.cas,
suppress.messages = TRUE)
<- predict_partitioning_schmitt(
pcs parameters=parameters,
suppress.messages=TRUE)
<- c(predicted.krbc,pcs[['Krbc2pu']] * parameters$Funbound.plasma)
predicted.krbc <- c(cas,this.cas)
cas <- c(charge,calc_ionization(chem.cas=this.cas,pH=7.4)$fraction_charged)
charge <- c(fup,parameters$unadjusted.Funbound.plasma)
fup <- c(logP,log10(parameters$Pow))
logP <- c(pka_donor,paste(parameters$pKa_Donor,collapse=','))
pka_donor <- c(pka_accept,paste(parameters$pKa_Accept,collapse=','))
pka_accept
}<- 1 - 0.44 + 0.44 * predicted.krbc
predicted.rb2p <- cbind(as.data.frame(cas),as.data.frame(predicted.rb2p),as.data.frame(measured.rb2p))
rb2p.table colnames(rb2p.table) <- c('cas','predicted.rb2p','measured.rb2p')
<- log10(rb2p.table[,'measured.rb2p']) - log10(rb2p.table[,'predicted.rb2p'])
error <- cbind(rb2p.table,error,charge,fup,logP)
rb2p.table
<- log10(measured.krbc) - log10(predicted.krbc)
error <- cbind(as.data.frame(cas),as.data.frame(predicted.krbc),as.data.frame(measured.krbc),
krbc.table as.data.frame(error),charge,fup,logP,pka_donor,pka_accept)
<- data.frame(x = predicted.krbc,
pdta y = measured.krbc)
$y[pdta$y <= 0.1] <- 0.1
pdta$Censoring <- factor(c("Not Censored","Censored")[as.numeric(pdta$y <= 0.1) + 1])
pdta<- measured.krbc
y <- cbind(rep(1, length(y)),-1 * log10(predicted.krbc))
x colnames(x) <- c("Intercept","Predicted")
<- as.numeric(y <= 0.1)
cc < 0.1] <- 0.1
y[y <- -log10(y)
y
<- censReg(y~x, data = pdta, left=0.1)
out $betas <- out$estimate out
<- ggplot() +
censored.regression geom_point(data=pdta,aes(x=x,y=y, color=Censoring)) +
scale_x_log10(limits=c(.0009,40)) + scale_y_log10(limits=c(.1,4),breaks=c(.1,.5,2.5)) +
labs(y=expression(paste("Inferred ",K[p])),x=expression(paste("Predicted ",K[p]))) +
geom_abline(intercept=0, slope=1, linetype='dashed') +
theme(axis.text=element_text(size=16),axis.title=element_text(size=16),
plot.title=element_text(size=18,hjust=0.5),legend.position = c(0.11, .8)) +
geom_abline(slope=out$betas[2],intercept=-out$betas[1]) + ggtitle('(B)')
print(censored.regression)
<- ggplot(rb2p.table,aes(predicted.rb2p,measured.rb2p)) +
rb2p.plot geom_point() + scale_x_log10(lim=c(.52,18)) +
scale_y_log10(lim=c(.52,2.5),breaks=c(0.5,1,2)) + geom_abline(linetype='dashed') +
labs(y=expression(paste("Measured Whole Blood ",K[p])),
x=expression(paste("Predicted Whole Blood ",K[p]))) +
theme(axis.text=element_text(size=16),axis.title=element_text(size=16),
plot.title=element_text(size=18,hjust=0.5)) + ggtitle('(A)')
print(rb2p.plot)
Lastly, we make the heatmap.
<- NULL
heatmap.table for(this.cas in get_cheminfo(model='schmitt')){
<- parameterize_schmitt(
parms chem.cas=this.cas,
suppress.messages = TRUE)
<- predict_partitioning_schmitt(
pcs parameters=parms,
suppress.messages = TRUE)
<- cbind(heatmap.table,log10(unlist(pcs)[1:11]*parms$Funbound.plasma))
heatmap.table
}rownames(heatmap.table) <- c('Adipose','Bone','Brain','Gut','Heart',
'Kidney','Liver','Lung','Muscle','Skin','Spleen')
colnames(heatmap.table) <- rep("",dim(heatmap.table)[2])
<- function (n, h = c(260, -328), c = 80, l = c(30, 100), power = 1.5,
pal fixup = TRUE, gamma = NULL, alpha = 1, ...)
{if (!is.null(gamma))
warning("'gamma' is deprecated and has no effect")
if (n < 1L)
return(character(0L))
<- rep(h, length.out = 2L)
h <- c[1L]
c <- rep(l, length.out = 2L)
l <- rep(power, length.out = 2L)
power <- seq(1, -1, length = n)
rval <- hex(polarLUV(L = l[2L] - diff(l) * abs(rval)^power[2L],
rval C = c * abs(rval)^power[1L], H = ifelse(rval > 0, h[1L],
fixup = fixup, ...)
h[2L])), if (!missing(alpha)) {
<- pmax(pmin(alpha, 1), 0)
alpha <- format(as.hexmode(round(alpha * 255 + 1e-04)),
alpha width = 2L, upper.case = TRUE)
<- paste(rval, alpha, sep = "")
rval
}return(rval)
}
<- function(x) hclust(x, method="ward.D2")
hclust.ave heatmap.2(heatmap.table,col=pal,trace="none", hclustfun=hclust.ave,
key.xlab=expression(paste("log10 ",K[p]," Value")),
key.ylab=expression(paste("Number of ",K[p])),
key.title="Partition Coefficient",xlab="Chemicals",cex.lab=2,margins=c(2,5))