The R2ucare
package contains R
functions to
perform goodness-of-fit tests for capture-recapture models. It also has
various functions to manipulate capture-recapture data.
First things first, load the R2ucare
package:
library(R2ucare)
There are 3 main data formats when manipulating capture-recapture
data, corresponding to the 3 main computer software available to fit
corresponding models: RMark
, E-SURGE
and
Mark
. With R2ucare
, it is easy to work with
any of these formats. We will use the classical dipper dataset, which is
provided with the package (thanks to Gilbert Marzolin for sharing his
data).
RMark
file# # read in text file as described at pages 50-51 in http://www.phidot.org/software/mark/docs/book/pdf/app_3.pdf
<- system.file("extdata", "dipper.txt", package = "RMark")
dipper <- RMark::import.chdata(dipper, field.names = c("ch", "sex"), header = FALSE)
dipper <- as.data.frame(table(dipper))
dipper str(dipper)
## 'data.frame': 64 obs. of 3 variables:
## $ ch : Factor w/ 32 levels "0000001","0000010",..: 1 2 3 4 5 6 7 8 9 10 ...
## $ sex : Factor w/ 2 levels "Female","Male": 1 1 1 1 1 1 1 1 1 1 ...
## $ Freq: int 22 12 11 7 6 6 10 1 1 5 ...
Get encounter histories, counts and groups:
<- matrix(as.numeric(unlist(strsplit(as.character(dipper$ch),""))),
dip.hist nrow = length(dipper$ch),
byrow = T)
<- dipper$Freq
dip.freq <- dipper$sex
dip.group head(dip.hist)
## [,1] [,2] [,3] [,4] [,5] [,6] [,7]
## [1,] 0 0 0 0 0 0 1
## [2,] 0 0 0 0 0 1 0
## [3,] 0 0 0 0 0 1 1
## [4,] 0 0 0 0 1 0 0
## [5,] 0 0 0 0 1 1 0
## [6,] 0 0 0 0 1 1 1
head(dip.freq)
## [1] 22 12 11 7 6 6
head(dip.group)
## [1] Female Female Female Female Female Female
## Levels: Female Male
E-SURGE
filesLet’s use the read_headed
function.
<- system.file("extdata", "ed.txt", package = "R2ucare")
dipper <- read_headed(dipper) dipper
Get encounter histories, counts and groups:
<- dipper$encounter_histories
dip.hist <- dipper$sample_size
dip.freq <- dipper$groups
dip.group head(dip.hist)
## [,1] [,2] [,3] [,4] [,5] [,6] [,7]
## [1,] 1 1 1 1 1 1 0
## [2,] 1 1 1 1 0 0 0
## [3,] 1 1 0 0 0 0 0
## [4,] 1 1 0 0 0 0 0
## [5,] 1 1 0 0 0 0 0
## [6,] 1 1 0 0 0 0 0
head(dip.freq)
## [1] 1 1 1 1 1 1
head(dip.group)
## [1] "male" "male" "male" "male" "male" "male"
Mark
filesLet’s use the read_inp
function.
<- system.file("extdata", "ed.inp", package = "R2ucare")
dipper <- read_inp(dipper, group.df = data.frame(sex = c("Male", "Female"))) dipper
Get encounter histories, counts and groups:
<- dipper$encounter_histories
dip.hist <- dipper$sample_size
dip.freq <- dipper$groups
dip.group head(dip.hist)
## [,1] [,2] [,3] [,4] [,5] [,6] [,7]
## [1,] 1 1 1 1 1 1 0
## [2,] 1 1 1 1 0 0 0
## [3,] 1 1 0 0 0 0 0
## [4,] 1 1 0 0 0 0 0
## [5,] 1 1 0 0 0 0 0
## [6,] 1 1 0 0 0 0 0
head(dip.freq)
## [1] 1 1 1 1 1 1
head(dip.group)
## [1] "Male" "Male" "Male" "Male" "Male" "Male"
Split the dataset in females/males:
<- (dip.group == "Female")
mask <- dip.hist[mask,]
dip.fem.hist <- dip.freq[mask]
dip.fem.freq <- (dip.group == "Male")
mask <- dip.hist[mask,]
dip.mal.hist <- dip.freq[mask] dip.mal.freq
Tadaaaaan, now you’re ready to perform Test.3Sr, Test3.Sm, Test2.Ct and Test.Cl for females:
<- test3sr(dip.fem.hist, dip.fem.freq)
test3sr_females <- test3sm(dip.fem.hist, dip.fem.freq)
test3sm_females <- test2ct(dip.fem.hist, dip.fem.freq)
test2ct_females <- test2cl(dip.fem.hist, dip.fem.freq)
test2cl_females # display results:
test3sr_females
## $test3sr
## stat df p_val sign_test
## 4.985 5.000 0.418 1.428
##
## $details
## component stat p_val signed_test test_perf
## 1 2 0.858 0.354 0.926 Chi-square
## 2 3 3.586 0.058 1.894 Chi-square
## 3 4 0.437 0.509 0.661 Chi-square
## 4 5 0.103 0.748 -0.321 Chi-square
## 5 6 0.001 0.982 0.032 Chi-square
test3sm_females
## $test3sm
## stat df p_val
## 2.041 3.000 0.564
##
## $details
## component stat df p_val test_perf
## 1 2 1.542 1 0.214 Fisher
## 2 3 0 1 1 Fisher
## 3 4 0.499 1 0.48 Fisher
## 4 5 0 0 0 None
## 5 6 0 0 0 None
test2ct_females
## $test2ct
## stat df p_val sign_test
## 3.250 4.000 0.517 -0.901
##
## $details
## component dof stat p_val signed_test test_perf
## 1 2 1 0 1 0 Fisher
## 2 3 1 0 1 0 Fisher
## 3 4 1 0 1 0 Fisher
## 4 5 1 3.25 0.071 -1.803 Fisher
test2cl_females
## $test2cl
## stat df p_val
## 0 0 1
##
## $details
## component dof stat p_val test_perf
## 1 2 0 0 0 None
## 2 3 0 0 0 None
## 3 4 0 0 0 None
Or perform all tests at once:
overall_CJS(dip.fem.hist, dip.fem.freq)
## chi2 degree_of_freedom p_value
## Gof test for CJS model: 10.276 12 0.592
What to do if these tests are significant? If you detect a transient
effect, then 2 age classes should be considered on the survival
probability to account for this issue, which is straightforward to do in
RMark
(Cooch and White 2017; appendix
C) or E-SURGE
(Choquet et al. 2009). If trap dependence
is significant, Cooch
and White (2017) illustrate how to use a time-varying individual
covariate to account for this effect in RMark
, while Gimenez
et al. (2003) suggest the use of multistate models that can be
fitted with RMark
or E-SURGE
, and Pradel
and Sanz (2012) recommend multievent models that can be fitted in
E-SURGE
only.
Now how to assess the fit of a model including trap-dependence and/or transience? For example, let’s assume we detected a significant effect of trap-dependence, we accounted for it in a model, and now we’d like to know whether our efforts paid off. Because the overall statistic is the sum of the four single components (Test.3Sr, Test3.Sm, Test2.Ct and Test.Cl), we obtain a test for the model with trap-dependence as follows:
<- overall_CJS(dip.fem.hist, dip.fem.freq) # overall test
overall_test <- test2ct(dip.fem.hist, dip.fem.freq) # test for trap-dependence
twoct_test <- overall_test$chi2 - twoct_test$test2ct["stat"] # overall stat - 2CT stat
stat_tp <- overall_test$degree_of_freedom - twoct_test$test2ct["df"] # overall dof - 2CT dof
df_tp <- 1 - pchisq(stat_tp, df_tp) # compute p-value for null hypothesis:
pvalue # model with trap-dep fits the data well
pvalue
## stat
## 0.5338301
Read in the geese dataset. It is provided with the package (thanks to Jay Hestbeck for sharing his data).
<- system.file("extdata", "geese.inp", package = "R2ucare")
geese <- read_inp(geese) geese
Get encounter histories and number of individuals with corresponding histories
<- geese$encounter_histories
geese.hist <- geese$sample_size geese.freq
And now, perform Test3.GSr, Test.3.GSm, Test3G.wbwa, TestM.ITEC and TestM.LTEC:
test3Gsr(geese.hist, geese.freq)
## $test3Gsr
## stat df p_val
## 117.753 12.000 0.000
##
## $details
## occasion site stat df p_val test_perf
## 1 2 1 3.894777e-03 1 9.502378e-01 Chi-square
## 2 2 2 2.715575e-04 1 9.868523e-01 Chi-square
## 3 2 3 8.129814e+00 1 4.354322e-03 Chi-square
## 4 3 1 1.139441e+01 1 7.366526e-04 Chi-square
## 5 3 2 2.707742e+00 1 9.986223e-02 Chi-square
## 6 3 3 3.345916e+01 1 7.277633e-09 Chi-square
## 7 4 1 1.060848e+01 1 1.125702e-03 Chi-square
## 8 4 2 3.533332e-01 1 5.522323e-01 Chi-square
## 9 4 3 1.016778e+01 1 1.429165e-03 Chi-square
## 10 5 1 1.101349e+01 1 9.045141e-04 Chi-square
## 11 5 2 1.292013e-01 1 7.192616e-01 Chi-square
## 12 5 3 2.978513e+01 1 4.826802e-08 Chi-square
test3Gsm(geese.hist, geese.freq)
## $test3Gsm
## stat df p_val
## 302.769 119.000 0.000
##
## $details
## occasion site stat df p_val test_perf
## 1 2 1 23.913378 14 4.693795e-02 Fisher
## 2 2 2 24.810007 16 7.324561e-02 Fisher
## 3 2 3 11.231939 8 1.889004e-01 Fisher
## 4 3 1 36.521484 14 8.712879e-04 Fisher
## 5 3 2 21.365358 17 2.103727e-01 Chi-square
## 6 3 3 23.072982 10 1.048037e-02 Fisher
## 7 4 1 55.338866 8 3.793525e-09 Fisher
## 8 4 2 17.172011 11 1.028895e-01 Fisher
## 9 4 3 45.089296 10 2.095523e-06 Chi-square
## 10 5 1 9.061514 3 2.848411e-02 Chi-square
## 11 5 2 5.974357 4 2.010715e-01 Chi-square
## 12 5 3 29.217786 4 7.060092e-06 Chi-square
test3Gwbwa(geese.hist, geese.freq)
## $test3Gwbwa
## stat df p_val
## 472.855 20.000 0.000
##
## $details
## occasion site stat df p_val test_perf
## 1 2 1 19.5914428 2 5.568936e-05 Chi-square
## 2 2 2 37.8676763 2 5.986026e-09 Chi-square
## 3 2 3 4.4873614 1 3.414634e-02 Fisher
## 4 3 1 80.5903050 1 2.777187e-19 Chi-square
## 5 3 2 98.7610833 4 1.805369e-20 Chi-square
## 6 3 3 0.8071348 1 3.689687e-01 Fisher
## 7 4 1 27.7054638 1 1.412632e-07 Chi-square
## 8 4 2 53.6936048 2 2.190695e-12 Chi-square
## 9 4 3 25.2931602 1 4.924519e-07 Chi-square
## 10 5 1 43.6547442 1 3.917264e-11 Chi-square
## 11 5 2 50.9264976 2 8.738795e-12 Chi-square
## 12 5 3 29.4763896 2 3.974507e-07 Chi-square
testMitec(geese.hist, geese.freq)
## $testMitec
## stat df p_val
## 68.225 27.000 0.000
##
## $details
## occasion stat df p_val test_perf
## 1 2 14.26786 9 0.1131110166 Chi-square
## 2 3 30.83845 9 0.0003155246 Chi-square
## 3 4 23.11896 9 0.0059346109 Chi-square
testMltec(geese.hist, geese.freq)
## $testMltec
## stat df p_val
## 20.989 19.000 0.337
##
## $details
## occasion stat df p_val test_perf
## 1 2 14.102598 10 0.1683630 Chi-square
## 2 3 6.886428 9 0.6489427 Chi-square
Or all tests at once:
overall_JMV(geese.hist, geese.freq)
## chi2 degree_of_freedom p_value
## Gof test for JMV model: 982.588 197 0
If trap-dependence or transience is significant, you may account for these lacks of fit as in the Cormack-Jolly-Seber model example. If there are signs of a memory effect, it gets a bit trickier but you may fit a model to account for this issue using hidden Markov models (also known as multievent models when applied to capture-recapture data).
Several U-CARE
functions to manipulate and process
capture-recapture data can be mimicked with R
built-in
functions. For example, recoding multisite/state encounter histories in
single-site/state ones is easy:
# Assuming the geese dataset has been read in R (see above):
> 1] <- 1 geese.hist[geese.hist
So is recoding states:
# Assuming the geese dataset has been read in R (see above):
== 3] <- 2 # all 3s become 2s geese.hist[geese.hist
Also, reversing time is not that difficult:
# Assuming the female dipper dataset has been read in R (see above):
t(apply(dip.fem.hist, 1, rev))
What about cleaning data, i.e. deleting empty histories, and histories with no individuals?
# Assuming the female dipper dataset has been read in R (see above):
= (apply(dip.fem.hist, 1, sum) > 0 & dip.fem.freq > 0) # select non-empty histories, and histories with at least one individual
mask sum(!mask) # how many histories are to be dropped?
# drop these histories from dataset
dip.fem.hist[mask,] # from counts dip.fem.freq[mask]
Selecting or gathering occasions is as simple as that:
# Assuming the female dipper dataset has been read in R (see above):
c(1,4,6)] # pick occasions 1, 4 and 6 (might be a good idea to clean the resulting dataset)
dip.fem.hist[, <- apply(dip.fem.hist[,c(1,4,6)], 1, max) # gather occasions 1, 4 and 6 by taking the max
gather_146 1] <- gather_146 # replace occasion 1 by new occasion
dip.fem.hist[,<- dip.fem.hist[, -c(4,6)] # drop occasions 4 and 6 dip.fem.hist
Finally, suppressing the first encounter is achieved as follows:
# Assuming the geese dataset has been read in R (see above):
for (i in 1:nrow(geese.hist)){
<- min(which(geese.hist[i,] != 0)) # look for occasion of first encounter
occasion_first_encounter <- 0 # replace the first non zero by a zero
geese.hist[i, occasion_first_encounter]
}# delete empty histories from the new dataset
<- (apply(geese.hist, 1, sum) > 0) # select non-empty histories
mask sum(!mask) # how many histories are to be dropped?
# drop these histories from dataset
geese.hist[mask,] # from counts geese.freq[mask]
Besides these simple manipulations, several useful
U-CARE
functions needed a proper R
equivalent.
I coded a few of them, see the list below and ?name-of-the-function for
more details.
marray
build the m-array for single-site/state
capture-recapture data;multimarray
build the m-array for multi-site/state
capture-recapture data;group_data
pool together individuals with the same
encounter capture-recapture history.ungroup_data
split encounter capture-recapture
histories in individual ones.