This vignette introduces you to sizeMat
package and provide a way to estimate Size at Morphometric and Gonad Maturity.
install.packages("sizeMat")
Or the development version from github:
# install.packages("devtools")
devtools::install_github("ejosymart/sizeMat")
For estimating morphometric maturity use the crabdata base. The data set containing allometric measures and other attributes (year, month, sex category) of 223 crabs of the specie Chionectes tanneri.
data(crabdata)
head(crabdata)
## year month carapace_width carapace_length chela_height chela_width
## 1 1974 1 106 107 14.0 22
## 2 1974 1 129 129 27.0 44
## 3 1974 1 119 122 14.6 23
## 4 1974 1 115 118 18.6 29
## 5 1974 1 97 97 11.0 17
## 6 1974 1 94 96 10.0 15
## sex_category
## 1 m
## 2 m
## 3 m
## 4 m
## 5 m
## 6 m
names(crabdata)
## [1] "year" "month" "carapace_width" "carapace_length"
## [5] "chela_height" "chela_width" "sex_category"
The estimation of morphometric maturity involves two processes:
First the data classification, which is referred to the classification of the individuals in two groups. The second is the estimation. This process uses the previous classification to estimate the size at morphometric maturity. The details are given below.
The classify_mature
function, classify the individuals in two groups (juveniles = 0 and adult = 1). The classification analysis is based on Principal Components Analysis with two allometric variables (x: independent variable, y: dependent variable) in log base, allowing to distinguish two groups that would represent juveniles and adult. The individuals are assigned to each group using a hierarchical classification procedure (hierarchical cluster with agglomeration method: “Ward.D” and the distance measure: “euclidean”). This method is based on establishing a predetermined number of groups (in this case, two) and assigning individuals to one of the groups according to their loads on the two axes of the PCA (Corgos and Freire, 2006).
Using the results of the classification (PCA + cluster), a discriminant analysis (linear or quadratic) is conducted to obtain a discriminating function that permitted any individuals to be classified as a juvenile or an adult on the basis of the X and Y variables.
The classify_mature
function requires a data.frame (e.g. crabdata) with allometric variables and sex category. The argument varNames
requires the name of two allometric variables only, and varSex
requires the name of the variable containing sex information. If the argument selecSex
is NULL
all the individuals will be used in the classification analysis. Finally the method
is focus in the discriminant analysis to be used (“ld”: linear discriminant analysis, “qd”: quadratic discriminant analysis). We recommend begin the analysis with the method = "ld"
.
The classify_mature
function returns an object of class “classify”, with the allometric variables “x” (independent) - “y”" (dependent), and classification of maturity (juveniles = 0, adult = 1).
#For all the individuals
classify_data = classify_mature(crabdata, varNames = c("carapace_width", "chela_height"),
varSex = "sex_category", selectSex = NULL, method = "ld")
## all individuals were used in the analysis
#For males only
classify_data_males = classify_mature(crabdata, varNames = c("carapace_width", "chela_height"),
varSex = "sex_category", selectSex = "m", method = "ld")
## only m-sex were used in the analysis
Print the results of the “classify_data” object. It shows the number of juveniles and adults after classification and the linear regression analysis for juveniles and adults.
print(classify_data)
## Number in juvenile group = 83
##
## Number in adult group = 140
##
## --------------------------------------------------------
## 1) Linear regression for juveniles
##
## Call:
## glm(formula = y ~ x, data = juv)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -2.77010 -0.57399 0.09397 0.56605 1.99008
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -3.794687 0.497056 -7.634 3.93e-11 ***
## x 0.161327 0.004701 34.314 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for gaussian family taken to be 0.7320842)
##
## Null deviance: 921.306 on 82 degrees of freedom
## Residual deviance: 59.299 on 81 degrees of freedom
## AIC: 213.63
##
## Number of Fisher Scoring iterations: 2
##
## --------------------------------------------------------
## 2) Linear regression for adults
##
## Call:
## glm(formula = y ~ x, data = adt)
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -3.3055 -1.0932 -0.0628 1.1178 3.2759
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -11.246726 1.199496 -9.376 <2e-16 ***
## x 0.273837 0.008648 31.663 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for gaussian family taken to be 2.265729)
##
## Null deviance: 2584.24 on 139 degrees of freedom
## Residual deviance: 312.67 on 138 degrees of freedom
## AIC: 515.79
##
## Number of Fisher Scoring iterations: 2
##
## --------------------------------------------------------
## 3) Difference between slopes (ANCOVA)
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -3.7946869 0.757105677 -5.012097 1.109526e-06
## x 0.1613275 0.007161179 22.528064 6.035478e-59
## mature -7.4520389 1.285219562 -5.798261 2.320729e-08
## x:mature 0.1125093 0.010361046 10.858878 2.956242e-22
## [1] "slopes are different"
The example shows the arguments that can be used in the plot for “classify_data” object:
par(mfrow = c(2,2))
plot(classify_data)
plot(classify_data, xlab = "Carapace width (mm.)", ylab = "Chela height (mm)", legendPlot = FALSE)
legend("topleft", "Put your legend here", bty = "n")
plot(classify_data, xlab = "Carapace width (mm.)", ylab = "Chela height (mm)",
col = c(2, 3), pch = c(5, 6), legendPlot = TRUE)
plot(classify_data, xlab = "Carapace width (mm.)", ylab = "Chela height (mm)",
col = c(2, 3), pch = c(5, 6), lty_lines = c(1, 2), lwd_lines = c(1, 3),
cex = c(1, 3), main = "Classification")
The morph_mature
function use the logit approach (frequentist or bayesian). The size at 50% maturity (\(L_{50}\)) was estimated as the length at which a randomly chosen specimen has a 50% chance of being mature (Somerton 1980, Roa et al. 1999).
In the regression analysis, \(X\) (e.g: carapace width) is considered the explanatory variable and the classification of maturity \(CS\) (juveniles: 0, adults: 1) is considered the response variable (binomial). The variables are fitted to a logit function with the form:
\[P_{CS} = \frac{1}{1+e^{-(\hat{\beta}_{0} + \hat{\beta}_{1}*X)}}\]
where \(P_{CS}\) is the probability of an individual of being mature at a determinate \(X\) length. \(\hat{\beta}_{0}\) (intercept) and \(\hat{\beta}_{1}\) (slope) are parameters estimated. The \(L_{50}\) is calculated as:
\[L_{50} = -\frac{\hat{\beta}_{0}}{\hat{\beta}_{1}}\]
The morph_mature
function requires an object of class “classify” with the X, Y (allometric variables) and classification of maturity (juveniles = 0, adults = 1).
The argument method
requires a character string indicating which regression will be used for the test. If method = "fq"
the logit regression is based on GLM (frequentist) and if method = "bayes"
a sample from the posterior distribution of a logit regression model using a random walk Metropolis algorithm is generated (see MCMClogit function).
The argument niter
requires a number. For the GLM regression (method = "fq"
), a non-parametric bootstrap method consists in generate B bootstrap samples, by resampling with replacement the original data. Then all statistics for each parameter can be calculated from each bootstrap sample (median and confidence intervals). For the method = "bayes"
, the argument niter
is related to the number of Metropolis iterations for the sampler.
The output is an object of class “morphMat”. This object contains a dataframe with the allometric variables X - Y and the classification of maturity. Also the fitted values for the logit regression and confidence intervals (95%). If you print the object, the median of the size at morphometric maturity estimation (\(L_{50}\)) and parameters are shown.
#Frequentist regression
my_ogive_fq = morph_mature(classify_data, method = "fq", niter = 1000)
print(my_ogive_fq)
## formula: Y = 1/1+exp-(A + B*X)
## Original Bootstrap (Median)
## A -20.753 -20.8586
## B 0.1748 0.1757
## L50 118.7237 118.6616
## R2 - 0.7111
#Bayesian regression
my_ogive_bayes = morph_mature(classify_data, method = "bayes", niter = 1000)
print(my_ogive_bayes)
## formula: Y = 1/1+exp-(A + B*X)
## Bootstrap (Median)
## A -20.6964
## B 0.1744
## L50 118.4079
## R2 0.7111
For plotting the maturity ogive, an object of class “morphMat” is required. The function plot
generates 4 graphics: 1), 2) and 3) are histograms for the A, B parameters and the size at morphometric maturity (\(L_{50}\)), the last is the maturity ogive.
par(mfrow = c(2,2))
plot(my_ogive_fq, xlab = "Carapace width (mm.)", ylab = "Proportion mature", col = c("blue", "red"))
## Size at morphometric maturity = 118.7
## Confidence intervals = 116.2 - 121.1
## Rsquare = 0.71
par(mfrow = c(2,2))
plot(my_ogive_bayes, xlab = "Carapace width (mm.)", ylab = "Proportion mature", col = c("blue", "red"))
## Size at morphometric maturity = 118.4
## Confidence intervals = 115.9 - 120.8
## Rsquare = 0.71
If you want the maturity ogive plot only, you have to add the param onlyOgive = TRUE
. Besides you can modify the axis (size, rotation, etc) and add the legend.
plot(my_ogive_fq, xlab = "Carapace width (mm.)", ylab = "Proportion mature", col = c("blue", "red"), onlyOgive = TRUE)
## Size at morphometric maturity = 118.7
## Confidence intervals = 116.2 - 121.1
## Rsquare = 0.71
This methodology has been used mainly in the estimation of morphological sexual maturity in crabs, but it can be extended to other taxas as Agostinho (2000) reported.
For estimating gonadal maturity use the matFish database. This database contains two variables:
total_length
: Total length in cm.
stage_mat
: The gonadal maturation stages: I, II, III, IV, where I is considered immature.
data(matFish)
head(matFish)
## total_length stage_mat
## 1 12 I
## 2 12 I
## 3 13 I
## 4 14 I
## 5 14 I
## 6 14 I
The function to be used to estimate gonadal maturity is gonad_mature
. This function use the logistic approach.
The gonad_mature
function requires a data.frame with allometric variables (e.g: total length, fork length, carapace width, etc) and a variable containing the stages of sexual maturity (gonadal maturation stages).
The argument varNames
requires a character string indicating the name of one allometric and the stage of sexual maturity variable to be used for analysis (e.g varNames = c("total_length", "stage_mat")
). So the argument varNames
must contain two character strings only, the first is the allometric variable and the second is the stage of sexual maturity.
The arguments inmName
and matName
require a character string indicating the name of the stages of sexual maturity in the data.frame. The argument could contain one character string or could be a vector (e.g inmName = "I"
, matName = c("II", "III", "IV")
). The variable stage_mat in the matFish database, contains the stages of the sexual maturity. In this case, stage I is considered immature and II, III, IV are mature. Then the stages of sexual maturity are transformed in a binomial variable where immature = 0 and mature = 1.
The argument method
requires a character string indicating which regression will be used for the test. If method = "fq"
the logit regression is based on GLM (frequentist) and if method = "bayes"
a sample from the posterior distribution of a logit regression model using a random walk Metropolis algorithm is generated (see MCMClogit function).
The argument niter
requires a number. For the GLM regression (method = "fq"
), a non-parametric bootstrap method consists in generate B bootstrap samples, by resampling with replacement the original data. Then all statistics for each parameter can be calculated from each bootstrap sample (median and confidence intervals). For the method = "bayes"
, the argument niter
is related to the number of Metropolis iterations for the sampler.
The output is an object of class “gonadMat”. This object contains a dataframe with the allometric variable “X” and stage of sexual maturity (immature = 0, mature = 1). Also the fitted values for the curve logistic regression and confidence intervals (95%). If you print the object, the median of the size at gonad maturity estimation (\(L_{50}\)) and parameters are shown.
#Frequentist regression
my_ogive_fq = gonad_mature(matFish, varNames = c("total_length", "stage_mat"), inmName = "I",
matName = c("II", "III", "IV" ), method = "fq", niter = 999)
print(my_ogive_fq)
## formula: Y = 1/1+exp-(A + B*X)
## Original Bootstrap (Median)
## A -8.6047 -8.6407
## B 0.356 0.3576
## L50 24.1694 24.1714
## R2 0.5595 -
#Bayesian regression
my_ogive_bayes = gonad_mature(matFish, varNames = c("total_length", "stage_mat"), inmName = "I",
matName = c("II", "III", "IV" ), method = "bayes", niter = 999)
print(my_ogive_bayes)
## formula: Y = 1/1+exp-(A + B*X)
## Bootstrap (Median)
## A -8.4974
## B 0.3522
## L50 24.1295
## R2 0.5595
For plotting the maturity ogive the object of class “gonadMat”is required. The function plot
generates 4 graphics: 1), 2) and 3) are histograms for the A, B parameters and the size at gonadal maturity (\(L_{50}\)), the last is the maturity ogive.
par(mfrow = c(2,2))
plot(my_ogive_fq, xlab = "Total length (cm.)", ylab = "Proportion mature", col = c("blue", "red"))
## Size at gonad maturity = 24.2
## Confidence intervals = 23.8 - 24.6
## Rsquare = 0.56
par(mfrow = c(2,2))
plot(my_ogive_bayes, xlab = "Total length (cm.)", ylab = "Proportion mature", col = c("blue", "red"))
## Size at gonad maturity = 24.1
## Confidence intervals = 23.7 - 24.5
## Rsquare = 0.56
If you want the maturity ogive plot only, you have to add the param onlyOgive = TRUE
. Besides you can modify the axis (size, rotation, etc) and add the legend.
plot(my_ogive_fq, xlab = "Total length (cm.)", ylab = "Proportion mature", col = c("blue", "red"), onlyOgive = TRUE)
## Size at gonad maturity = 24.2
## Confidence intervals = 23.8 - 24.6
## Rsquare = 0.56
Agostinho, C. S. (2000). Use of otoliths to estimate size at sexual maturity in fish. Brazilian Archives of Biology and Technology, 43(4).
Corgos, A. and Freire, J. (2006). Morphometric and gonad maturity in the spider crab Maja brachydactyla: a comparison of methods for estimating size at maturity in species with determinate growth. ICES Journal of Marine Science: Journal du Conseil, 63(5), 851-859.
Roa, R., Ernst, B. and Tapia, F. (1999). Estimation of size at sexual maturity: an evaluation of analytical and resampling procedures. Fishery Bulletin, 97(3), 570-580.
Somerton, D. A. (1980). A computer technique for estimating the size of sexual maturity in crabs. Canadian Journal of Fisheries and Aquatic Sciences, 37(10), 1488-1494.