Thank you for your interest in the childfree package! This vignette illustrates how to use this package to access and harmonize demographic data to study childfree individuals. Childfree individuals are people who neither have nor want children (Neal and Neal 2023). The fact that they do not want children makes them different from other types of non-parents, including not-yet-parents who want children in the future, childless individuals who cannot have children, and undecided people who do not know if they want children in the future.
The childfree package can be cited as:
Neal, Z. P. and Neal, J. W. (2024). childfree: An R package to access and harmonize childfree demographic data. Comprehensive R Archive Network. https://cran.r-project.org/package=childfree
For additional resources on the childfree package, please see https://www.zacharyneal.com/childfree.
If you have questions about the childfree package or would like a childfree package hex sticker, please contact the maintainer Zachary Neal by email (zpneal@msu.edu). Please report bugs in the childfree package at https://github.com/zpneal/childfree/issues.
The childfree package can be loaded in the usual way:
library(childfree)
#> N ▇─┬─O childfree v0.0.3
#> O ┆ CITE: Neal, Z. P. and Neal, J. W., (2024). childfree: An R package to access and
#> K ▇─┬─O harmonize childfree demographic data. CRAN. https://doi.org/10.32614/CRAN.package.childfree
#> I ┆ HELP: type vignette("childfree"); email zpneal@msu.edu; github zpneal/childfree
#> D ╳ BETA: type devtools::install_github("zpneal/childfree", ref = "devel")
Upon successful loading, a startup message will display that shows the version number, citation, ways to get help, and ways to contact us.
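The primary use of the childfree package is to obtain demographic data about childfree individuals from publicly available sources. Each section of this vignette describes one of the available data sources, which include the Demographic and Health Surveys (DHS), the National Survey of Family Growth (NSFG), and the State of the State Survey (SOSS).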
Future releases will offer access to additional data sources, and will harmonize data extracted from different sources.
The childfree package uses the theoretical framework and terminology defined by Neal and Neal (2023).
Childfree: A person who does not have children and does not want children, regardless of whether they can have children, is called “childfree”. In contrast, a person who does not have children because they cannot have children, for biological or non-biological reasons, is called “childless”.
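Family Status: The terms “childfree” and “childless” are examples of “family statuses”. The “ABC” framework describes how a person’s family status is defined by the intersection of their attitudes (whether they want children), behaviors (whether they have had children), and circumstances (whether they have experienced barriers to having children).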
A “parent” has had children (behavior = yes). Parents may be “fulfilled” if they have had exactly the number of children they want, “unfulfilled” if they have had fewer children than they want, “reluctant” if they have had more children than they want, or “ambivalent” if they do not know how many children they want(ed).
A non-parent has not had children (behavior = no). However, attitudes and circumstances distinguish different types of non-parents. A “childfree” person does not want children (attitude = no), while a “childless” person wants children (attitude = yes) but experienced barriers (circumstance = yes), and a “not-yet-parent” wants children (attitude = yes) and has not experienced barriers (circumstance = no). Non-parent family statuses are “momentary,” which means they describe a person’s status at the moment. However, a person may transition between non-parent statuses, or from a non-parent status to a parent status, over time.
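The classification rules described above can be sketched in base R. This is only an illustration of the ABC logic on made-up data: the helper `classify_famstat()` and its argument names are hypothetical, it is not the package's own implementation, and it ignores the parent subtypes and undecided respondents.

```r
# Hypothetical helper (not part of the childfree package): assigns a family
# status from three logical vectors following the ABC framework.
classify_famstat <- function(has_children, wants_children, barriers) {
  status <- rep(NA_character_, length(has_children))
  # Behavior = yes: parent (subtypes such as "fulfilled" are not modeled here)
  status[which(has_children)] <- "Parent"
  # Behavior = no, attitude = no: childfree
  status[which(!has_children & !wants_children)] <- "Childfree"
  # Behavior = no, attitude = yes, circumstance = yes: childless
  status[which(!has_children & wants_children & barriers)] <- "Childless"
  # Behavior = no, attitude = yes, circumstance = no: not-yet-parent
  status[which(!has_children & wants_children & !barriers)] <- "Not-yet-parent"
  status
}

# Made-up respondents, one per status
people <- data.frame(
  has_children   = c(TRUE,  FALSE, FALSE, FALSE),
  wants_children = c(TRUE,  FALSE, TRUE,  TRUE),
  barriers       = c(FALSE, FALSE, TRUE,  FALSE)
)
people$famstat <- classify_famstat(people$has_children,
                                   people$wants_children,
                                   people$barriers)
print(people)
```

Because non-parent statuses are momentary, re-running such a classification on later responses from the same person could yield a different status.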
Questions: The childfree package is primarily focused on data concerning childfree individuals. Childfree individuals can be identified in survey data using a “WIDE” range of questions:
Each type of question has advantages and disadvantages, and can yield different results when determining which survey respondents are childfree. The dataframes generated by the childfree package contain one binary variable for each of the “WIDE” questions available in a given dataset (e.g., cf_want, cf_ideal), plus one categorical variable (i.e., famstat) that classifies all respondents into family statuses using all available variables.
The childfree package provides access to data about childfree individuals, and aims to facilitate research on this population. However, care should be taken when analyzing data obtained using the package’s functions:
Operationalization: Although the functions in the childfree package aim to recode each dataset’s variables in a consistent and comparable way, there are subtle differences in how variables were originally operationalized that make perfect harmonization and comparability impossible. Exercise caution when comparing the same recoded variable across datasets.
Universes: Each dataset accessible through the childfree package was collected from a population-representative sample. However, there is variation in the universes from which these samples were drawn, and therefore variation in the populations of which they are representative. For example, while respondents for the SOSS were sampled from all adults in Michigan, the respondents for the NSFG were sampled from US women ages 15-44.
Weights: Generating population estimates from these data generally requires the use of sampling weights, which are included in the dataframes generated by the childfree package. However, use of sampling weights can be complex, particularly when combining samples from multiple waves, locations, or surveys. Exercise caution using the included sampling weights.
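As a minimal sketch of why weights matter, the example below compares an unweighted and a weighted estimate of the share of childfree respondents. The data are made up, and the column name `weight` is an assumption for illustration; the actual name of the weight variable in each dataframe is given in the codebooks. The sketch yields a point estimate only and does not address standard errors or the complexities of combining waves or surveys.

```r
# Made-up data; `weight` is a hypothetical name for the sampling weight column
toy <- data.frame(
  famstat = c("Childfree", "Parent", "Parent", "Childfree", "Not-yet-parent"),
  weight  = c(0.5, 1.5, 1.0, 2.0, 1.0)
)

# Indicator for childfree respondents (1 = childfree, 0 = otherwise)
is_cf <- as.numeric(toy$famstat == "Childfree")

# Unweighted share: every respondent counts equally
unweighted <- mean(is_cf)

# Weighted share: each respondent counts in proportion to their weight
weighted <- weighted.mean(is_cf, w = toy$weight)

print(c(unweighted = unweighted, weighted = weighted))
```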
The following functions provide access to data on childfree individuals:
dhs() - Demographic and Health Surveys
nsfg() - National Survey of Family Growth
soss() - State of the State Survey
The following sections provide more information about these data sources, and illustrate how these functions work. The detailed codebooks for the dataframes generated by these functions are provided in a separate vignette.
The Demographic and Health Surveys (DHS) program has regularly collected health data from population-representative samples in many countries using standardized surveys since 1984. The “individual recode” data files contain women’s responses, while the “men recode” files contain men’s responses. These files are available in SPSS, SAS, and Stata formats from https://www.dhsprogram.com/; however, access requires a free application. Once one or more of these files has been downloaded, the dhs() function imports the data, extracts and recodes selected variables, and returns a ready-to-use dataframe.
Although access to DHS data requires an application, the DHS program provides model datasets (https://dhsprogram.com/data/Download-Model-Datasets.cfm) containing fictitious data that do not require prior application to access. The “ZZIR62FL.SAV” file is a model individual recode dataset in SPSS format, and provides an example of how the dhs() function works. Running
dat <- dhs(file = "ZZIR62FL.SAV", extra.vars = c("v201", "v602", "v613"))
#> Processing DHS data files -
imports the data, extracts and recodes variables, and returns an R dataframe called dat. If you are offline or these data are otherwise unavailable, then dat is NULL.
Inspecting selected variables for a selected observation in dat
if (!is.null(dat)) {t(dat[2368,c(3:9,19:21)])}
#> 2368
#> famstat "Childfree"
#> sex "Female"
#> age "18"
#> education "12"
#> partnered "Never"
#> residence "Urban"
#> employed "0"
#> year "2015"
#> month "August"
#> v201 "0"
we see that it contains the record of an unemployed 18-year-old female who has completed 12 years of education, is currently single, and lives in an urban area. She is classified as childfree because she neither has nor wants children.
Specifying extra.vars = c("v201", "v602", "v613") requests that the function also include these three variables, which are not extracted and recoded by default. The three variables requested in this example are the raw source variables from which dhs() determines each respondent’s family status: v201 contains the respondent’s number of children, v602 contains a code indicating whether the respondent wants children (3 = no), and v613 contains the respondent’s ideal number of children. In practice, the extra.vars option can be useful for retaining other information collected by DHS that is not automatically included.
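To illustrate how raw variables like these could be combined, the sketch below derives two childfree indicators from made-up values. It is not the recoding that dhs() performs: it assumes that an ideal number of 0 in v613 indicates wanting no children, and it treats any v602 value other than 3 as "not no" without interpreting the other codes.

```r
# Made-up raw values (not taken from the DHS model dataset)
raw <- data.frame(
  v201 = c(0, 0, 2, 0),  # number of children
  v602 = c(3, 1, 3, 3),  # wants children; 3 = no (other codes not interpreted)
  v613 = c(0, 2, 2, 1)   # ideal number of children
)

# No children and does not want children
raw$cf_by_want <- raw$v201 == 0 & raw$v602 == 3

# No children and an ideal of zero children (assumed interpretation)
raw$cf_by_ideal <- raw$v201 == 0 & raw$v613 == 0

print(raw)
```

The fourth made-up respondent is identified as childfree by the "want" question but not by the "ideal" question, which is the kind of discrepancy between question types noted earlier.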
The National Survey of Family Growth (NSFG) is conducted by the U.S. Centers for Disease Control, and regularly collects fertility and other health information from a population-representative sample of adults in the United States. Between 1973 and 2002, the NSFG was conducted periodically. Starting in 2002, the NSFG transitioned to continuous data collection, releasing data in three-year waves (e.g., 2013-2015, 2015-2017). The nsfg() function reads raw data directly from the CDC website, extracts and recodes selected variables, and returns a ready-to-use dataframe.
For example, we can obtain data from the NSFG collected between 2017 and 2019 using:
dat <- nsfg(years = 2017)
#> Processing NSFG data files -
#> | | | 0% | |=================================== | 50% | |======================================================================| 100%
which returns the data in a dataframe called dat. If you are offline or these data are otherwise unavailable, then dat is NULL. Inspecting selected variables for a selected observation in dat
if (!is.null(dat)) {t(dat[14,2:12])}
#> 14
#> famstat "Childfree"
#> sex "Female"
#> lgbt "Straight"
#> race "White"
#> hispanic "0"
#> age "26"
#> education "College graduate"
#> partnered "Never"
#> residence "Urban"
#> employed "1"
#> inschool "0"
we see that it contains the record of a 26-year-old non-Hispanic white female. She is an employed college graduate living without a partner in the city, and does not identify as religious. She is classified as childfree because she neither has nor wants children.
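Because the result may be NULL when data are unavailable, downstream code can guard against that case, as the examples above do with !is.null(dat). The following sketch wraps such a guard in a small helper; `summarize_famstat()` is a hypothetical function written for this illustration, and it is demonstrated on a made-up dataframe rather than on downloaded data.

```r
# Hypothetical helper (not part of the childfree package): returns the share
# of respondents in each family status, or NULL if no data are available.
summarize_famstat <- function(dat) {
  if (is.null(dat)) {
    message("No data available; skipping summary.")
    return(NULL)
  }
  round(prop.table(table(dat$famstat)), 2)
}

# Made-up data standing in for a dataframe returned by the package
toy <- data.frame(famstat = c("Childfree", "Parent", "Parent", "Not-yet-parent"))

shares <- summarize_famstat(toy)  # table of proportions
print(shares)

missing_case <- summarize_famstat(NULL)  # prints a message and returns NULL
```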
The State of the State Survey (SOSS) is regularly conducted by the Institute for Public Policy and Social Research (IPPSR) at Michigan State University (MSU). Each wave is collected from a sample of 1000 adults in the US state of Michigan, and includes sampling weights to obtain a sample that is representative of the state’s population with respect to age, gender, race, and education. All waves contain the same basic demographic information, but each wave also includes questions about topics commissioned by MSU faculty and others. The soss() function provides access to the waves that contain questions that allow childfree adults to be identified. It reads raw data directly from the IPPSR website, extracts and recodes selected variables, and returns a ready-to-use dataframe.
For example, we can obtain data from the 84th SOSS wave, which was collected in April 2022, using:
dat <- soss(waves = 84, extra.vars = c("neal1", "neal2", "neal3"))
#> Processing SOSS data files -
#> | | | 0% | |======================================================================| 100%
which returns the data in a dataframe called dat. If you are offline or these data are otherwise unavailable, then dat is NULL. Inspecting selected variables for a selected observation in dat
if (!is.null(dat)) {t(dat[2,c(2:9,12:13,21:24)])}
#> 2
#> famstat "Childfree"
#> sex "Female"
#> lgbt "1"
#> race "White"
#> hispanic "0"
#> age "60"
#> education "College graduate"
#> partnered "Currently"
#> inschool "0"
#> ideology "Closer to the liberal side"
#> year "2022"
#> month "April"
#> neal1 "2"
#> neal2 "2"
we see that it contains the record of a 60-year-old non-Hispanic white female. She is a college graduate living with her partner in the suburbs, and identifies as slightly liberal, but not as religious. She is classified as childfree because she neither has nor wants children.
Specifying extra.vars = c("neal1", "neal2", "neal3") requests that the function also include these three variables, which are not extracted and recoded by default. The three variables requested in this example are the raw source variables from which soss() determines each respondent’s family status: neal1 contains a code indicating whether the respondent has children (2 = no), neal2 contains a code indicating whether the respondent is planning to have children (2 = no), and neal3 contains a code indicating whether the respondent want(ed) to have children (2 = no). In practice, the extra.vars option can be useful for retaining other information collected by IPPSR that is not automatically included.
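As a rough sketch of how these three codes relate to one another, the example below flags made-up respondents who answered "no" (code 2) to all three questions. It is not the rule that soss() applies: the value 1 is used only as a placeholder for any other answer, NA is assumed to mean the question was not asked, and the actual skip patterns and remaining codes are not modeled.

```r
# Made-up responses; only the meaning of code 2 (= no) comes from the text above
resp <- data.frame(
  neal1 = c(2, 2, 1, 2),   # has children?
  neal2 = c(2, 1, NA, 2),  # planning to have children?
  neal3 = c(2, NA, NA, 1)  # want(ed) to have children?
)

no_code <- 2

# %in% treats NA as "not a match", so unanswered questions never count as "no"
resp$all_no <- resp$neal1 %in% no_code &
  resp$neal2 %in% no_code &
  resp$neal3 %in% no_code

print(resp)
n_all_no <- sum(resp$all_no)
```

In this made-up data only the first respondent answers "no" to all three questions; the fourth has no children and no plans for them, but wanted children, so would not be flagged.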