This vignette showcases a use case for the two main functions of metajam, download_d1_data and read_d1_files, to download one dataset from the DataOne data repository.
There are two parameters required to run the download_d1_data function in metajam. One is the data url for the dataset you’d like to download; the other is the path to the local folder where the files will be saved. You can retrieve the data url by navigating to the data package of interest, right-clicking on the download data button, and selecting Copy Link Address.
For several DataOne member nodes (Arctic Data Center, Environmental Data Initiative, and The Knowledge Network for Biocomplexity), metajam users can retrieve the data url from either the ‘home’ site of the member node or from the DataOne instance of that same data package. For example, if you wanted to download this dataset:
Kelsey J. Solomon, Rebecca J. Bixby, and Catherine M. Pringle. 2021. Diatom Community Data from Coweeta LTER, 2005-2019. Environmental Data Initiative. https://doi.org/10.6073/pasta/25e97f1eb9a8ed2aba8e12388f8dc3dc.
You have two options for where to obtain the data url.
1. You could navigate to this page on the Environmental Data Initiative site (https://doi.org/10.6073/pasta/25e97f1eb9a8ed2aba8e12388f8dc3dc) and right-click on the CWT_Hemlock_Diatom_Data.csv link to retrieve this data url: https://portal.edirepository.org/nis/dataviewer?packageid=edi.858.1&entityid=15ad768241d2eeed9f0ba159c2ab8fd5
2. You could find this data package on the DataOne site (https://search.dataone.org/view/https%3A%2F%2Fpasta.lternet.edu%2Fpackage%2Fmetadata%2Feml%2Fedi%2F858%2F1) and right-click the Download button next to CWT_Hemlock_Diatom_Data.csv to retrieve this data url: https://cn.dataone.org/cn/v2/resolve/https%3A%2F%2Fpasta.lternet.edu%2Fpackage%2Fdata%2Feml%2Fedi%2F858%2F1%2F15ad768241d2eeed9f0ba159c2ab8fd5
Both will work with metajam! You will get the same output either way.
We have not tested metajam’s compatibility with the home sites of all DataOne member nodes. If you are using metajam to download data from a member node other than ADC, EDI, or KNB, we highly recommend retrieving the data url from the DataOne instance of the package (example 2 above).
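To see how the two data urls above relate to each other, the short sketch below decodes the percent-encoded DataOne url and checks that both urls point to the same entity identifier. It uses only base R and the urls quoted above; the variable names are ours and are not part of metajam.

```r
# The two data urls quoted above for CWT_Hemlock_Diatom_Data.csv
edi_url <- "https://portal.edirepository.org/nis/dataviewer?packageid=edi.858.1&entityid=15ad768241d2eeed9f0ba159c2ab8fd5"
dataone_url <- "https://cn.dataone.org/cn/v2/resolve/https%3A%2F%2Fpasta.lternet.edu%2Fpackage%2Fdata%2Feml%2Fedi%2F858%2F1%2F15ad768241d2eeed9f0ba159c2ab8fd5"

# Decode the percent-encoded part of the DataOne url to make it readable
decoded_url <- utils::URLdecode(dataone_url)
print(decoded_url)

# Both urls end with the same entity identifier
entity_id <- "15ad768241d2eeed9f0ba159c2ab8fd5"
endsWith(edi_url, entity_id)
endsWith(decoded_url, entity_id)
```

Either url can then be stored in a variable such as data_url and passed to download_d1_data.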
We include two examples, one downloading a dataset with metadata in EML (Ecological Metadata Language) format and the other downloading a dataset with metadata in ISO (International Organization for Standardization) format.
For the first example, we are using Diatom Community Data from Coweeta LTER, 2005-2019: Kelsey J. Solomon, Rebecca J. Bixby, and Catherine M. Pringle. Environmental Data Initiative. https://pasta.lternet.edu/package/metadata/eml/edi/858/1.
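# Directory to save the dataset (name taken from the text below)
path_folder <- "Data_coweeta"
# Data url of CWT_Hemlock_Diatom_Data.csv, copied from the DataOne site as described above
data_url <- "https://cn.dataone.org/cn/v2/resolve/https%3A%2F%2Fpasta.lternet.edu%2Fpackage%2Fdata%2Feml%2Fedi%2F858%2F1%2F15ad768241d2eeed9f0ba159c2ab8fd5"
# Create the local directory to download the datasets
dir.create(path_folder, showWarnings = FALSE)
# Download the dataset and associated metadata
data_folder <- metajam::download_d1_data(data_url, path_folder)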
At this point, you should have the data and the metadata downloaded inside your main directory, Data_coweeta in this example.
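metajam organizes the files as follows:
the data: my_data.csv
the full metadata, named with the suffix __full_metadata.xml: my_data__full_metadata.xml
the summary metadata, named with the suffix __summary_metadata.csv: my_data__summary_metadata.csv
the attribute metadata, named with the suffix __attribute_metadata.csv: my_data__attribute_metadata.csv
the attribute factor metadata, named with the suffix __attribute_factor_metadata.csv: my_data__attribute_factor_metadata.csv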
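The naming convention above can be sketched in a few lines of base R. The helper expected_metajam_files is hypothetical (it is not part of metajam); it only rebuilds the file names listed above from a data file name, which can be handy for checking the content of a download folder.

```r
# Hypothetical helper (not part of metajam): list the file names expected
# in the download folder for a given data file, using the suffixes above
expected_metajam_files <- function(data_file) {
  # File name without its directory and without its extension
  stem <- tools::file_path_sans_ext(basename(data_file))
  suffixes <- c("__full_metadata.xml",
                "__summary_metadata.csv",
                "__attribute_metadata.csv",
                "__attribute_factor_metadata.csv")
  c(basename(data_file), paste0(stem, suffixes))
}

expected_metajam_files("my_data.csv")
```

The result can be compared with list.files() on the download folder; not every metadata file is necessarily present for every dataset.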
After reading the downloaded folder with read_d1_files, you have in your R environment one named list object that contains the data (coweeta_diatom$data), the general (summary) metadata (coweeta_diatom$summary_metadata), such as title, creators, dates, and locations, and the attribute-level metadata (coweeta_diatom$attribute_metadata), which gives more information about your attributes, such as units and definitions.
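The download and loading steps described above can be sketched as a small wrapper. download_and_read is a hypothetical helper name, not a metajam function; it only chains the two metajam calls this vignette is about, and it assumes that metajam is installed and that a network connection is available when it is called.

```r
# Hypothetical wrapper (not part of metajam): download one dataset and
# read it, with its metadata, into a named list
download_and_read <- function(data_url, path_folder) {
  # Create the local directory if it does not exist yet
  dir.create(path_folder, showWarnings = FALSE)
  # Download the data and metadata; the path of the new folder is returned
  data_folder <- metajam::download_d1_data(data_url, path_folder)
  # Read the data and the metadata files found in that folder
  metajam::read_d1_files(data_folder)
}
```

For the first example, coweeta_diatom <- download_and_read(data_url, "Data_coweeta") would give the named list whose elements are described above, and names(coweeta_diatom) lists them.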
For the second example, we are using Marine bird survey observation and density data from Northern Gulf of Alaska LTER cruises, 2018. Kathy Kuletz, Daniel Cushing, and Elizabeth Labunski. Research Workspace. https://doi.org/10.24431/rw1k45w
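# Directory to save the dataset (name taken from the text below)
path_folder <- "Data_alaska"
# data_url: the data url of the file to download, retrieved as described earlier
# Create the local directory to download the datasets
dir.create(path_folder, showWarnings = FALSE)
# Download the dataset and associated metadata
data_folder <- metajam::download_d1_data(data_url, path_folder)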
At this point, you should have the data and the metadata downloaded inside your main directory, Data_alaska in this example.
metajam organizes the files as follows:
the data: my_data.csv
the full metadata, named with the suffix __full_metadata.xml: my_data__full_metadata.xml
the summary metadata, named with the suffix __summary_metadata.csv: my_data__summary_metadata.csv
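To check which metadata files a download folder actually contains, something like the following base R sketch may help. It does not call metajam: demo_folder is a stand-in temporary directory filled with empty files named as in the list above, an assumption made only so that the example runs offline.

```r
# Stand-in for a metajam download folder: a temporary directory holding
# empty files with the names listed above (for illustration only)
demo_folder <- file.path(tempdir(), "Data_alaska_demo")
dir.create(demo_folder, showWarnings = FALSE)
file.create(file.path(demo_folder, c("my_data.csv",
                                     "my_data__full_metadata.xml",
                                     "my_data__summary_metadata.csv")))

# Check, suffix by suffix, which metadata files are present in the folder
suffixes <- c("__full_metadata.xml",
              "__summary_metadata.csv",
              "__attribute_metadata.csv",
              "__attribute_factor_metadata.csv")
files <- list.files(demo_folder)
present <- vapply(suffixes, function(s) any(endsWith(files, s)), logical(1))
present
```

With a real download, replace demo_folder with the folder returned by download_d1_data.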
After reading the downloaded folder with read_d1_files, you have in your R environment one named list object that contains the data ($data) and the general (summary) metadata ($summary_metadata), such as title, creators, dates, and locations. When attribute-level metadata is available, it is stored in $attribute_metadata and gives more information about your attributes, such as units and definitions.