Hydropeaking has a major environmental impact on running-water ecosystems, and many affected rivers have a poor ecological status. In rivers affected by hydropeaking, the flow conditions are highly complex and difficult to grasp. The implemented event-based algorithm detects flow fluctuations corresponding to increase events (IC) and decrease events (DC). For each event, a set of parameters related to the fluctuation intensity is calculated: maximum flow fluctuation rate (MAFR), mean flow fluctuation rate (MEFR), amplitude (AMP), flow ratio (RATIO), and duration (DUR).
Greimel et al. (2016) introduced a framework for detecting and characterising sub-daily flow fluctuations. By analysing more than 500 Austrian hydrographs, covering the whole range from unimpacted to heavily impacted rivers, different fluctuation types could be identified according to the potential source: e.g., sub-daily flow fluctuations caused by hydropeaking, rainfall or snow and glacier melt. The hydropeak package enables detecting flow fluctuation events in a given time series by computing differences between consecutive time steps and calculating flow fluctuation parameters.
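The idea of detecting events from differences between consecutive time steps can be illustrated with a few lines of base R. This is only a minimal sketch under stated assumptions, not the package's implementation: the flow values are made up, and the names q, trend, runs and events are hypothetical.

```r
# Hypothetical flow series (m^3/s) at a fixed time step; not package data.
q <- c(0.753, 0.753, 0.753, 0.752, 0.752, 0.760, 0.790, 0.780)

# Trend between consecutive time steps:
# -1 = decreasing, 0 = constant, 1 = increasing.
trend <- sign(diff(q))

# Collapse consecutive steps with equal trend into one event each.
runs <- rle(trend)
events <- data.frame(
  trend = runs$values,
  dur   = runs$lengths,                         # number of time steps
  start = cumsum(c(1, head(runs$lengths, -1)))  # index in q where the event starts
)
events
```

Each row of events is one run of equal trend, so an increase that spans several time steps is reported as a single event rather than as several single-step changes.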
To detect flow fluctuation events, hydropeak needs an input data frame that contains at least a column with the ID of the gauging station, a column with date-time values, and a column with flow rates (Q). To use the functions of hydropeak properly, the input data frame has to be converted to an S3 class called flow. This happens by default in the main function get_events(), where the column indices of these three variables can be passed. By default this is cols = c(1, 2, 3) for ID, Time, Q. Converting the data frame to a flow object ensures that a standardised date-time format and valid data types are used.
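To make the standardisation step more concrete, the following base R sketch parses date-time strings of the form used in the example data into POSIXct values. It is an illustration only and does not reproduce what flow() does internally: the format string, the time zone, and the names raw and std are assumptions made for this example.

```r
# Hypothetical raw input with date-times as day.month.year strings.
raw <- data.frame(
  ID   = c(200000, 200000),
  Time = c("01.01.2021 00:30", "01.01.2021 00:45"),
  Q    = c(0.753, 0.752)
)

# Assumed format and time zone; the real data may need different settings.
std <- data.frame(
  ID   = raw$ID,
  Time = as.POSIXct(raw$Time, format = "%d.%m.%Y %H:%M", tz = "UTC"),
  Q    = as.numeric(raw$Q)
)

# Time step between the two records, in minutes.
difftime(std$Time[2], std$Time[1], units = "mins")
```

Once the time column is a proper date-time type, the time step can be checked directly instead of being inferred from character strings.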
Q is an example input dataset with 3 variables and 960 flow measurements (Q in \(m^{3}/s\)) from two different gauging stations. One time step is 15 minutes, which corresponds to high-resolution data. The dataset is documented in ?Q. We will use these data to demonstrate the detection of increase (IC) and decrease (DC) events and the computation of the metrics from Greimel et al. (2016).
dim(Q)
#> [1] 960 3
head(Q)
#> ID Time Q
#> 1 200000 01.01.2021 00:00 0.753
#> 2 200000 01.01.2021 00:15 0.753
#> 3 200000 01.01.2021 00:30 0.753
#> 4 200000 01.01.2021 00:45 0.752
#> 5 200000 01.01.2021 01:00 0.752
#> 6 200000 01.01.2021 01:15 0.752
To verify the results, we use the Events dataset, which also shows the output format of the main function get_events(). Events is the output of an ORACLE® database from the Institute of Hydrobiology and Aquatic Ecosystem Management, BOKU University, Vienna, Austria. It contains 165 IC and DC events and 8 variables and is documented in ?Events.
dim(Events)
#> [1] 296 8
head(Events)
#> ID EVENT_TYPE Time AMP MAFR MEFR DUR RATIO
#> 1 200000 0 2021-01-01 00:00:00 0.000 0.000 0.000 2 1.000000
#> 2 200000 4 2021-01-01 00:30:00 0.001 0.001 0.001 1 1.001330
#> 3 200000 1 2021-01-01 00:45:00 0.000 0.000 0.000 3 1.000000
#> 4 200000 5 2021-01-01 01:30:00 NA NA NA 5 NA
#> 5 200000 0 2021-01-01 02:45:00 0.000 0.000 0.000 1 1.000000
#> 6 200000 4 2021-01-01 03:00:00 0.001 0.001 0.001 1 1.001332
get_events()
get_events() is the main function and processes an input dataset such as Q as follows:
flow() converts Q to a flow object which is formatted to be compatible with the functions in hydropeak.
change_points() computes change points of the flow fluctuation where the flow is increasing (IC) or decreasing (DC). Optionally, constant events or NA events can be included.
all_metrics() calculates, for each event determined by change_points(), all metrics according to Greimel et al. (2016).
result <- get_events(Q, omit.constant = FALSE, omit.na = FALSE)
head(result)
#> ID EVENT_TYPE Time AMP MAFR MEFR DUR RATIO
#> 1 200000 0 2021-01-01 00:00:00 0.000 0.000 0.000 2 1.000000
#> 2 200000 4 2021-01-01 00:30:00 0.001 0.001 0.001 1 1.001330
#> 3 200000 1 2021-01-01 00:45:00 0.000 0.000 0.000 3 1.000000
#> 4 200000 5 2021-01-01 01:30:00 NA NA NA 5 NA
#> 5 200000 0 2021-01-01 02:45:00 0.000 0.000 0.000 1 1.000000
#> 6 200000 4 2021-01-01 03:00:00 0.001 0.001 0.001 1 1.001332
all.equal(Events, result)
#> [1] TRUE
get_events_file() and get_events_dir()
With get_events_file(), a file path can be provided as an argument. The function reads the file from that path and calls get_events(). It returns the computed events by default; this can be disabled by setting the argument return to FALSE. Optionally, all events can be written together to a single file or, if the argument split is set to TRUE, to a separate file for each gauging station ID and event type. If no output directory is provided, the files are written to tempdir(). The naming scheme of the output file is ID_event_type_date-from_date_to.csv.
Q_file <- system.file("extdata", "Q.csv", package = "hydropeak")
outdir <- file.path(tempdir(), "Events1")
events <- get_events_file(Q_file, inputsep = ",", inputdec = ".",
                          save = TRUE, split = TRUE, return = TRUE,
                          outdir = outdir)
head(events)
#> ID EVENT_TYPE Time AMP MAFR MEFR DUR RATIO
#> 1 200000 4 2021-01-01 00:30:00 0.001 0.001 0.001 1 1.001330
#> 2 200000 4 2021-01-01 03:00:00 0.001 0.001 0.001 1 1.001332
#> 3 200000 4 2021-01-01 05:30:00 0.001 0.001 0.001 1 1.001333
#> 4 200000 4 2021-01-01 07:45:00 0.001 0.001 0.001 1 1.001335
#> 5 200000 4 2021-01-01 10:15:00 0.001 0.001 0.001 1 1.001337
#> 6 200000 4 2021-01-01 12:45:00 0.001 0.001 0.001 1 1.001339
get_events_dir() reads input files from a directory and calls get_events_file() for each file in the provided directory. The resulting events are split into separate files for each gauging station ID and event type and are written to the given output directory. If no output directory is provided, the files are written to tempdir(). The function does not return anything. The naming scheme of the output files is ID_event_type_date-from_date_to.csv.
Q_dir <- system.file("extdata", package = "hydropeak")
outdir <- file.path(tempdir(), "Events2")
get_events_dir(Q_dir, inputsep = ",", inputdec = ".", outdir = outdir)
#> [[1]]
#> NULL
#>
#> [[2]]
#> NULL
list.files(outdir)
#> [1] "200000_4_2020-12-31_2021-01-05.csv" "210000_2_2021-01-01_2021-01-05.csv"
#> [3] "210000_4_2020-12-31_2021-01-05.csv"
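Because the output file names encode the gauging station ID, the event type and the date range, they can be turned back into a small index table. The following base R sketch does this for the three file names listed above. It assumes that IDs contain no underscores; the names files, parts, m and index are hypothetical and not part of hydropeak.

```r
# File names copied from the directory listing above.
files <- c("200000_4_2020-12-31_2021-01-05.csv",
           "210000_2_2021-01-01_2021-01-05.csv",
           "210000_4_2020-12-31_2021-01-05.csv")

# Drop the extension and split on underscores (assumes no "_" inside the ID).
parts <- strsplit(sub("\\.csv$", "", files), "_", fixed = TRUE)
m <- do.call(rbind, parts)

index <- data.frame(
  file       = files,
  ID         = m[, 1],
  event_type = as.integer(m[, 2]),
  from       = as.Date(m[, 3]),
  to         = as.Date(m[, 4])
)
index
```

Such an index makes it easy to select, for example, all files of one event type before reading them back in.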
The implemented metrics can be used individually. All of these functions take a single event as their first argument, either increasing or decreasing. To use individual metrics, the event data frame first has to be converted using flow().
Q_event <- Q[3:4, ]
# decreasing event by 0.001 m^3/s within 15 minutes
Q_event
#> ID Time Q
#> 3 200000 01.01.2021 00:30 0.753
#> 4 200000 01.01.2021 00:45 0.752
Using get_events() for this DC event results in:
get_events(Q_event)
#> ID EVENT_TYPE Time AMP MAFR MEFR DUR RATIO
#> 1 200000 4 2021-01-01 00:30:00 0.001 0.001 0.001 1 1.00133
When using the functions separately, the data set first has to be converted with flow():
Q_event <- flow(Q_event)
Q_event
#> ID Time Q
#> 1 200000 2021-01-01 00:30:00 0.753
#> 2 200000 2021-01-01 00:45:00 0.752
The amplitude (AMP, unit: \(m^3/s\)) of an event is defined as the difference between the flow maximum and the flow minimum:
amp(Q_event)
#> [1] 0.001
The maximum flow fluctuation rate (MAFR, unit: \(m^3/s\)) represents the highest absolute flow change between two consecutive time steps within an event.
mafr(Q_event)
#> [1] 0.001
The mean flow fluctuation rate (MEFR, unit: \(m^3/s^2\)) is calculated as the event amplitude divided by the number of time steps (duration) within an event.
mefr(Q_event)
#> [1] 0.001
The duration of an event is specified as the number of consecutive time steps with equal flow trend.
dur(Q_event)
#> [1] 1
The metric flow ratio (RATIO) is defined as the flow maximum divided by the flow minimum.
ratio(Q_event, event_type = event_type(Q_event))
#> [1] 1.00133
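In the single-step event above, AMP, MAFR and MEFR all coincide. The following base R sketch applies the definitions given in this section to a hypothetical increasing event with three time steps, where the metrics differ. It is a re-implementation for illustration only and does not call the package functions; the flow values and all names ending in _sketch are made up.

```r
# Hypothetical increasing event: four flow values (m^3/s), i.e. three time steps.
q <- c(1.0, 1.5, 2.5, 3.0)

steps <- diff(q)                          # change between consecutive time steps

amp_sketch   <- max(q) - min(q)           # amplitude: flow maximum minus flow minimum
mafr_sketch  <- max(abs(steps))           # largest absolute single-step change
dur_sketch   <- length(steps)             # number of time steps in the event
mefr_sketch  <- amp_sketch / dur_sketch   # amplitude divided by duration
ratio_sketch <- max(q) / min(q)           # flow maximum divided by flow minimum

c(AMP = amp_sketch, MAFR = mafr_sketch, MEFR = mefr_sketch,
  DUR = dur_sketch, RATIO = ratio_sketch)
```

Here the largest single step (MAFR) exceeds the mean step (MEFR) because the flow rises unevenly within the event.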