The cubble class is an S3 class built on tibble that allows spatio-temporal data to be wrangled in two forms: a nested/spatial form and a long/temporal form. It consists of two subclasses, identified by the following class vectors:
c("spatial_cubble_df", "cubble_df")
c("temporal_cubble_df", "cubble_df")
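Because the two forms differ only in the first element of the class vector, S3 methods can be written per form, with cubble_df as the shared fallback. The following is a minimal base R sketch of that dispatch mechanism. The generic describe_form() and the two stand-in objects are hypothetical and not part of cubble.

```r
# Hypothetical generic for illustration only; it is not part of cubble
describe_form <- function(x) UseMethod("describe_form")
describe_form.spatial_cubble_df <- function(x) "nested"
describe_form.temporal_cubble_df <- function(x) "long"
# Fallback shared by both subclasses
describe_form.cubble_df <- function(x) "unknown form"

# Empty lists carrying the two class vectors, standing in for real cubbles
nested_like <- structure(list(), class = c("spatial_cubble_df", "cubble_df"))
long_like <- structure(list(), class = c("temporal_cubble_df", "cubble_df"))

describe_form(nested_like)
describe_form(long_like)

# Both stand-ins still inherit from the common parent class
inherits(nested_like, "cubble_df")
inherits(long_like, "cubble_df")
```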
In a nested cubble, spatial variables are organised as columns and
temporal variables are nested within a specialised ts column:
cb_nested
#> # cubble: key: id [3], index: date, nested form
#> # spatial: [144.83, -37.98, 145.1, -37.67], Missing CRS!
#> # temporal: date [date], prcp [dbl], tmax [dbl], tmin [dbl]
#> id long lat elev name wmo_id ts
#> <chr> <dbl> <dbl> <dbl> <chr> <dbl> <list>
#> 1 ASN00086038 145. -37.7 78.4 essendon airport 95866 <tibble [10 × 4]>
#> 2 ASN00086077 145. -38.0 12.1 moorabbin airport 94870 <tibble [10 × 4]>
#> 3 ASN00086282 145. -37.7 113. melbourne airport 94866 <tibble [10 × 4]>
class(cb_nested)
#> [1] "spatial_cubble_df" "cubble_df" "tbl_df"
#> [4] "tbl" "data.frame"
This toy dataset is a subset of the larger climate_aus data, sourced
from the Global Historical Climatology Network Daily (GHCND). It records
three airport stations located in Melbourne, Australia, and includes the
spatial variables station ID, longitude, latitude, elevation, station
name, and World Meteorological Organization ID. The dataset contains
temporal variables including precipitation and maximum and minimum
temperature, which can be read from the cubble header.
In a long cubble, the temporal variables are expanded into the long form, while the spatial variables are stored as a data attribute:
cb_long
#> # cubble: key: id [3], index: date, long form
#> # temporal: 2020-01-01 -- 2020-01-10 [1D], no gaps
#> # spatial: long [dbl], lat [dbl], elev [dbl], name [chr], wmo_id [dbl]
#> id date prcp tmax tmin
#> <chr> <date> <dbl> <dbl> <dbl>
#> 1 ASN00086038 2020-01-01 0 26.8 11
#> 2 ASN00086038 2020-01-02 0 26.3 12.2
#> 3 ASN00086038 2020-01-03 0 34.5 12.7
#> 4 ASN00086038 2020-01-04 0 29.3 18.8
#> 5 ASN00086038 2020-01-05 18 16.1 12.5
#> 6 ASN00086038 2020-01-06 104 17.5 11.1
#> 7 ASN00086038 2020-01-07 14 20.7 12.1
#> 8 ASN00086038 2020-01-08 0 26.4 16.4
#> 9 ASN00086038 2020-01-09 0 33.1 17.4
#> 10 ASN00086038 2020-01-10 0 34 19.6
#> # ℹ 20 more rows
class(cb_long)
#> [1] "temporal_cubble_df" "cubble_df" "tbl_df"
#> [4] "tbl" "data.frame"
The cubble header now shows the recorded temporal period (2020-01-01 to 2020-01-10), the interval (1 day), and that there are no gaps in the data.
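The two forms can be converted into one another. The sketch below assumes the cubble package is installed, that its bundled climate_mel data is a nested cubble of three Melbourne stations, and that face_temporal() and face_spatial() perform the conversion. Whether cb_nested and cb_long above were created this way is an assumption.

```r
library(cubble)

# Assumption: climate_mel ships with cubble as a nested cubble
nested <- climate_mel

# Nested/spatial form -> long/temporal form
long <- face_temporal(nested)

# Long/temporal form -> nested/spatial form
back <- face_spatial(long)

# The first class element tells the two forms apart
class(nested)[1]
class(long)[1]
class(back)[1]
```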
A cubble object inherits the attributes from tibble (and its
subclasses): class, row.names, and names. Additionally, it has three
specialised attributes:
key: the spatial identifier
index: the temporal identifier
coords: a pair of ordered coordinates associated with the location
Readers familiar with the key and index attributes from the tsibble
package will already know these two attributes. In cubble, the key
attribute identifies the row in the nested cubble, and when combined
with the index attribute, it identifies the row in the long cubble.
Currently, cubble only supports one variable as the key, and the
accepted temporal classes for the index include the base R classes Date,
POSIXlt, and POSIXct, as well as the tsibble classes created by
tsibble::yearmonth(), tsibble::yearweek(), and tsibble::yearquarter().
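The identification rule above can be checked on plain data frames before building a cubble. This is a minimal base R sketch with invented toy data; the station IDs, coordinates, and temperatures are hypothetical, and the checks are not cubble's own validation code.

```r
# Hypothetical toy data: two stations observed on two days
spatial_tbl <- data.frame(
  id = c("A", "B"),
  long = c(145.0, 145.1),
  lat = c(-37.7, -38.0)
)
temporal_tbl <- data.frame(
  id = rep(c("A", "B"), each = 2),
  date = rep(as.Date(c("2020-01-01", "2020-01-02")), times = 2),
  tmax = c(26.8, 26.3, 25.1, 24.9)
)

# In the nested form, the key alone must identify a row
key_is_unique <- anyDuplicated(spatial_tbl$id) == 0

# In the long form, key and index together must identify a row
key_index_is_unique <- anyDuplicated(temporal_tbl[c("id", "date")]) == 0

# The index column should carry one of the accepted classes, here Date
index_is_date <- inherits(temporal_tbl$date, "Date")

c(key_is_unique, key_index_is_unique, index_is_date)
```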
The coords attribute represents an ordered pair of coordinates. It can
be either an unprojected pair of longitude and latitude or a projected
pair of easting and northing. The sf package is used under the hood to
calculate the bounding box, which is displayed in the header of a nested
cubble, and to perform other spatial operations.
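To illustrate how a bounding box follows from an ordered coordinate pair, the sketch below uses sf directly on a plain data frame. The coordinates are invented, the CRS (EPSG:4326) is an assumption, and this is not cubble's internal code.

```r
library(sf)

# Hypothetical station coordinates, not the real station locations
stations <- data.frame(
  id = c("A", "B", "C"),
  long = c(144.8, 145.1, 144.9),
  lat = c(-37.7, -38.0, -37.8)
)

# The order of coords matters: x (longitude) first, then y (latitude).
# EPSG:4326 is assumed here for unprojected longitude/latitude.
stations_sf <- st_as_sf(stations, coords = c("long", "lat"), crs = 4326)

# Bounding box reported as xmin, ymin, xmax, ymax
bbox <- st_bbox(stations_sf)
bbox
```

Swapping the two names in coords would exchange the x and y ranges, which is why the attribute is described as an ordered pair.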
The long cubble has a special attribute called spatial to store the
spatial variables, which includes all the variables from the nested
cubble except for the ts column. Below we print the attribute
information for the previously shown cb_nested and cb_long objects:
attributes(cb_nested)
#> $class
#> [1] "spatial_cubble_df" "cubble_df" "tbl_df"
#> [4] "tbl" "data.frame"
#>
#> $row.names
#> [1] 1 2 3
#>
#> $names
#> [1] "id" "long" "lat" "elev" "name" "wmo_id" "ts"
#>
#> $key
#> # A tibble: 3 × 2
#> id .rows
#> <chr> <list<int>>
#> 1 ASN00086038 [1]
#> 2 ASN00086077 [1]
#> 3 ASN00086282 [1]
#>
#> $index
#> [1] "date"
#>
#> $coords
#> [1] "long" "lat"
attributes(cb_long)
#> $class
#> [1] "temporal_cubble_df" "cubble_df" "tbl_df"
#> [4] "tbl" "data.frame"
#>
#> $row.names
#> [1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
#> [26] 26 27 28 29 30
#>
#> $names
#> [1] "id" "date" "prcp" "tmax" "tmin"
#>
#> $key
#> # A tibble: 3 × 2
#> id .rows
#> <chr> <list<int>>
#> 1 ASN00086038 [10]
#> 2 ASN00086077 [10]
#> 3 ASN00086282 [10]
#>
#> $index
#> [1] "date"
#>
#> $coords
#> [1] "long" "lat"
#>
#> $spatial
#> # A tibble: 3 × 6
#> id long lat elev name wmo_id
#> <chr> <dbl> <dbl> <dbl> <chr> <dbl>
#> 1 ASN00086038 145. -37.7 78.4 essendon airport 95866
#> 2 ASN00086077 145. -38.0 12.1 moorabbin airport 94870
#> 3 ASN00086282 145. -37.7 113. melbourne airport 94866
The following shortcut functions are available to extract components from the attributes:
key_vars(): the name of the key attribute as a string, i.e. "id",
key_data(): the tibble object stored in the key attribute,
key(): the name of the key attribute as a symbol in a list, i.e. [[1]] id,
index(): the index attribute as a symbol, i.e. date,
index_var(): the index attribute as a string, i.e. "date",
coords(): a character vector of length two representing the coordinate pairs, i.e. "long" "lat", and
spatial(): the tibble object for the spatial variables.
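As a brief usage sketch, the shortcuts can be compared with reading the attributes directly. It assumes the cubble package is installed and that its bundled climate_mel data is a nested cubble; only coords() and spatial() are called here, and the remaining accessors are expected to follow the same pattern.

```r
library(cubble)

# Assumption: climate_mel ships with cubble as a nested cubble
long <- face_temporal(climate_mel)

# Direct attribute access, matching the attributes() printout
index_name <- attr(long, "index")
coord_names <- attr(long, "coords")
spatial_tbl <- attr(long, "spatial")

# Shortcut equivalents for two of the attributes
coords(long)
spatial(long)

index_name
coord_names
```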