2. Creation and coercion

This article demonstrates how to create a cubble from various types of data: separate spatial and temporal tables, and foreign objects (tibble, NetCDF, stars, and sftime).

Create from separate spatial and temporal tables

In many cases, spatio-temporal data arrive in separate tables. For example, in climate data, analysts may initially receive station data containing geographic location information, recorded variables and their recording periods. They can then query the temporal variables using the stations of interest to obtain the relevant temporal data. Alternatively, an analysis may begin as purely spatial or purely temporal, and analysts may later obtain additional temporal or spatial data to extend it to a spatio-temporal analysis.

The function make_cubble() composes a cubble object from a spatial table (spatial) and a temporal table (temporal), along with the three attributes key, index, and coords introduced in the vignette 1. The cubble class. The following code creates a nested cubble:

make_cubble(spatial = stations, temporal = meteo,
            key = id, index = date, coords = c(long, lat))
#> # cubble:   key: id [3], index: date, nested form
#> # spatial:  [144.83, -37.98, 145.1, -37.67], Missing CRS!
#> # temporal: date [date], prcp [dbl], tmax [dbl], tmin [dbl]
#>   id           long   lat  elev name              wmo_id ts               
#>   <chr>       <dbl> <dbl> <dbl> <chr>              <dbl> <list>           
#> 1 ASN00086038  145. -37.7  78.4 essendon airport   95866 <tibble [10 × 4]>
#> 2 ASN00086077  145. -38.0  12.1 moorabbin airport  94870 <tibble [10 × 4]>
#> 3 ASN00086282  145. -37.7 113.  melbourne airport  94866 <tibble [10 × 4]>
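To make the expected input shapes concrete, here is a minimal sketch with two hand-built tables. The names toy_stations, toy_meteo and toy, and all values in them, are hypothetical and not part of the package data. The sketch assumes only that both tables share the key column (id) and that the temporal table holds one row per key and date.

```r
library(cubble)

# Hypothetical spatial table: one row per station (made-up coordinates)
toy_stations <- tibble::tibble(
  id   = c("A", "B"),
  long = c(144.9, 145.1),
  lat  = c(-37.7, -38.0)
)

# Hypothetical temporal table: one row per station and date (made-up values)
toy_meteo <- tibble::tibble(
  id   = rep(c("A", "B"), each = 3),
  date = rep(as.Date("2020-01-01") + 0:2, times = 2),
  tmax = c(25.1, 27.3, 30.2, 24.0, 26.5, 29.8)
)

# Compose the nested cubble; id links the two tables
toy <- make_cubble(
  spatial = toy_stations, temporal = toy_meteo,
  key = id, index = date, coords = c(long, lat)
)
```

Each row of toy should then correspond to one station, with that station's time series stored in the ts list-column.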

The coords argument can be safely omitted if the spatial data is an sf object (e.g. stations_sf). Similarly, if the temporal object is a tsibble (e.g. meteo_ts), you don’t need to specify the key and index arguments. The class attributes from sf and tsibble will be carried over to the nested and long cubble:

(res <- make_cubble(spatial = stations_sf, temporal = meteo_ts))
#> # cubble:   key: id [3], index: date, nested form, [sf]
#> # spatial:  [144.83, -37.98, 145.1, -37.67], WGS 84
#> # temporal: date [date], prcp [dbl], tmax [dbl], tmin [dbl]
#>   id           elev name   wmo_id  long   lat            geometry ts      
#>   <chr>       <dbl> <chr>   <dbl> <dbl> <dbl>         <POINT [°]> <list>  
#> 1 ASN00086038  78.4 essen…  95866  145. -37.7 (144.9066 -37.7276) <tbl_ts>
#> 2 ASN00086077  12.1 moora…  94870  145. -38.0   (145.0964 -37.98) <tbl_ts>
#> 3 ASN00086282 113.  melbo…  94866  145. -37.7 (144.8321 -37.6655) <tbl_ts>
class(res)
#> [1] "spatial_cubble_df" "cubble_df"         "sf"               
#> [4] "tbl_df"            "tbl"               "data.frame"
class(res$ts[[1]])
#> [1] "tbl_ts"     "tbl_df"     "tbl"        "data.frame"
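If you start from plain tibbles, sf and tsibble inputs can be prepared with the standard constructors of those packages. The sketch below is one possible way to do it, not necessarily how stations_sf and meteo_ts were built. The names my_stations_sf, my_meteo_ts and res2 are hypothetical, and the CRS is assumed to be WGS 84 (EPSG:4326), matching the printed header above.

```r
library(cubble)

# Assumption: coordinates are WGS 84 (EPSG:4326); remove = FALSE keeps the
# long and lat columns alongside the new geometry column
my_stations_sf <- sf::st_as_sf(
  cubble::stations, coords = c("long", "lat"), crs = 4326, remove = FALSE
)

# Declare the key and index once, on the tsibble itself
my_meteo_ts <- tsibble::as_tsibble(cubble::meteo, key = id, index = date)

# key, index and coords are then taken from the sf and tsibble objects
res2 <- make_cubble(spatial = my_stations_sf, temporal = my_meteo_ts)
```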

The vignette 3. Compatibility with tsibble and sf covers cubble’s compatibility with tsibble and sf in more detail.

Coerce from foreign objects

The tibble objects

The dataset climate_flat combines the spatial data, stations, with the temporal data, meteo, into a single tibble object. It can be coerced into a cubble using:

climate_flat |> as_cubble(key = id, index = date, coords = c(long, lat))
#> # cubble:   key: id [3], index: date, nested form
#> # spatial:  [144.83, -37.98, 145.1, -37.67], Missing CRS!
#> # temporal: date [date], prcp [dbl], tmax [dbl], tmin [dbl]
#>   id           long   lat  elev name              wmo_id ts               
#>   <chr>       <dbl> <dbl> <dbl> <chr>              <dbl> <list>           
#> 1 ASN00086038  145. -37.7  78.4 essendon airport   95866 <tibble [10 × 4]>
#> 2 ASN00086077  145. -38.0  12.1 moorabbin airport  94870 <tibble [10 × 4]>
#> 3 ASN00086282  145. -37.7 113.  melbourne airport  94866 <tibble [10 × 4]>

The NetCDF data

In R, there are several packages available for wrangling NetCDF data, including ncdf4, RNetCDF, and tidync. The code below converts a NetCDF object of class ncdf4 into a cubble object:

path <- system.file("ncdf/era5-pressure.nc", package = "cubble")
raw <- ncdf4::nc_open(path)
as_cubble(raw)
#> # cubble:   key: id [26565], index: time, nested form
#> # spatial:  [113, -53, 153, -12], Missing CRS!
#> # temporal: time [date], q [dbl], z [dbl]
#>       id  long   lat ts              
#>    <int> <dbl> <dbl> <list>          
#>  1     1  113    -12 <tibble [6 × 3]>
#>  2     2  113.   -12 <tibble [6 × 3]>
#>  3     3  114.   -12 <tibble [6 × 3]>
#>  4     4  114.   -12 <tibble [6 × 3]>
#>  5     5  114    -12 <tibble [6 × 3]>
#>  6     6  114.   -12 <tibble [6 × 3]>
#>  7     7  114.   -12 <tibble [6 × 3]>
#>  8     8  115.   -12 <tibble [6 × 3]>
#>  9     9  115    -12 <tibble [6 × 3]>
#> 10    10  115.   -12 <tibble [6 × 3]>
#> # ℹ 26,555 more rows

Sometimes, analysts may choose to read only a subset of the NetCDF data. In such cases, the vars argument selects the variables to read, and the long_range and lat_range arguments, each written as seq(from, to, by), select the grid points to keep and hence the grid resolution:

as_cubble(raw, vars = "q",
          long_range = seq(-180, 180, 1), lat_range = seq(-90, 90, 1))
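Before choosing values for vars, long_range and lat_range, it can help to check which variables and coordinate ranges the file contains. The sketch below inspects the ncdf4 object directly. It relies on the var and dim elements of the object returned by ncdf4::nc_open(), and the names nc, var_names and dim_ranges are hypothetical.

```r
path <- system.file("ncdf/era5-pressure.nc", package = "cubble")
nc <- ncdf4::nc_open(path)

# Names of the variables stored in the file (candidates for `vars`)
var_names <- names(nc$var)

# Minimum and maximum of each dimension, e.g. to pick sensible ranges
dim_ranges <- lapply(nc$dim, function(d) range(d$vals))

# Release the file handle once the metadata has been read
ncdf4::nc_close(nc)
```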

The stars objects

tif <- system.file("tif/L7_ETMs.tif", package = "stars")
x <- stars::read_stars(tif)
as_cubble(x, index = band)
#> # cubble:   key: id [122848], index: band, nested form
#> # spatial:  [288790.5, 9110743, 298708.5, 9120746.5], SIRGAS 2000 / UTM zone
#> #   25S
#> # temporal: band [int], L7_ETMs.tif [dbl]
#>          x        y    id ts              
#>      <dbl>    <dbl> <int> <list>          
#>  1 288791. 9120747.   352 <tibble [6 × 2]>
#>  2 288819. 9120747.   704 <tibble [6 × 2]>
#>  3 288848. 9120747.  1056 <tibble [6 × 2]>
#>  4 288876. 9120747.  1408 <tibble [6 × 2]>
#>  5 288905. 9120747.  1760 <tibble [6 × 2]>
#>  6 288933. 9120747.  2112 <tibble [6 × 2]>
#>  7 288962. 9120747.  2464 <tibble [6 × 2]>
#>  8 288990. 9120747.  2816 <tibble [6 × 2]>
#>  9 289019. 9120747.  3168 <tibble [6 × 2]>
#> 10 289047. 9120747.  3520 <tibble [6 × 2]>
#> # ℹ 122,838 more rows

When the dimensions object is too complex for the cubble package to handle, a warning message will be generated.

The sftime objects

dt <- climate_flat |> 
  sf::st_as_sf(coords = c("long", "lat"), crs = sf::st_crs("OGC:CRS84")) |> 
  sftime::st_as_sftime()
dt |> as_cubble(key = id, index = date)
#> # cubble:   key: id [3], index: date, nested form, [sf]
#> # spatial:  [144.83, -37.98, 145.1, -37.67], WGS 84
#> # temporal: prcp [dbl], tmax [dbl], tmin [dbl], date [date]
#>   id           elev name   wmo_id            geometry  long   lat ts      
#>   <chr>       <dbl> <chr>   <dbl>         <POINT [°]> <dbl> <dbl> <list>  
#> 1 ASN00086038  78.4 essen…  95866 (144.9066 -37.7276)  145. -37.7 <tibble>
#> 2 ASN00086077  12.1 moora…  94870   (145.0964 -37.98)  145. -38.0 <tibble>
#> 3 ASN00086282 113.  melbo…  94866 (144.8321 -37.6655)  145. -37.7 <tibble>