Working with ICU datasets in R, especially publicly available ones such as
those provided by PhysioNet, is
facilitated by ricu, which provides data access, a level of
abstraction for encoding clinical concepts in a data-source-agnostic way,
and classes and utilities for working with the resulting
time series datasets.
To cite ricu, please use the following:
@article{bennett2023ricu,
  title={ricu: R’s interface to intensive care data},
  author={Bennett, Nicolas and Ple{\v{c}}ko, Drago and Ukor, Ida-Fong and Meinshausen, Nicolai and B{\"u}hlmann, Peter},
  journal={GigaScience},
  volume={12},
  pages={giad041},
  year={2023},
  publisher={Oxford University Press}
}

Currently, installation is only possible directly from GitHub, using
the remotes package if it is installed

remotes::install_github("eth-mds/ricu")

or by sourcing the required installation code from GitHub by running

rem <- source(
  paste0("https://raw.githubusercontent.com/r-lib/remotes/main/",
         "install-github.R")
)
rem$value("eth-mds/ricu")

To make sure that some useful utility packages are installed
as well, consider installing the packages marked as
Suggests by running

remotes::install_github("eth-mds/ricu", dependencies = TRUE)

instead, or by explicitly installing some of the utility packages (relevant for downloading and preprocessing PhysioNet datasets)

install.packages("xml2")

and the demo dataset packages

install.packages(c("mimic.demo", "eicu.demo"),
                 repos = "https://eth-mds.github.io/physionet-demo")

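Before moving on, it can help to check which of the packages mentioned above are actually available. The following is a minimal sketch using only base R; the variable names `pkgs`, `available` and `not_installed` are illustrative and not part of ricu.

```r
# Packages named in the installation notes above
pkgs <- c("ricu", "xml2", "mimic.demo", "eicu.demo")

# requireNamespace() reports TRUE/FALSE without attaching the package
available <- vapply(pkgs, requireNamespace, logical(1L), quietly = TRUE)

print(available)

# List anything that still needs to be installed
not_installed <- names(available)[!available]
if (length(not_installed) > 0L) {
  message("Not installed: ", paste(not_installed, collapse = ", "))
}
```

Any package reported as not installed can then be installed with the corresponding command shown earlier.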
Out of the box (provided the two data packages
mimic.demo and eicu.demo are available),
ricu provides access to the demo datasets corresponding to
the PhysioNet Clinical Databases eICU and MIMIC-III. Tables are
available as

mimic_demo$admissions
#> # <mimic_tbl>: [129 ✖ 19]
#> # ID options:  subject_id (patient) < hadm_id (hadm) < icustay_id (icustay)
#> # Defaults:    `admission_type` (val)
#> # Time vars:   `admittime`, `dischtime`, `deathtime`, `edregtime`, `edouttime`
#>     row_id subject_id hadm_id admittime           dischtime
#>      <int>      <int>   <int> <dttm>              <dttm>
#> 1    12258      10006  142345 2164-10-23 21:09:00 2164-11-01 17:15:00
#> 2    12263      10011  105331 2126-08-14 22:32:00 2126-08-28 18:59:00
#> 3    12265      10013  165520 2125-10-04 23:36:00 2125-10-07 15:13:00
#> 4    12269      10017  199207 2149-05-26 17:19:00 2149-06-03 18:42:00
#> 5    12270      10019  177759 2163-05-14 20:43:00 2163-05-15 12:00:00
#> …
#> 125  41055      44083  198330 2112-05-28 15:45:00 2112-06-07 16:50:00
#> 126  41070      44154  174245 2178-05-14 20:29:00 2178-05-15 09:45:00
#> 127  41087      44212  163189 2123-11-24 14:14:00 2123-12-30 14:31:00
#> 128  41090      44222  192189 2180-07-19 06:55:00 2180-07-20 13:00:00
#> 129  41092      44228  103379 2170-12-15 03:14:00 2170-12-24 18:00:00
#> # ℹ 124 more rows
#> # ℹ 14 more variables: deathtime <dttm>, admission_type <chr>,
#> #   admission_location <chr>, discharge_location <chr>, insurance <chr>,
#> #   language <chr>, religion <chr>, marital_status <chr>, ethnicity <chr>,
#> #   edregtime <dttm>, edouttime <dttm>, diagnosis <chr>,
#> #   hospital_expire_flag <int>, has_chartevents_data <int>

and data can be loaded into an R session, for example using

load_ts("labevents", "mimic_demo", itemid == 50862L,
        cols = c("valuenum", "valueuom"))
#> # A `ts_tbl`: 299 ✖ 4
#> # Id var:     `icustay_id`
#> # Index var:  `charttime` (1 hours)
#>     icustay_id charttime valuenum valueuom
#>          <int> <drtn>       <dbl> <chr>
#> 1       201006   0 hours      2.4 g/dL
#> 2       203766 -18 hours      2   g/dL
#> 3       203766   4 hours      1.7 g/dL
#> 4       204132   7 hours      3.6 g/dL
#> 5       204201   9 hours      2.3 g/dL
#> …
#> 295     298685 130 hours      1.9 g/dL
#> 296     298685 154 hours      2   g/dL
#> 297     298685 203 hours      2   g/dL
#> 298     298685 272 hours      2.2 g/dL
#> 299     298685 299 hours      2.5 g/dL
#> # ℹ 294 more rows

which returns time series data as a ts_tbl object.
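The header of the printed object lists an ID variable, an index variable and a time interval. As a hedged sketch, the same metadata might be queried programmatically as follows. This assumes that ricu and mimic.demo are installed and that the accessors `id_vars()`, `index_var()` and `interval()` are exported by the installed ricu version; the variable name `res` is purely illustrative.

```r
# Only run when ricu and the MIMIC-III demo data are available
if (requireNamespace("ricu", quietly = TRUE) &&
    requireNamespace("mimic.demo", quietly = TRUE)) {

  library(ricu)

  # Same call as shown above, with the result stored for inspection
  res <- load_ts("labevents", "mimic_demo", itemid == 50862L,
                 cols = c("valuenum", "valueuom"))

  # Assumed accessors for the metadata shown in the printed header
  print(id_vars(res))    # ID column(s)
  print(index_var(res))  # time index column
  print(interval(res))   # time step of the series

  # The class vector shows what the object inherits from
  print(class(res))
}
```

If either package is missing, the block is skipped without raising an error.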
This work was supported by grant #2017-110 of the Strategic Focal Area “Personalized Health and Related Technologies (PHRT)” of the ETH Domain for the SPHN/PHRT Driver Project “Personalized Swiss Sepsis Study”.