---
title: "Penalized Semiparametric Bayesian Cox Models"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteIndexEntry{Penalized Semiparametric Bayesian Cox Models}
  \usepackage[utf8]{inputenc}
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE, eval = FALSE)
options(rmarkdown.html_vignette.check_title = FALSE)
```

This is a C++ speed-up and extended version of the R package [psbcGroup](https://CRAN.R-project.org/package=psbcGroup).
It implements the Bayesian Lasso Cox model ([Lee et al., 2011](https://doi.org/10.2202/1557-4679.1301)) and the Bayesian Lasso Cox model with mandatory variables ([Zucknick et al., 2015](https://doi.org/10.1002/bimj.201400160)).
Bayesian Lasso Cox models with other shrinkage and group priors ([Lee et al., 2015](https://doi.org/10.1002/sam.11266)) are to be implemented later.

## Installation

Install the latest released version from [CRAN](https://CRAN.R-project.org/package=psbcSpeedUp):

```{r install1, eval = FALSE}
install.packages("psbcSpeedUp")
```

Install the latest development version from GitHub:

```{r install2, eval = FALSE}
# install.packages("remotes")
remotes::install_github("ocbe-uio/psbcSpeedUp")
```

## Examples

### Run a Bayesian Lasso Cox with mandatory variables

The data set `exampleData` consists of six components: survival times `t`, event status `di`, covariates `x`, number of genomics variables `p`, number of clinical variables `q` and true effects of covariates `beta_true`.
See `?exampleData` for more information about the data.

To run a Bayesian Lasso Cox model with variable selection among the first $p$ genomics variables and mandatory inclusion of the $q$ clinical variables, specify the arguments `p = p` and `q = q` of the main function `psbcSpeedUp()`.
If the arguments `p` and `q` are unspecified, the Bayesian Lasso Cox model does variable selection for all covariates, i.e., by default `p = ncol(survObj$x)` and `q = 0`.
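
As a rough sketch of what the split into `p` and `q` means, the hazard for subject $i$ can be written in standard Cox notation. The symbols $h_0$, $\beta_j$ and $x_{ij}$ are notation introduced here for illustration, and the formula assumes that the columns of `x` hold the $p$ genomics variables first, followed by the $q$ clinical variables:

$$
h(t \mid x_i) = h_0(t) \exp\left( \sum_{j=1}^{p} x_{ij} \beta_j + \sum_{j=p+1}^{p+q} x_{ij} \beta_j \right)
$$

Under this reading, the Lasso prior acts only on the coefficients in the first sum, while the covariates in the second sum are always kept in the model. With the defaults `p = ncol(survObj$x)` and `q = 0`, the second sum is empty.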
```{r, results='hide', warning=FALSE}
# Load the example dataset
data("exampleData", package = "psbcSpeedUp")
p <- exampleData$p
q <- exampleData$q
survObj <- exampleData[1:3]

# Set hyperparameters (see help file for specifying more hyperparameters)
mypriorPara <- list(
  "eta0" = 0.02, "kappa0" = 1, "c0" = 2, "r" = 10 / 9, "delta" = 1e-05,
  "lambdaSq" = 1, "sigmaSq" = runif(1, 0.1, 10),
  "beta.prop.var" = 1, "beta.clin.var" = 1
)

# run Bayesian Lasso Cox
library("psbcSpeedUp")
set.seed(123)
fitBayesCox <- psbcSpeedUp(survObj,
  p = p, q = q, hyperpar = mypriorPara,
  nIter = 1000, burnin = 500, outFilePath = "/tmp"
)
```

### Plot posterior estimates of regression coefficients

The function `psbcSpeedUp::plot()` shows the posterior means and 95% credible intervals of the regression coefficients.

```{r, fig.width=5, fig.height=8}
plot(fitBayesCox)
```

```{r eval=FALSE, echo=FALSE}
png("estimate_beta.png", bg = "transparent", width = 700, height = 900, res = 200)
plot(fitBayesCox)
dev.off()
```

### Plot time-dependent Brier scores

The function `psbcSpeedUp::plotBrier()` shows the time-dependent Brier scores based on the posterior mean of the coefficients or on Bayesian model averaging.

```{r, fig.width=6, fig.height=5}
plotBrier(fitBayesCox, times = 80)
```

```{r eval=FALSE, echo=FALSE}
png("estimate_brier.png", bg = "transparent", width = 1000, height = 700, res = 200)
plotBrier(fitBayesCox, times = 80)
dev.off()
```

```
##     Null.model Bayesian.Cox
## IBS  0.2089742   0.08276646
```

### Predict survival probabilities and cumulative hazards

The function `psbcSpeedUp::predict()` estimates the survival probabilities and cumulative hazards.
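
For orientation, the two predicted quantities are linked by the standard survival-analysis identity. The symbols $S$ and $H$ are notation introduced here; this is the textbook relation, not a description of how the package computes its predictions internally:

$$
S(t \mid x_i) = \exp\{-H(t \mid x_i)\}
$$

Here $H(t \mid x_i)$ is the cumulative hazard and $S(t \mid x_i)$ the survival probability for observation $i$ at time $t$. For instance, a cumulative hazard of $0.300$ corresponds to a survival probability of $\exp(-0.300) \approx 0.741$.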
```{r}
predict(fitBayesCox, type = c("cumhazard", "survival"))
```

```
##        observation   times cumhazard  survival
##
##     1:           1   0.264  8.39e-06  1.00e+00
##     2:           2   0.264  4.09e-05  1.00e+00
##     3:           3   0.264  4.84e-05  1.00e+00
##     4:           4   0.264  2.44e-05  1.00e+00
##     5:           5   0.264  7.22e-05  1.00e+00
##    ---
## 39996:         196 107.641  2.18e+00  1.14e-01
## 39997:         197 107.641  6.28e-01  5.34e-01
## 39998:         198 107.641  4.95e+01  3.15e-22
## 39999:         199 107.641  4.17e+02 9.71e-182
## 40000:         200 107.641  3.00e-01  7.41e-01
```

## References

> Kyu Ha Lee, Sounak Chakraborty, Jianguo Sun (2011).
> Bayesian variable selection in semiparametric proportional hazards model for high dimensional survival data.
> _The International Journal of Biostatistics_, 7:1. DOI: [10.2202/1557-4679.1301](https://doi.org/10.2202/1557-4679.1301).

> Kyu Ha Lee, Sounak Chakraborty, Jianguo Sun (2015).
> Survival prediction and variable selection with simultaneous shrinkage and grouping priors.
> _Statistical Analysis and Data Mining_, 8:114-127. DOI: [10.1002/sam.11266](https://doi.org/10.1002/sam.11266).

> Manuela Zucknick, Maral Saadati, Axel Benner (2015).
> Nonidentical twins: Comparison of frequentist and Bayesian lasso for Cox models.
> _Biometrical Journal_, 57:959-981. DOI: [10.1002/bimj.201400160](https://doi.org/10.1002/bimj.201400160).