--- title: "Using FLORAL with phyloseq data" output: rmarkdown::html_vignette: md_extensions: [ "-autolink_bare_uris" ] vignette: > %\VignetteIndexEntry{Using FLORAL with phyloseq data} %\VignetteEncoding{UTF-8} %\VignetteEngine{knitr::rmarkdown} editor_options: chunk_output_type: console --- ```{r setup, include=FALSE} knitr::opts_chunk$set(echo = TRUE) library(tidyverse) library(phyloseq) ``` [Phyloseq](http://joey711.github.io/phyloseq/) is a popular package for working with microbiome data. Here we show how to use the `phy_to_floral_data` helper function to convert phyloseq data into a format accepted by FLORAL. The following code downloads data described in [this paper](https://www.nature.com/articles/s41597-021-00860-8) and turns it into a phyloseq object. The tax_glom step here takes some time, and can be replaced with [`speedyseq::tax_glom`](https://github.com/mikemc/speedyseq) for better performace. ```{r} #this file has duplicate rows, and has multiple rows per pool samples <- read.csv("https://figshare.com/ndownloader/files/33076496") %>% distinct() %>% select(-Pool, -Run, -ShotgunBatchID) %>% distinct() samples <- samples[1:100,] # Using the first 100 samples only. counts <- read.csv("https://figshare.com/ndownloader/files/26393788") counts <- counts %>% filter(SampleID %in% samples$SampleID) taxonomy <- read.csv("https://figshare.com/ndownloader/files/26770997") phy <- phyloseq( sample_data(samples %>% column_to_rownames("SampleID")), tax_table(taxonomy %>% select(ASV, Kingdom:Genus) %>% column_to_rownames("ASV") %>% as.matrix()), otu_table(counts %>% pivot_wider(names_from = "SampleID", values_from = "Count", values_fill = 0) %>% column_to_rownames("ASV") %>% as.matrix(), taxa_are_rows = TRUE) ) %>% subset_samples(DayRelativeToNearestHCT > -30 & DayRelativeToNearestHCT < 0) %>% tax_glom("Genus") ``` Next, we convert that phyloseq object into a list of results to be used by FLORAL; we have to specify the main outcome of interest as `y`, and any metadata columns (from `sample_data(phy)`) to use as covariates. Note that the analysis described here is just an example for using the function; this ```{r} dat <- FLORAL::phy_to_floral_data(phy, covariates=c("Consistency"), y = "DayRelativeToNearestHCT") ``` The resulting list has named entities for the main arguments to FLORAL: ```{r} res <- FLORAL::FLORAL(y = dat$y, x = dat$xcount, ncov = dat$ncov, family = "gaussian", ncv=NULL, progress=FALSE) ```