---
title: "Fitted-Model-Based Annotations :: Cheat Sheet"
subtitle: "'ggpmisc' `r packageVersion('ggpmisc')`"
author: "Pedro J. Aphalo"
date: "`r Sys.Date()`"
output:
  rmarkdown::html_vignette:
    toc: yes
vignette: >
  %\VignetteIndexEntry{Fitted-Model-Based Annotations :: Cheat Sheet}
  %\VignetteEncoding{UTF-8}
  %\VignetteEngine{knitr::rmarkdown}
---

## Basics

**ggpmisc** is based on the **grammar of graphics** implemented in **ggplot2**, the idea that you can build every graph from the same components: a **data** set, a **coordinate system**, and **geoms**, the visual marks that represent data points. If you are not already familiar with this grammar and **ggplot2**, you should visit the [**ggplot2** Cheat Sheet](https://rstudio.github.io/cheatsheets/html/data-visualization.html) first, and afterwards come back to this Cheat Sheet.

Unlike **ggplot2**, **ggpmisc** provides no geometries that use the new stats as their default. The plot layers described here are always added with a _stat_ and, when necessary, their default `geom` argument can be overridden. The default _geoms_ for the statistics described below are from packages **ggplot2** and **ggpp**.

```{r, eval=FALSE}
library(ggpmisc)
```

Most of the layer functions in **ggpmisc** aim to make it easier to add to plots information derived from model fitting, tests of significance and some summaries. All layer functions work as expected with groups and facets.

## Correlation

* `stat_correlation()` computes the parametric Pearson's $r$ or the non-parametric Kendall's $\tau$ or Spearman's $\rho$ correlation coefficient and, optionally, its confidence interval, together with $P$ and $n$, the number of observations, and flexibly adds them to the plot as an annotation.

## Fitted models

The statistics for fitted models come in matched pairs: one that adds a plot layer with one or more curves and confidence band(s), and one that annotates the plot with the fitted model equation and/or other parameter estimates.
These depend on the type of fitted model and include $R^2$, $F$, $P$, $AIC$, $BIC$, and $n$. The curve plotting stats are similar to `ggplot2::stat_smooth()` but the ones for textual annotations have no equivalent in 'ggplot2'.

* `stat_poly_line()` and `stat_poly_eq()` are the pair supporting a broad set of model fit functions: e.g., linear models (OLS, resistant and robust), linear splines, generalised least squares (`gls`), major axis (MA) and standardised major axis (SMA) regression, etc.

* `stat_quant_line()`, `stat_quant_band()` and `stat_quant_eq()` support quantile regression (using 'quantreg').

* `stat_ma_line()` and `stat_ma_eq()` support major axis (MA), standardised major axis (SMA) and ranged major axis (RMA) regression (using 'lmodel2').

* `stat_fit_augment()` works with model fit functions supported by `broom::augment()` methods, including non-linear models.

* `stat_fit_tidy()` works with model fit functions supported by `broom::tidy()` methods, including non-linear models.

* `stat_fit_fitted()` and `stat_fit_deviations()` can be used in combination with the statistics above to highlight, in a scatterplot, the fitted values and their distance to the observations.

* `stat_fit_residuals()` can be used to create consistent plots of residuals for many different model fit functions.

* `stat_distrmix_line()` and `stat_distrmix_eq()` support univariate Normal distribution mixture models.

## ANOVA or summary tables

* `stat_fit_tb()` fits any model supported by a `broom::tidy()` method and adds an ANOVA or summary table to the plot. Which columns are included and how they are named can be set by the user.

## Multiple comparisons

* `stat_multcomp()` fits a model, computes ANOVA and subsequently calls functions from package 'multcomp' to test the significance of Tukey, Dunnett or arbitrary sets of pairwise contrasts, with a choice of the adjustment method for the _P_-values. Significance of differences can be indicated with letters, asterisks or _P_-values.
  Sizes of differences are also computed and available for user-assembled labels.

## Peaks and valleys

* `stat_peaks()` finds and labels peaks (= global or local maxima).

* `stat_valleys()` finds and labels valleys (= global or local minima).

## Volcano and quadrant plots

These plots are frequently used with gene expression data, with each of the many genes labelled according to the ternary outcome of a statistical test. In addition, the data are usually transformed. 'ggpmisc' provides several variations on continuous, colour, fill and shape scales, with defaults set as needed. Scales support log fold-change (`logFC`), false discovery rate (`FDR`), _P_-value (`Pvalue`) and binary or ternary test outcomes (`outcome`).

## Utility functions

Most of the functions used to generate formatted labels in layers and scales are also exported.

------------------------------------------------------------------------

Learn more at [docs.r4photobiology.info/ggpmisc/](https://docs.r4photobiology.info/ggpmisc/).

------------------------------------------------------------------------
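
As a minimal sketch of the matched-pair pattern described under "Fitted models", the chunk below adds a fitted curve with `stat_poly_line()` and the matching annotation with `stat_poly_eq()`. The `mpg` data set from **ggplot2**, the second-degree polynomial, and the choice of labels are assumptions made only for illustration; the same model formula is passed to both stats so that the curve and the equation describe the same fit.

```{r, eval=FALSE}
library(ggplot2)
library(ggpmisc)

# Illustrative model only: second-degree polynomial with raw coefficients,
# so that the equation shown matches the plotted curve term by term.
my_formula <- y ~ poly(x, 2, raw = TRUE)

p <- ggplot(mpg, aes(displ, hwy)) +
  geom_point() +
  # layer 1 of the pair: fitted curve plus confidence band
  stat_poly_line(formula = my_formula) +
  # layer 2 of the pair: equation, R^2 and n as a text annotation
  stat_poly_eq(formula = my_formula,
               mapping = use_label("eq", "R2", "n"))

p
```

Other pairs, such as `stat_quant_line()` with `stat_quant_eq()` or `stat_ma_line()` with `stat_ma_eq()`, are expected to combine in the same way, each with its own supported set of labels.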