---
title: "Introduction to Quantile-on-Quantile Regression"
author: "Dr. Merwan Roudane"
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Introduction to Quantile-on-Quantile Regression}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.width = 7,
  fig.height = 5
)
```

## Overview

The **QuantileOnQuantile** package implements the Quantile-on-Quantile (QQ) regression methodology developed by Sim and Zhou (2015). This approach estimates the effect that quantiles of one variable have on quantiles of another, capturing the dependence between their distributions.

### Why Quantile-on-Quantile Regression?

Traditional regression methods like OLS estimate the effect of independent variables on the *conditional mean* of the dependent variable. Quantile regression extends this by estimating effects on *conditional quantiles*. However, both approaches treat the independent variable as a single entity, ignoring the possibility that the relationship may vary depending on whether the independent variable takes extreme or moderate values.

The QQ approach addresses this limitation by:

1. Modeling the **quantile of Y** (the dependent variable) as the outcome
2. Examining the effect of different **quantiles of X** (the independent variable)
3. Producing coefficients indexed by both $\theta$ (quantile of Y) and $\tau$ (quantile of X)

This allows researchers to ask questions like:

- Do large positive shocks in X affect Y differently than large negative shocks?
- Does the relationship between X and Y depend on market conditions (bull vs bear)?
- Is the dependence between variables stronger in the tails of their distributions?

### The Original Application: Oil Prices and Stock Returns

Sim and Zhou (2015) applied this methodology to examine the relationship between oil price shocks and US stock returns. They found that:

1. Large negative oil price shocks (low $\tau$) can positively affect US equities when the stock market is performing well (high $\theta$)
2. Positive oil price shocks have weak effects regardless of market conditions
3. The relationship is asymmetric and depends on both the nature of oil shocks and market state

## Installation

```{r, eval=FALSE}
# Install from CRAN (when available)
install.packages("QuantileOnQuantile")

# Install from GitHub
# install.packages("devtools")
devtools::install_github("merwanroudane/qq")
```

## Quick Start

### Basic Usage

```{r basic_example}
library(QuantileOnQuantile)

# Generate example data
set.seed(42)
n <- 300
x <- rnorm(n)
y <- 0.5 * x + 0.3 * x * (x < 0) + rnorm(n, sd = 0.5)  # Asymmetric relationship

# Run QQ regression
result <- qq_regression(y, x, verbose = FALSE)

# Print summary
print(result)
```

### Summary Statistics

```{r statistics}
# Get detailed summary
summary(result)

# Get statistics as data frame
stats <- qq_statistics(result)
print(stats)
```

## Visualization

The package provides several interactive visualization options using plotly.

### 3D Surface Plot

The 3D surface plot is the signature visualization of the QQ approach, showing how coefficients vary across both dimensions.

```{r 3d_plot, eval=FALSE}
# Coefficient surface with MATLAB-style Jet colorscale
plot_qq_3d(result, type = "coefficient", colorscale = "Jet")

# R-squared surface with Viridis colorscale
plot_qq_3d(result, type = "rsquared", colorscale = "Viridis")

# P-value surface
plot_qq_3d(result, type = "pvalue", colorscale = "Plasma")
```

### Available Color Scales

The package supports several color scales:

```{r colorscales}
qq_colorscales()
```

- **Jet**: MATLAB-style rainbow (blue -> cyan -> green -> yellow -> red)
- **BlueRed**: Diverging scale, useful for coefficients centered around zero
- **Viridis**: Perceptually uniform, colorblind-friendly
- **Plasma**: High contrast, perceptually uniform

### Heatmaps

Heatmaps provide a 2D view of the results:

```{r heatmap, eval=FALSE}
# Coefficient heatmap
plot_qq_heatmap(result, type = "coefficient", colorscale = "Viridis")

# R-squared heatmap
plot_qq_heatmap(result, type = "rsquared", colorscale = "Plasma")

# P-value heatmap
plot_qq_heatmap(result, type = "pvalue", colorscale = "Jet")
```

### Contour Plot

Contour plots show level curves of the coefficient surface:

```{r contour, eval=FALSE}
plot_qq_contour(result, colorscale = "Jet", show_labels = TRUE)
```

### Quantile Correlation

The correlation heatmap shows the relationship between quantiles of both variables:

```{r correlation, eval=FALSE}
plot_qq_correlation(y, x, quantiles = seq(0.1, 0.9, by = 0.1))
```

## Detailed Example: Simulating Asymmetric Relationships

Let's create data that mimics the oil-stock relationship from the original paper:

```{r detailed_example}
set.seed(2015)
n <- 500

# Generate "oil shocks"
oil_shock <- rnorm(n)

# Thresholds separating large negative, moderate, and large positive shocks
# (computed once instead of on every loop iteration)
q_low <- quantile(oil_shock, 0.3)
q_high <- quantile(oil_shock, 0.7)

# Generate "stock returns" with asymmetric response
stock_return <- numeric(n)
for (i in seq_len(n)) {
  # Base return
  base_return <- 0.01

  # Asymmetric effect
  if (oil_shock[i] < q_low) {
    # Large negative oil shocks have positive effect
    effect <- -0.02 * oil_shock[i]
  } else if (oil_shock[i] > q_high) {
    # Large positive oil shocks have weak effect
    effect <- -0.005 * oil_shock[i]
  } else {
    # Moderate shocks have little effect
    effect <- -0.001 * oil_shock[i]
  }

  stock_return[i] <- base_return + effect + rnorm(1, sd = 0.04)
}

# Run QQ regression on an explicit quantile grid (step 0.1)
result_oil <- qq_regression(
  y = stock_return,
  x = oil_shock,
  y_quantiles = seq(0.1, 0.9, by = 0.1),
  x_quantiles = seq(0.1, 0.9, by = 0.1),
  verbose = FALSE
)

# Print summary
print(result_oil)
```

## Working with Results

### Extracting Results

The results are stored in a data frame:

```{r extract_results}
# Access raw results
head(result_oil$results)

# Convert to matrix format
coef_matrix <- qq_to_matrix(result_oil, type = "coefficient")
print(round(coef_matrix, 4))
```

### Exporting Results

```{r export, eval=FALSE}
# Export to CSV
qq_export(result_oil, file.path(tempdir(), "qq_results.csv"))
```

## Customizing Quantile Grids

You can customize the quantile grid for more or less granularity:

```{r custom_grid}
# Coarse grid (faster computation)
result_coarse <- qq_regression(
  y = stock_return,
  x = oil_shock,
  y_quantiles = seq(0.2, 0.8, by = 0.2),
  x_quantiles = seq(0.2, 0.8, by = 0.2),
  verbose = FALSE
)

# Fine grid (more detail, slower)
result_fine <- qq_regression(
  y = stock_return,
  x = oil_shock,
  y_quantiles = seq(0.05, 0.95, by = 0.05),
  x_quantiles = seq(0.05, 0.95, by = 0.05),
  verbose = FALSE
)

cat("Coarse grid combinations:", nrow(result_coarse$results), "\n")
cat("Fine grid combinations:", nrow(result_fine$results), "\n")
```

## Methodology Details

### The QQ Model

The QQ approach is based on the following model:

$$r_t^\theta = \beta^\theta(Oil_t) + \alpha^\theta r_{t-1} + v_t^\theta$$

where $r_t^\theta$ is the $\theta$-quantile of the return and $\beta^\theta(\cdot)$ is an unknown function.

Taking a first-order Taylor expansion of $\beta^\theta(\cdot)$ around the $\tau$-quantile of oil shocks ($Oil^\tau$):

$$\beta^\theta(Oil_t) \approx \beta_0(\theta, \tau) + \beta_1(\theta, \tau)(Oil_t - Oil^\tau)$$

The key insight is that $\beta_0(\theta, \tau)$ and $\beta_1(\theta, \tau)$ are *doubly indexed* by $\theta$ and $\tau$, capturing the dependence between both distributions.

### Estimation

The estimation proceeds by:

1. For each $\tau$ (quantile of X): subset the data to observations where `X <= quantile(X, tau)`
2. For each $\theta$ (quantile of Y): fit a quantile regression at level $\theta$ on that subset
3. Extract the coefficients and compute the pseudo R-squared for each $(\theta, \tau)$ pair

The pseudo R-squared is computed as:

$$R^2 = 1 - \frac{\sum \rho_\theta(y - \hat{y})}{\sum \rho_\theta(y - Q_\theta(y))}$$

where $\rho_\theta(u) = u(\theta - I(u < 0))$ is the check function, $\hat{y}$ is the fitted conditional quantile, and $Q_\theta(y)$ is the unconditional $\theta$-quantile of $y$.

## Comparison with Standard Methods

### OLS vs Quantile Regression vs QQ

| Method | What it estimates | Captures heterogeneity in... |
|--------|------------------|------------------------------|
| OLS | $E[Y \mid X]$ | None (constant effect) |
| Quantile Regression | $Q_\theta[Y \mid X]$ | Y distribution |
| QQ Regression | $Q_\theta[Y \mid X_\tau]$ | Both Y and X distributions |

### When to Use QQ Regression

Use QQ regression when you suspect that:

1. The effect of X on Y varies across the distribution of Y (e.g., bull vs bear markets)
2. The effect differs for large vs small values of X (e.g., large vs small shocks)
3. There is asymmetry (e.g., positive vs negative shocks have different effects)
4. You want to understand the complete dependence structure between two variables

## References

Sim, N. and Zhou, H. (2015). Oil Prices, US Stock Return, and the Dependence Between Their Quantiles. *Journal of Banking & Finance*, 55, 1-12. doi:10.1016/j.jbankfin.2015.01.013

Koenker, R. (2005). *Quantile Regression*. Cambridge University Press.

Koenker, R. and Xiao, Z. (2006). Quantile Autoregression. *Journal of the American Statistical Association*, 101, 980-990.

## Session Info

```{r session_info}
sessionInfo()
```
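
## Appendix: A Sketch of the Estimation Steps

The three steps listed under Estimation, together with the pseudo R-squared formula, can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the package's implementation: `check_loss`, `fit_quantile_line`, and `pseudo_r2` are hypothetical helpers introduced here, and a Nelder-Mead search via `optim()` stands in for a dedicated quantile regression solver such as `quantreg::rq()`.

```{r estimation_sketch}
# Check function: rho_theta(u) = u * (theta - I(u < 0))
check_loss <- function(u, theta) u * (theta - (u < 0))

# Hypothetical helper: fit intercept and slope by minimising the summed
# check loss directly. Nelder-Mead is only a rough stand-in for a proper
# linear-programming solver.
fit_quantile_line <- function(y, x, theta) {
  objective <- function(b) sum(check_loss(y - b[1] - b[2] * x, theta))
  start <- coef(lm(y ~ x))  # OLS estimates as starting values
  optim(start, objective, method = "Nelder-Mead")$par
}

# Pseudo R-squared: 1 - (loss of fitted model) / (loss of the
# unconditional theta-quantile of y)
pseudo_r2 <- function(y, fitted, theta) {
  1 - sum(check_loss(y - fitted, theta)) /
    sum(check_loss(y - quantile(y, theta), theta))
}

# Simulated data (illustrative only)
set.seed(1)
x <- rnorm(400)
y <- 0.5 * x + rnorm(400, sd = 0.5)

# One row per (theta, tau) combination
grid <- expand.grid(theta = c(0.25, 0.5, 0.75), tau = c(0.25, 0.5, 0.75))
grid$coefficient <- NA_real_
grid$r_squared <- NA_real_

for (k in seq_len(nrow(grid))) {
  # Step 1: subset on the tau-quantile of X
  keep <- x <= quantile(x, grid$tau[k])
  # Step 2: quantile regression at level theta on the subset
  b <- fit_quantile_line(y[keep], x[keep], grid$theta[k])
  # Step 3: store the slope and the pseudo R-squared
  fitted <- b[1] + b[2] * x[keep]
  grid$coefficient[k] <- b[2]
  grid$r_squared[k] <- pseudo_r2(y[keep], fitted, grid$theta[k])
}

print(grid)
```

Each row of `grid` plays the role of one $(\theta, \tau)$ cell of the coefficient surface. Because the optimiser here is approximate, the numbers should not be expected to match those returned by `qq_regression()`.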