---
title: "Modelling Psychometric Data with Sinh-Arcsinh Distributions"
author: "Wolfgang Lenhard & Alexandra Lenhard"
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Modelling Psychometric Data with Sinh-Arcsinh Distributions}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include = FALSE}
knitr::opts_chunk$set(tidy.opts=list(width.cutoff=60),
  collapse = TRUE,
  comment = "#>",
  eval = TRUE # Disabled due to computational intensity
)
```

# Mathematical Foundation

## The Sinh-Arcsinh Transformation

The sinh-arcsinh (ShaSh) distribution is defined by a transformation of a standard normal variable. If $Y$ follows a standard normal distribution, $Y \sim N(0,1)$, then the ShaSh-distributed variable $X$ is generated by:

$$X = \mu + \sigma \cdot \sinh\left(\frac{\text{arcsinh}(Y) - \epsilon}{\delta}\right)$$

This transformation allows the location, scale, skewness, and tail weight of the resulting variable $X$ to be controlled by four parameters:

- **μ** (mu): location parameter, shifts the distribution horizontally (similar to the mean)
- **σ** (sigma): scale parameter ($\sigma > 0$), controls the spread of the distribution (similar to the standard deviation)
- **ε** (epsilon): skewness parameter ($\epsilon = 0$ for symmetry, $\epsilon > 0$ for right skew, $\epsilon < 0$ for left skew)
- **δ** (delta): tail weight parameter ($\delta = 1$ produces normal-like tails, $\delta > 1$ produces heavier tails, $\delta < 1$ produces lighter tails)

The probability density function is:

$$f(x|\mu,\sigma,\epsilon,\delta) = \frac{\delta}{\sigma\sqrt{2\pi}} \cdot \frac{\cosh(\delta \cdot \text{arcsinh}(z) + \epsilon)}{\sqrt{1 + z^2}} \cdot \exp\left(-\frac{1}{2}[\sinh(\delta \cdot \text{arcsinh}(z) + \epsilon)]^2\right)$$

where $z = \frac{x - \mu}{\sigma}$.

The cumulative distribution function (CDF) has no closed-form expression in elementary functions but can be computed numerically.
For a given value $x$, the CDF is:

$$F(x|\mu,\sigma,\epsilon,\delta) = P(X \leq x) = \Phi[\sinh(\delta \cdot \text{arcsinh}(z) + \epsilon)]$$

where $\Phi$ is the standard normal CDF and $z = (x - \mu)/\sigma$.

The quantile function (inverse CDF) can be expressed as:

$$Q(p|\mu,\sigma,\epsilon,\delta) = \mu + \sigma \cdot \sinh\left(\frac{\text{arcsinh}(\Phi^{-1}(p)) - \epsilon}{\delta}\right)$$

## Modeling the Sinh-Arcsinh over Age

In cNORM's ShaSh implementation, the location, scale, and skewness parameters are modeled as polynomial functions of standardized age. By default, the tail weight parameter δ is held constant across ages. It can be adjusted to reflect population characteristics, e.g., by setting `delta` > 1 for heterogeneous samples or `delta` < 1 for homogeneous samples. If the `delta_degree` parameter is set, δ is modeled as a polynomial across age as well. In that case, it is advisable to keep `delta_degree` low, for example 1 or 2, to avoid overfitting.

Age is standardized as $\text{age}_{std} = \frac{\text{age} - \overline{\text{age}}}{\text{SD}(\text{age})}$ for numerical stability during optimization. Parameters are estimated using maximum likelihood estimation.

# Advantages and Applications

## Key Advantages

**1. Distributional Flexibility**: The ShaSh distribution can model symmetric and asymmetric distributions with varying tail weights, making it suitable for diverse psychometric applications.

**2. Continuous Scores**: Unlike discrete distributions (e.g., beta-binomial), ShaSh naturally handles continuous scores, decimal values, and unbounded measures.

**3. Zero and Negative Values**: ShaSh can accommodate scores of zero and negative values without transformation, unlike Box-Cox approaches that require strictly positive data.

**4. Independent Parameter Control**: Skewness (ε) and tail weight (δ) can be controlled independently, allowing precise modeling of distributional characteristics.
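
As a cross-check on the formulas in the Mathematical Foundation section, the density, CDF, and quantile function can be written in a few lines of base R. This is a minimal sketch that follows the equations exactly as stated in this vignette. The function names `shash_pdf`, `shash_cdf`, and `shash_quantile` are hypothetical and not part of cNORM, whose internal parameterization may differ.

```r
# Hypothetical helpers (not cNORM functions), transcribed from the formulas above
shash_cdf <- function(x, mu = 0, sigma = 1, epsilon = 0, delta = 1) {
  z <- (x - mu) / sigma
  pnorm(sinh(delta * asinh(z) + epsilon))
}

shash_pdf <- function(x, mu = 0, sigma = 1, epsilon = 0, delta = 1) {
  z <- (x - mu) / sigma
  w <- delta * asinh(z) + epsilon
  delta / (sigma * sqrt(2 * pi)) * cosh(w) / sqrt(1 + z^2) * exp(-0.5 * sinh(w)^2)
}

shash_quantile <- function(p, mu = 0, sigma = 1, epsilon = 0, delta = 1) {
  mu + sigma * sinh((asinh(qnorm(p)) - epsilon) / delta)
}

# Round trip: the CDF evaluated at a quantile should return the probability
p <- c(0.05, 0.25, 0.5, 0.75, 0.95)
q <- shash_quantile(p, mu = 100, sigma = 15, epsilon = 0.5, delta = 1.2)
round_trip_ok <- isTRUE(all.equal(
  shash_cdf(q, mu = 100, sigma = 15, epsilon = 0.5, delta = 1.2), p
))

# The density should integrate to approximately 1 between two extreme quantiles
bounds <- shash_quantile(c(1e-12, 1 - 1e-12),
                         mu = 100, sigma = 15, epsilon = 0.5, delta = 1.2)
total_mass <- integrate(shash_pdf, bounds[1], bounds[2],
                        mu = 100, sigma = 15, epsilon = 0.5, delta = 1.2)$value
```

Both checks only confirm that the three formulas are mutually consistent; they say nothing about how well a ShaSh model fits a given data set.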

## Optimal Applications

The ShaSh distribution is particularly well-suited for:

- **Continuous performance measures** (reaction times, achievement scores with decimal precision)
- **Tests with floor or ceiling effects** where skewness varies across age groups
- **Heterogeneous populations** with varying degrees of individual differences
- **Change or difference scores** that may include negative values
- **Adaptive tests** or measures with variable stopping rules
- **Large-scale assessments** where distributional assumptions are complex
- **Developmental studies** where distribution shape changes systematically with age

# Prerequisites and Considerations

## Data Requirements

**Systematic Development**: Test scores should exhibit systematic (though not necessarily monotonic) development across the predictor variable. The ShaSh model can capture complex non-linear relationships through polynomial terms.

**Sufficient Sample Size**: We recommend a minimum of 100 cases per major age group, with larger samples needed for higher polynomial degrees or complex developmental patterns. While empirical validation of this rule is ongoing, these sample sizes have proven effective in previous norming studies (Lenhard et al., 2019).

**Score Characteristics**: While ShaSh handles continuous scores optimally, it can also model discrete scores effectively, especially when the number of possible values is large (> 20).

**Representativeness**: The sample should be representative of the target population, though post-stratification weighting can help address moderate deviations from representativeness.
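
The sample size and score granularity guidelines above can be checked before fitting. The sketch below uses simulated data in a hypothetical data frame `d` with illustrative one-year age bands; the thresholds of 100 cases per group and 20 distinct score values are the ones suggested in this section.

```r
# Simulated example data (hypothetical, for illustration only)
set.seed(1)
d <- data.frame(age = runif(1200, 6, 12))
d$raw <- round(20 + 8 * d$age + rnorm(1200, sd = 10))

# Illustrative one-year age bands
d$group <- cut(d$age, breaks = 6:12, include.lowest = TRUE)

# Cases per age group, and groups falling below the suggested minimum of 100
n_per_group <- table(d$group)
small_groups <- names(n_per_group)[n_per_group < 100]

# Number of distinct score values (more than 20 is suggested for discrete scores)
n_distinct <- length(unique(d$raw))

print(n_per_group)
print(small_groups)
print(n_distinct)
```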

## Model Selection Considerations

**Polynomial Degrees**: Choose polynomial degrees based on:

- Theoretical expectations about developmental trajectories
- Sample size (higher degrees require more data to avoid overfitting)
- Model comparison criteria (AIC, BIC, cross-validation)
- Visual inspection of fitted curves for realistic patterns

**Delta Parameter Selection**: The tail weight parameter should reflect population characteristics:

- δ = 0.7-0.9: Homogeneous populations, selected samples
- δ = 1.0: General population samples with normal-like variability
- δ = 1.2-2.0: Heterogeneous populations with high individual differences

If the `delta_degree` parameter is used, keep it low (1 or 2) to avoid overfitting. In that case, the constant `delta` value is not used.

**Convergence Considerations**: Complex models may require adjusted optimization parameters. The function includes automatic fallback strategies for difficult convergence cases.

# Modeling Example

We demonstrate ShaSh modeling using the PPVT-4 vocabulary development dataset, which showcases the distribution's ability to handle complex developmental patterns and distributional characteristics typical of psychometric data. The code examples are provided for reference but are not executed in this vignette due to computational intensity.

## Data Exploration

```{r fig0, fig.height = 4, fig.width = 7}
library(cNORM)

# Examine the data structure and distribution
str(ppvt)
plot(ppvt$age, ppvt$raw,
     main="PPVT Raw Scores by Age",
     xlab="Age", ylab="Raw Score",
     pch=16, col=rgb(0,0,0,0.3))

# Examine distributional characteristics
hist(ppvt$raw, breaks=30,
     main="Distribution of Raw Scores",
     xlab="Raw Score", probability=TRUE,
     col="lightblue")
```

The vocabulary development data exhibit characteristic curvilinear growth patterns with potential distributional changes across age groups. The raw score distribution may show skewness or varying tail weights, making it an ideal candidate for ShaSh modeling.
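
To look for the skewness and tail-weight changes mentioned above, moment-based shape statistics can be computed per age group. The following is a minimal base-R sketch on simulated data; `shape_by_group` is a hypothetical helper, not a cNORM function.

```r
# Hypothetical helper: sample size, skewness, and excess kurtosis per group
shape_by_group <- function(score, group) {
  stats <- lapply(split(score, group), function(x) {
    dev <- x - mean(x)
    m2 <- mean(dev^2)
    c(n = length(x),
      skewness = mean(dev^3) / m2^1.5,
      excess_kurtosis = mean(dev^4) / m2^2 - 3)
  })
  do.call(rbind, stats)
}

# Simulated illustration: group 2 is drawn from a right-skewed distribution
set.seed(7)
group <- rep(1:3, each = 500)
score <- c(rnorm(500), rexp(500), rnorm(500))
shape <- shape_by_group(score, group)
print(round(shape, 2))
```

For the data used in this vignette, the analogous call would be `shape_by_group(ppvt$raw, ppvt$group)`. Values of skewness or excess kurtosis that drift systematically across groups suggest that age-dependent shape parameters may be worth modeling.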

## Basic Model Fitting

```{r fig1, fig.height = 4, fig.width = 7}
# Fit ShaSh model with default settings
# Default: mu_degree=3, sigma_degree=2, epsilon_degree=2, delta=1
model.shash <- cnorm.shash(age = ppvt$age, score = ppvt$raw)

# The function automatically displays percentile plots
print(model.shash)
```

The model uses reasonable default polynomial degrees and provides immediate visual feedback through percentile plots. The output includes fitted parameters, convergence information, and basic model statistics.

## Model Diagnostics

```{r}
# Obtain comprehensive diagnostics
summary(model.shash, age = ppvt$age, score = ppvt$raw)
```

The summary provides:

- Model fit statistics (log-likelihood, AIC, BIC, R²)
- Parameter estimates with standard errors and significance tests
- Convergence diagnostics
- Separate tables for location, scale, and skewness parameters

Key diagnostic indicators:

- **Convergence**: Should be successful (code = 0). If not, inspect the model visually.
- **R²**: Correlation between fitted and manifest percentiles (> 0.95 desirable)
- **Parameter significance**: Check z-values and p-values for parameter importance

## Custom Model Specifications

For datasets with complex patterns, adjust polynomial degrees and distributional parameters:

```{r fig2, fig.height = 4, fig.width = 7}
# Example with more complex parameterization (not executed due to computation time)
model.custom <- cnorm.shash(age = ppvt$age,
                            score = ppvt$raw,
                            mu_degree = 3,       # Curvilinear pattern
                            sigma_degree = 3,    # Complex variability changes
                            epsilon_degree = 2,  # Age-varying skewness
                            delta_degree = 2)    # Changing tail weights across age

# Compare models
compare(model.shash, model.custom,
        age = ppvt$age, score = ppvt$raw,
        title = "ShaSh Model Comparison")
```

**Note**: Higher polynomial degrees increase model flexibility but may lead to overfitting with insufficient data. Always validate complex models through visual inspection and cross-validation.
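
The diagnostics listed above (log-likelihood, AIC, BIC, convergence code) can be illustrated with a toy maximum likelihood fit in base R. This is a sketch under stated assumptions, not cNORM's implementation: the data are simulated, μ is linear in standardized age, σ and ε are constant, δ is fixed at 1, σ is estimated on the log scale to keep it positive, and `negloglik` is a hypothetical helper built from the density formula in this vignette.

```r
# Simulated data (hypothetical), generated via the ShaSh transformation
set.seed(42)
n <- 800
age <- runif(n, 6, 12)
age_std <- (age - mean(age)) / sd(age)
score <- (100 + 12 * age_std) + 10 * sinh((asinh(rnorm(n)) - 0.3) / 1)

# Negative log-likelihood; par = (mu intercept, mu slope, log sigma, epsilon)
negloglik <- function(par, score, age_std, delta = 1) {
  mu <- par[1] + par[2] * age_std
  sigma <- exp(par[3])              # log scale is an assumption of this sketch
  epsilon <- par[4]
  z <- (score - mu) / sigma
  w <- delta * asinh(z) + epsilon
  logf <- log(delta) - log(sigma) - 0.5 * log(2 * pi) +
    log(cosh(w)) - 0.5 * log1p(z^2) - 0.5 * sinh(w)^2
  -sum(logf)
}

start <- c(mean(score), 0, log(sd(score)), 0)
fit <- optim(start, negloglik, score = score, age_std = age_std,
             method = "BFGS")

# Information criteria from the maximized log-likelihood
k <- length(fit$par)
loglik <- -fit$value
aic <- 2 * k - 2 * loglik
bic <- k * log(n) - 2 * loglik

print(fit$convergence)  # optim() reports 0 when it considers the fit converged
print(c(logLik = loglik, AIC = aic, BIC = bic))
```

Each additional polynomial coefficient increases `k`, so AIC and BIC penalize a more flexible model unless the log-likelihood improves enough to compensate.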

## Simplified Alternative

```{r fig3, fig.height = 4, fig.width = 7}
# More conservative parameterization for demonstration
model.simple <- cnorm.shash(age = ppvt$age,
                            score = ppvt$raw,
                            mu_degree = 2,       # Quadratic location pattern
                            sigma_degree = 1,    # Linear variability change
                            epsilon_degree = 1,  # Linear skewness change
                            delta = 1.1)         # Slightly heavy tails

# This model balances flexibility with stability
```

## Post-Stratification and Weighting

ShaSh models support weighted estimation for representative sampling:

```{r fig4, fig.height = 4, fig.width = 7}
# Calculate post-stratification weights
margins <- data.frame(variables = c("sex", "sex", "migration", "migration"),
                      levels = c(1, 2, 0, 1),
                      share = c(.52, .48, .7, .3))
weights <- computeWeights(ppvt, margins)

# Fit weighted ShaSh model
model.weighted <- cnorm.shash(ppvt$age, ppvt$raw, weights = weights)

# Compare weighted vs. unweighted
compare(model.shash, model.weighted,
        age = ppvt$age, score = ppvt$raw,
        title = "Unweighted vs. Weighted ShaSh Models")
```

# Norm Score Generation

## Individual Predictions

```{r}
# Generate norm scores for specific age-score combinations
ages <- c(10.25, 10.75, 11.25, 11.75)
raw_scores <- c(180, 185, 190, 195)
norm_scores <- predict(model.shash, ages, raw_scores)

prediction_table <- data.frame(
  Age = ages,
  Raw_Score = raw_scores,
  Norm_Score = round(norm_scores, 1)
)
print(prediction_table)
```

## Comprehensive Norm Tables

```{r}
# Generate detailed norm tables for multiple ages
tables <- normTable.shash(model.shash,
                          ages = c(10.25, 10.75),
                          start = 150,
                          end = 220,
                          step = 1,
                          CI = 0.95,
                          reliability = 0.94)

# Display sample from first table
head(tables[[1]], 10)
```

The norm tables provide:

- **x**: Raw scores
- **Px**: Probability density values
- **Pcum**: Cumulative probabilities
- **Percentile**: Percentile ranks (0-100)
- **z**: Standardized z-scores
- **norm**: Norm scores in specified scale
- **Confidence intervals**: When reliability is specified

# Model Comparison and Selection

## Comparing Distributional Approaches

```{r fig5, fig.height = 4, fig.width = 7}
# Compare ShaSh with other approaches
# Beta-Binomial models should work worse, since the test has stop rules,
# leading to non-binomial distributions. Distribution-free models (Taylor)
# should be able to model the data flexibly as well.
model.bb <- cnorm.betabinomial(ppvt$age, ppvt$raw, n = 228)
model.taylor <- cnorm(group = ppvt$group, raw = ppvt$raw)

# Visual comparisons
compare(model.shash, model.bb,
        age = ppvt$age, score = ppvt$raw,
        title = "ShaSh vs. Beta-Binomial")
compare(model.taylor, model.shash,
        age = ppvt$age, score = ppvt$raw,
        title = "Distribution-Free vs. ShaSh")
```

## Decision Framework

The different continuous norming models have advantages for specific use cases: the distribution-free approach (Taylor polynomials) is the most flexible, the beta-binomial approach is optimal for discrete item counts, and ShaSh is ideal for continuous scores with complex distributional shapes.

Use ShaSh models for

- continuous scores with decimal precision
- complex skewness patterns that change across age
- floor or ceiling effects requiring flexible shape modeling
- data that contain zero or negative scores

Use Beta-Binomial models for

- discrete item counts from binary scoring
- fixed maximum score (number of test items)
- unspeeded tests with homogeneous item difficulties
- 1PL IRT-based psychometric instruments
- small to moderate number of possible scores

Use distribution-free (Taylor Polynomial) models when

- maximum flexibility is paramount
- no strong distributional assumptions can be made
- quick implementation is needed and manual adjustment is desired

When selecting a model, compare candidates using:

1. Information criteria: AIC, BIC (lower is better)
2. Fit statistics: R², RMSE, bias
3. Visual inspection: Smoothness, realism of percentile curves
4. Cross-validation: Out-of-sample prediction accuracy
5. Theoretical appropriateness: Match between model assumptions and data characteristics

## Computational Considerations

### Performance Tips

1. **Start simple**: Use default parameters initially, then increase complexity as needed
2. **Monitor convergence**: Check convergence codes and adjust control parameters if necessary
3. **Validate visually**: Always inspect percentile plots for realistic patterns
4. **Use appropriate sample sizes**: Larger samples for higher polynomial degrees
5. **Consider data preprocessing**: Remove extreme outliers that might affect convergence

### Troubleshooting

**Convergence Issues and Numerical Instability:**

- Reduce polynomial degrees
- Adjust delta parameter
- Use simpler control parameters
- Check for data quality issues
- Pay attention to warning messages during optimization

**Overfitting Signs:**

- Wavy, unrealistic percentile curves
- Very high polynomial degrees relative to sample size
- Poor out-of-sample prediction

# Conclusion

The Sinh-Arcsinh distribution provides a sophisticated yet practical framework for modeling psychometric norm data. Its ability to independently control location, scale, skewness, and tail weight makes it particularly valuable for applications where traditional parametric approaches are inadequate due to distributional complexity.

**Best practices for ShaSh modeling:**

1. Carefully assess data characteristics before model selection
2. Use appropriate sample sizes for model complexity
3. Validate models visually and via statistical indicators (R², AIC, bias)
4. Compare alternative distributional assumptions. Is distribution-free or beta-binomial more appropriate?
5. Document model selection rationale

# References

Jones, M. C., & Pewsey, A. (2009). Sinh-arcsinh distributions. *Biometrika*, 96(4), 761-780.

Lenhard, A., Lenhard, W., & Gary, S. (2019). Continuous norming of psychometric tests: A simulation study of parametric and semi-parametric approaches. *PLoS ONE*, 14(9), e0222279.
https://doi.org/10.1371/journal.pone.0222279

Lenhard, W., Lenhard, A., & Gary, S. (2025). cNORM: Continuous norming. R package version 3.5.0. https://CRAN.R-project.org/package=cNORM