Modelling Psychometric Data with Sinh-Arcsinh Distributions

Wolfgang Lenhard & Alexandra Lenhard

2025-09-27

Mathematical Foundation

The Sinh-Arcsinh Transformation

The ShaSh distribution is defined by a transformation of a standard normal variable. If \(Y\) follows a standard normal distribution, \(Y \sim N(0,1)\), then the ShaSh-distributed variable \(X\) is generated by:

\[X = \mu + \sigma \cdot \sinh\left(\frac{\text{arcsinh}(Y) - \epsilon}{\delta}\right)\]

This transformation allows the resulting variable \(X\) to have its location, scale, skewness, and tail weight controlled by the four parameters:

The probability density function involves:

\[f(x|\mu,\sigma,\epsilon,\delta) = \frac{\delta}{\sigma\sqrt{2\pi}} \cdot \frac{\cosh(\delta \cdot \text{arcsinh}(z) + \epsilon)}{\sqrt{1 + z^2}} \cdot \exp\left(-\frac{1}{2}[\sinh(\delta \cdot \text{arcsinh}(z) + \epsilon)]^2\right)\]

where \(z = \frac{x - \mu}{\sigma}\).

The cumulative distribution function (CDF) does not have a closed-form expression but can be computed numerically. For a given value \(x\), the CDF is:

\[F(x|\mu,\sigma,\epsilon,\delta) = P(X \leq x) = \Phi[\sinh(\delta \cdot \text{arcsinh}(z) + \epsilon)]\]

where \(\Phi\) is the standard normal CDF and \(z = (x - \mu)/\sigma\). The quantile function (inverse CDF) can be expressed as:

\[Q(p|\mu,\sigma,\epsilon,\delta) = \mu + \sigma \cdot \sinh\left(\frac{\text{arcsinh}(\Phi^{-1}(p)) - \epsilon}{\delta}\right)\]

Modeling the Sinh-Arcsinh over Age

In cNORM’s ShaSh implementation, we model the four parameters as polynomial functions of standardized age. The tail weight parameter δ is held constant across ages in the default setting. It can be adjusted to reflect population characteristics, e. g. by increasing it to values delta > 1 for heterogenous samples or delta < 1 for homogenuous samples. By setting the delta_degree parameter, δ is as well modeled as a polynomial across age. If used, it is advisable to keep the delta_degree parameter low, for example 1 or 2, to avoid overfitting. Age is standardized as: \(\text{age}_{std} = \frac{\text{age} - \overline{\text{age}}}{\text{SD}(\text{age})}\) for numerical stability during optimization.

Parameters are estimated using maximum likelihood estimation.

Advantages and Applications

Key Advantages

1. Distributional Flexibility: The ShaSh distribution can model symmetric and asymmetric distributions with varying tail weights, making it suitable for diverse psychometric applications.

2. Continuous Scores: Unlike discrete distributions (e.g., beta-binomial), ShaSh naturally handles continuous scores, decimal values, and unbounded measures.

3. Zero and Negative Values: ShaSh can accommodate scores of zero and negative values without transformation, unlike Box-Cox approaches that require strictly positive data.

4. Independent Parameter Control: Skewness (ε) and tail weight (δ) can be controlled independently, allowing precise modeling of distributional characteristics.

Optimal Applications

The ShaSh distribution is particularly well-suited for:

Prerequisites and Considerations

Data Requirements

Systematic Development: Test scores should exhibit systematic (though not necessarily monotonic) development across the predictor variable. The ShaSh model can capture complex non-linear relationships through polynomial terms.

Sufficient Sample Size: We recommend a minimum of 100 cases per major age group, with larger samples needed for higher polynomial degrees or complex developmental patterns. While empirical validation of this rule is ongoing, these sample sizes have proven effective in previous norming studies (Lenhard et al., 2019).

Score Characteristics: While ShaSh handles continuous scores optimally, it can also model discrete scores effectively, especially when the number of possible values is large (> 20).

Representativeness: The sample should be representative of the target population, though post-stratification weighting can help address moderate deviations from representativeness.

Model Selection Considerations

Polynomial Degrees: Choose polynomial degrees based on:

Delta Parameter Selection: The tail weight parameter should reflect population characteristics:

If the delta_degree parameter is used, keep it low (1 or 2) to avoid overfitting. In that case, the default δ is not used.

Convergence Considerations: Complex models may require adjusted optimization parameters. The function includes automatic fallback strategies for difficult convergence cases.

Modeling Example

We demonstrate ShaSh modeling using the PPVT-4 vocabulary development dataset, which showcases the distribution’s ability to handle complex developmental patterns and distributional characteristics typical of psychometric data. The code examples are provided for reference but are not executed in this vignette due to computational intensity.

Data Exploration

library(cNORM)

# Examine the data structure and distribution
str(ppvt)
#> 'data.frame':    4542 obs. of  6 variables:
#>  $ age      : num  2.6 2.6 2.62 2.86 2.88 ...
#>  $ sex      : num  1 1 1 1 1 1 1 1 1 1 ...
#>  $ migration: num  0 0 0 0 0 0 0 0 0 0 ...
#>  $ region   : chr  "west" "west" "west" "south" ...
#>  $ raw      : num  120 67 23 50 44 55 69 38 61 72 ...
#>  $ group    : num  3.16 3.16 3.16 3.16 3.16 ...
plot(ppvt$age, ppvt$raw, main="PPVT Raw Scores by Age", 
     xlab="Age", ylab="Raw Score", pch=16, col=rgb(0,0,0,0.3))


# Examine distributional characteristics
hist(ppvt$raw, breaks=30, main="Distribution of Raw Scores", 
     xlab="Raw Score", probability=TRUE, col="lightblue")

The vocabulary development data exhibits characteristic curvilinear growth patterns with potential distributional changes across age groups. The raw score distribution may show skewness or varying tail weights, making it an ideal candidate for ShaSh modeling.

Basic Model Fitting

# Fit ShaSh model with default settings
# Default: mu_degree=3, sigma_degree=2, epsilon_degree=2, delta=1
model.shash <- cnorm.shash(age = ppvt$age, score = ppvt$raw)


# The function automatically displays percentile plots
print(model.shash)
#> $mu_est
#> [1] 187.400553  26.220032  -9.829765   1.575112
#> 
#> $sigma_est
#> [1]  2.78306883 -0.26105719 -0.02437886
#> 
#> $epsilon_est
#> [1] 0.6796160 0.2457028
#> 
#> $delta_est
#> NULL
#> 
#> $delta
#> [1] 1
#> 
#> $se
#> [1] 0.55998247 0.68059010 0.24621647 0.19370018 0.01853546 0.01489224 0.01110374
#> [8] 0.02367007 0.02277379
#> 
#> $mu_degree
#> [1] 3
#> 
#> $sigma_degree
#> [1] 2
#> 
#> $epsilon_degree
#> [1] 1
#> 
#> $delta_degree
#> NULL
#> 
#> $result
#> $result$par
#> [1] 187.40055296  26.22003216  -9.82976519   1.57511216   2.78306883
#> [6]  -0.26105719  -0.02437886   0.67961600   0.24570280
#> 
#> $result$value
#> [1] 19670.99
#> 
#> $result$counts
#> function gradient 
#>      141      141 
#> 
#> $result$convergence
#> [1] 0
#> 
#> $result$message
#> [1] "CONVERGENCE: REL_REDUCTION_OF_F <= FACTR*EPSMCH"
#> 
#> $result$hessian
#>              [,1]        [,2]        [,3]        [,4]        [,5]         [,6]
#>  [1,]   23.817309   12.107102   24.737912   19.067015   22.109318    -2.668045
#>  [2,]   12.107102   24.737914   19.067015   49.592069   -2.668051    15.603729
#>  [3,]   24.737912   19.067015   49.592069   35.465985   15.603701    -4.973938
#>  [4,]   19.067015   49.592069   35.465985  128.016394   -4.973927    26.777866
#>  [5,]   22.109318   -2.668051   15.603701   -4.973927 7993.261708  -546.933858
#>  [6,]   -2.668045   15.603729   -4.973938   26.777866 -546.933858  8088.462731
#>  [7,]   15.603834   -4.973982   26.778281  -36.524817 8088.481895 -3161.473943
#>  [8,] -398.922400  -96.699116 -392.719761 -107.000362 2685.998637   991.668459
#>  [9,]  -96.699114 -392.720176 -107.000241 -806.292850  991.668634  2175.283992
#>               [,7]        [,8]        [,9]
#>  [1,]    15.603834  -398.92240   -96.69911
#>  [2,]    -4.973982   -96.69912  -392.72018
#>  [3,]    26.778281  -392.71976  -107.00024
#>  [4,]   -36.524817  -107.00036  -806.29285
#>  [5,]  8088.481895  2685.99864   991.66863
#>  [6,] -3161.473943   991.66846  2175.28399
#>  [7,] 18015.172735  2175.28463  1594.02179
#>  [8,]  2175.284631 10516.64239   -84.08582
#>  [9,]  1594.021794   -84.08582 10421.97109
#> 
#> attr(,"age_mean")
#> [1] 10.47791
#> attr(,"age_sd")
#> [1] 3.255976
#> attr(,"ageMin")
#> [1] 2.5202
#> attr(,"ageMax")
#> [1] 16.9952
#> attr(,"score_mean")
#> [1] 162.8355
#> attr(,"score_sd")
#> [1] 38.18122
#> attr(,"max")
#> [1] 221
#> attr(,"min")
#> [1] 7
#> attr(,"N")
#> [1] 4542
#> attr(,"scaleMean")
#> [1] 50
#> attr(,"scaleSD")
#> [1] 10
#> attr(,"delta")
#> [1] 1
#> 
#> attr(,"class")
#> [1] "cnormShaSh"

The model uses reasonable default polynomial degrees and provides immediate visual feedback through percentile plots. The output includes fitted parameters, convergence information, and basic model statistics.

Model Diagnostics

# Obtain comprehensive diagnostics
summary(model.shash, age = ppvt$age, score = ppvt$raw)
#> Sinh-Arcsinh Continuous Norming Model
#> -------------------------------------
#> Polynomial degrees:
#>   Location (mu): 3 
#>   Scale (sigma): 2 
#>   Skewness (epsilon): 1 
#>   Tail weight (delta): Fixed at 1 
#> Number of observations: 4542 
#> Number of parameters: 9 
#> 
#> Model Fit:
#>   Log-likelihood: -19670.99 
#>   AIC: 39359.98 
#>   BIC: 39417.77 
#>   R-squared: 0.9775 
#>   RMSE: 1.5216 
#>   Bias: 0.3307 
#> 
#> Convergence:
#>   Converged: TRUE 
#>   Function evaluations: 141 
#>   Max gradient: 23945.01 
#>   Message: Successful convergence 
#> 
#> Parameter Estimates:
#> Location (mu) parameters:
#>      Estimate Std..Error z.value  Pr...z..
#> mu_0  187.401     0.5600 334.654 0.000e+00
#> mu_1   26.220     0.6806  38.525 0.000e+00
#> mu_2   -9.830     0.2462 -39.923 0.000e+00
#> mu_3    1.575     0.1937   8.132 4.441e-16
#> 
#> Scale (sigma) parameters:
#>              Estimate Std..Error z.value Pr...z..
#> log(sigma)_0  2.78307    0.01854 150.148  0.00000
#> log(sigma)_1 -0.26106    0.01489 -17.530  0.00000
#> log(sigma)_2 -0.02438    0.01110  -2.196  0.02812
#> 
#> Skewness (epsilon) parameters:
#>           Estimate Std..Error z.value Pr...z..
#> epsilon_0   0.6796    0.02367   28.71        0
#> epsilon_1   0.2457    0.02277   10.79        0

The summary provides:

Key diagnostic indicators:

Custom Model Specifications

For datasets with complex patterns, adjust polynomial degrees and distributional parameters:

# Example with more complex parameterization (not executed due to computation time)
model.custom <- cnorm.shash(age = ppvt$age, score = ppvt$raw,
                           mu_degree = 3,        # Curvelinear pattern
                           sigma_degree = 3,     # Complex variability changes  
                           epsilon_degree = 2,   # Age-varying skewness
                           delta_degree = 2)     # Changing tail weights across age


# Compare models
compare(model.shash, model.custom, age = ppvt$age, score = ppvt$raw,
        title = "ShaSh Model Comparison")
#> 
#> Model Comparison Summary:
#> ------------------------
#>  Metric     Model1     Model2 Difference
#>      R2     0.9567     0.9603     0.0037
#>    Bias     0.3129     0.0542    -0.2587
#>    RMSE     2.0976     1.9885    -0.1091
#>     MAD     1.5190     1.4112    -0.1078
#>     AIC 39359.9772 39274.0828   -85.8944
#>     BIC 39417.7673 39363.9785   -53.7888
#> 
#> Note: Difference = Model2 - Model1
#>       Fit indices are based on the manifest and fitted norm scores of both models.
#>       Scale metrics are T scores (scaleSD = 10)
#>       AIC and BIC should only be used when comparing models of the same type.

Note: Higher polynomial degrees increase model flexibility but may lead to overfitting with insufficient data. Always validate complex models through visual inspection and cross-validation.

Simplified Alternative

# More conservative parameterization for demonstration
model.simple <- cnorm.shash(age = ppvt$age, score = ppvt$raw,
                           mu_degree = 2,        # Quadratic location pattern
                           sigma_degree = 1,     # Linear variability change  
                           epsilon_degree = 1,   # Linear skewness change
                           delta = 1.1)          # Slightly heavy tails


# This model balances flexibility with stability

Post-Stratification and Weighting

ShaSh models support weighted estimation for representative sampling:

# Calculate post-stratification weights
margins <- data.frame(variables = c("sex", "sex", "migration", "migration"),
                     levels = c(1, 2, 0, 1),
                     share = c(.52, .48, .7, .3))

weights <- computeWeights(ppvt, margins)
#> Raking converged normally after 3 iterations.

# Fit weighted ShaSh model
model.weighted <- cnorm.shash(ppvt$age, ppvt$raw, weights = weights)


# Compare weighted vs. unweighted
compare(model.shash, model.weighted, age = ppvt$age, score = ppvt$raw,
        title = "Unweighted vs. Weighted ShaSh Models")
#> 
#> Model Comparison Summary:
#> ------------------------
#>  Metric     Model1     Model2 Difference
#>      R2     0.9567     0.9566    -0.0001
#>    Bias     0.3129     0.4622     0.1493
#>    RMSE     2.0976     2.1262     0.0287
#>     MAD     1.5190     1.5541     0.0351
#>     AIC 39359.9772 41352.8375  1992.8603
#>     BIC 39417.7673 41410.6276  1992.8603
#> 
#> Note: Difference = Model2 - Model1
#>       Fit indices are based on the manifest and fitted norm scores of both models.
#>       Scale metrics are T scores (scaleSD = 10)
#>       AIC and BIC should only be used when comparing models of the same type.

Norm Score Generation

Individual Predictions

# Generate norm scores for specific age-score combinations
ages <- c(10.25, 10.75, 11.25, 11.75)
raw_scores <- c(180, 185, 190, 195)

norm_scores <- predict(model.shash, ages, raw_scores)
prediction_table <- data.frame(
  Age = ages, 
  Raw_Score = raw_scores, 
  Norm_Score = round(norm_scores, 1)
)
print(prediction_table)
#>     Age Raw_Score Norm_Score
#> 1 10.25       180       53.4
#> 2 10.75       185       54.3
#> 3 11.25       190       55.6
#> 4 11.75       195       57.5

Comprehensive Norm Tables

# Generate detailed norm tables for multiple ages
tables <- normTable.shash(model.shash, 
                         ages = c(10.25, 10.75), 
                         start = 150, 
                         end = 220,
                         step = 1,
                         CI = 0.95, 
                         reliability = 0.94)

# Display sample from first table
head(tables[[1]], 10)
#>      x         Px      Pcum Percentile          z     norm  lowerCI  upperCI
#> 1  150 0.01069892 0.1697324   16.97324 -0.9552233 40.44777 36.36624 45.67556
#> 2  151 0.01112359 0.1788393   17.88393 -0.9197974 40.80203 36.69925 46.00856
#> 3  152 0.01155717 0.1883045   18.83045 -0.8841617 41.15838 37.03422 46.34354
#> 4  153 0.01199968 0.1981352   19.81352 -0.8483008 41.51699 37.37132 46.68063
#> 5  154 0.01245117 0.2083391   20.83391 -0.8121977 41.87802 37.71068 47.02000
#> 6  155 0.01291170 0.2189236   21.89236 -0.7758338 42.24166 38.05251 47.36182
#> 7  156 0.01338142 0.2298963   22.98963 -0.7391884 42.60812 38.39697 47.70629
#> 8  157 0.01386049 0.2412650   24.12650 -0.7022393 42.97761 38.74429 48.05361
#> 9  158 0.01434915 0.2530375   25.30375 -0.6649617 43.35038 39.09470 48.40402
#> 10 159 0.01484770 0.2652220   26.52220 -0.6273283 43.72672 39.44846 48.75777
#>    lowerCI_PR upperCI_PR
#> 1    8.638209   33.27093
#> 2    9.174675   34.48936
#> 3    9.738827   35.73145
#> 4   10.331823   36.99684
#> 5   10.954874   38.28515
#> 6   11.609248   39.59601
#> 7   12.296277   40.92901
#> 8   13.017365   42.28377
#> 9   13.773995   43.65987
#> 10  14.567739   45.05694

The norm tables provide:

Model Comparison and Selection

Comparing Distributional Approaches

# Compare ShaSh with other approaches
# Beta-Binomial models should work worse, since the test has stop rules,
# leading to non-binomial distributions. Distribution-free models (Taylor)
# should be able to model the data flexibly as well.
model.bb <- cnorm.betabinomial(ppvt$age, ppvt$raw, n = 228)

model.taylor <- cnorm(group = ppvt$group, raw = ppvt$raw)
#> Powers of location: k = 5
#> Powers of age:      t = 3
#> Multiple R2 between raw score and explanatory variable: R2 = 0.6915
#> 
#> Final solution: 22 terms (highest consistent model)
#> R-Square Adj. = 0.991584
#> Final regression model: raw ~ L1 + L2 + L3 + L4 + A1 + A2 + A3 + L1A1 + L1A2 + L1A3 + L2A1 + L2A2 + L2A3 + L3A1 + L3A2 + L3A3 + L4A1 + L4A2 + L4A3 + L5A1 + L5A2 + L5A3
#> Regression function: raw ~ 320.4238239 + (-43.52313536*L1) + (1.576062212*L2) + (-0.02330857938*L3) + (0.0001252947997*L4) + (-119.8654247*A1) + (7.700634153*A2) + (-0.1530384974*A3) + (14.50741478*L1A1) + (-0.9423146812*L1A2) + (0.01974974212*L1A3) + (-0.4835093183*L2A1) + (0.03105400924*L2A2) + (-0.0006204492825*L2A3) + (0.006793980057*L3A1) + (-0.000372063774*L3A2) + (5.021380047e-06*L3A3) + (-3.069999226e-05*L4A1) + (2.100334875e-07*L4A2) + (5.678401213e-08*L4A3) + (-4.907683608e-08*L5A1) + (1.565428866e-08*L5A2) + (-7.273632324e-10*L5A3)
#> Raw Score RMSE = 3.49392
#> 
#> Use 'printSubset(model)' to get detailed information on the different solutions, 'plotPercentiles(model) to display percentile plot, plotSubset(model)' to inspect model fit.


# Visual comparisons
compare(model.shash, model.bb, age = ppvt$age, score = ppvt$raw,
        title = "ShaSh vs. Beta-Binomial")
#> 
#> Model Comparison Summary:
#> ------------------------
#>  Metric     Model1     Model2 Difference
#>      R2     0.9567     0.9368    -0.0198
#>    Bias     0.3129     0.1408    -0.1721
#>    RMSE     2.0976     2.5099     0.4123
#>     MAD     1.5190     1.8816     0.3626
#>     AIC 39359.9772 39856.6454   496.6682
#>     BIC 39417.7673 39908.0144   490.2471
#> 
#> Note: Difference = Model2 - Model1
#>       Fit indices are based on the manifest and fitted norm scores of both models.
#>       Scale metrics are T scores (scaleSD = 10)
#>       AIC and BIC should only be used when comparing models of the same type.


compare(model.taylor, model.shash, age = ppvt$age, score = ppvt$raw,
        title = "Distribution-Free vs. ShaSh")
#> Retrieving norm scores, please stand by ...
#> 
#> Model Comparison Summary:
#> ------------------------
#>  Metric      Model1     Model2 Difference
#>      R2      0.9609     0.9567    -0.0043
#>    Bias      0.0460     0.3129     0.2669
#>    RMSE      1.9787     2.0976     0.1188
#>     MAD      1.4109     1.5190     0.1081
#>     AIC  24320.8609 39359.9772 15039.1163
#>     BIC -21528.1994 39417.7673 60945.9667
#> 
#> Note: Difference = Model2 - Model1
#>       Fit indices are based on the manifest and fitted norm scores of both models.
#>       Scale metrics are T scores (scaleSD = 10)
#>       AIC and BIC should only be used when comparing models of the same type.

Decision Framework

The different continuous models have advantages for specific use cases, with the distribution free approach (Taylor polynomials) being the most flexible, the beta-binomial approach being optimal for discrete item counts, and ShaSh being ideal for continuous scores with complex distributional shapes.

Use ShaSh models for

Use Beta-Binomial models for

Use distribution-free (Taylor Polynomial) models when

In selecting models, please compare models using:

  1. Information criteria: AIC, BIC (lower is better)
  2. Fit statistics: R², RMSE, bias
  3. Visual inspection: Smoothness, realism of percentile curves
  4. Cross-validation: Out-of-sample prediction accuracy
  5. Theoretical appropriateness: Match between model assumptions and data characteristics

Computational Considerations

Performance Tips

  1. Start simple: Use default parameters initially, then increase complexity as needed
  2. Monitor convergence: Check convergence codes and adjust control parameters if necessary
  3. Validate visually: Always inspect percentile plots for realistic patterns
  4. Use appropriate sample sizes: Larger samples for higher polynomial degrees
  5. Consider data preprocessing: Remove extreme outliers that might affect convergence

Troubleshooting

Convergence Issues and Numerical Instability:

  • Reduce polynomial degrees
  • Adjust delta parameter
  • Use simpler control parameters
  • Check for data quality issues
  • Pay attention to warning messages during optimization

Overfitting Signs:

  • Wavy, unrealistic percentile curves
  • Very high polynomial degrees relative to sample size
  • Poor out-of-sample prediction

Conclusion

The Sinh-Arcsinh distribution provides a sophisticated yet practical framework for modeling psychometric norm data. Its ability to independently control location, scale, skewness, and tail weight makes it particularly valuable for applications where traditional parametric approaches are inadequate due to distributional complexity.

Best practices for ShaSh modeling:

  1. Carefully assess data characteristics before model selection
  2. Use appropriate sample sizes for model complexity
  3. Validate models through visually and via statistical indicators (R2, AIC, Bias)
  4. Compare alternative distributional assumptions. Is distribution-free or beta-binomial more appropriate?
  5. Document model selection rationale

References

Jones, M. C., & Pewsey, A. (2009). Sinh-arcsinh distributions. Biometrika, 96(4), 761-780.

Lenhard, A., Lenhard, W., Gary, S. (2019). Continuous norming of psychometric tests: A simulation study of parametric and semi-parametric approaches. PLoS ONE, 14(9), e0222279. https://doi.org/10.1371/journal.pone.0222279

Lenhard, W., Lenhard, A., & Gary, S. (2025). cNORM: Continuous norming. R package version 3.5.0. https://CRAN.R-project.org/package=cNORM