Methodological Details

Drew Dimmery (ddimmery@univie.ac.at)

Edward Kennedy (edward@stat.cmu.edu)

August 11, 2023

Abstract

In this paper, we introduce the tidyhte package for estimation of heterogeneous treatment effects (HTE) from observational or experimental data. This package implements the methods of Kennedy (2020) and presents them through a tidy-style user-facing API. The design principles undergirding this package are (1) the APIs should be tidy-friendly, (2) analyses should be easy to replicate with minor changes, (3) specifying complex ensembles for the nuisance functions should be straightforward, and (4) sensible diagnostics should be easily accessible. Plotting and formatting of the results are left for the end-user to customize.

Summary

This document details how tidyhte constructs estimates of heterogeneous treatment effects. It will highlight a variety of the features of the package and discuss the mathematics which undergird them.

After a brief introduction to the methods of HTE estimation of Kennedy (2020), the structure will generally follow the estimation API of tidyhte: it will begin by discussing the creation of cross-validation folds, then highlight nuisance function estimation, proceed to the construction of “pseudo outcomes”, a concept from Kennedy (2020), and conclude by demonstrating the calculation of a few varieties of Quantities of Interest: the actual statistics which are desired by the end-user.

Preliminaries

Problem Setting

Our data is defined by the triple \(Z_i = (X_i, A_i, Y_i)\), with \(X \in \mathbb{R}^d\), \(Y \in \mathbb{R}\) and \(A \in \{0, 1\}\). Define the following nuisance functions: \[ \begin{aligned} \pi(x) &\equiv \mathbb{P}(A = 1 \mid X = x) \\ \mu_a(x) &\equiv \mathbb{E}(Y \mid X = x, A = a) \end{aligned} \]

Heterogeneous treatment effects are defined as the difference in conditional expectations under treatment and control, \({\tau(x) \equiv \mu_1(x) - \mu_0(x)}\). Throughout, we will maintain the following assumptions:

  1. Consistency: \(Y_i = Y_i(A_i)\)

  2. No Unmeasured Confounding: \(A \perp (Y(1), Y(0)) \mid X\)

  3. Positivity: \(\epsilon \leq \pi \leq 1 - \epsilon\) with probability \(1\)

Under these assumptions, \(\tau(x) = \mathbb{E}[Y(1) - Y(0) \mid X = x]\).
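
To see why these assumptions suffice, it may help to spell out the identification argument for the treated arm. The following is a standard derivation, sketched here for convenience rather than taken from the package documentation:

\[ \begin{aligned} \mathbb{E}[Y(1) \mid X = x] &= \mathbb{E}[Y(1) \mid X = x, A = 1] && \text{(no unmeasured confounding, positivity)} \\ &= \mathbb{E}[Y \mid X = x, A = 1] && \text{(consistency)} \\ &= \mu_1(x) \end{aligned} \]

The same argument with \(A = 0\) gives \(\mathbb{E}[Y(0) \mid X = x] = \mu_0(x)\), and subtracting the two yields \(\tau(x)\).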

For the remainder of this paper, we will use a semi-simulated running example based on the penguins dataset of Horst, Hill, and Gorman (2020). We imagine a randomly assigned nutritional intervention whose causal effect on body mass we wish to measure. We also have a lower-variance measurement of the average food consumed per day over an observation period. On average, the intervention causes Gentoo penguins to gain weight and Adelie penguins to lose weight, while the average change in weight for Chinstrap penguins is zero.

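library(palmerpenguins)  # provides the penguins data frame

n <- nrow(penguins)  # number of rows, used by the simulation below
penguins <- within(penguins, {
  id <- 1:n
  propensity_score <- 0.5
  treatment <- rbinom(n, 1, propensity_score)
  # proportional effect: positive for Gentoo, negative for Adelie, centered at zero for Chinstrap
  tau <- 0.2 * (species == "Gentoo") - 0.2 * (species == "Adelie") + rnorm(n, sd = 0.05)
  food_consumed_g <- rnorm(n, 500, 5) * (1 + tau * treatment)
  body_mass_g <- body_mass_g * (1 + tau * treatment)
})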

Overview of Method

We begin by introducing the following “DR-Learner” algorithm:

DR-Learner algorithm. This consists of three main steps:

  1. Given: estimates of nuisance functions trained elsewhere: \(\left(\hat{\pi}, \hat{\mu}_0, \hat{\mu}_1\right)\)
  2. Construct “pseudo-outcome”: This quantity is a transformation of the provided nuisance function estimators. \[ \hat{\psi}(Z) = \frac{A - \hat{\pi}(X)}{\hat{\pi}(X)(1 - \hat{\pi}(X))} \left( Y - \hat{\mu}_{A}(X) \right) + \hat{\mu}_{1}(X) - \hat{\mu}_{0}(X) \]
  3. Second-stage regression: Construct a smoothing model over the transformed pseudo-outcome. \[ \hat{\tau}_{dr}(x) = \hat{\mathbb{E}}[\hat{\psi}(Z) \mid X = x] \]
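
The three steps can be sketched in a few lines of base R. This is a minimal illustration on simulated data, not the tidyhte implementation: the variable names, the data-generating process, and the use of lm() for the outcome model are all assumptions made for the example, and the nuisance model is fit in-sample for brevity even though the algorithm assumes it was trained on separate data.

```r
set.seed(1)
n <- 2000
x <- rbinom(n, 1, 0.5)            # one binary covariate (hypothetical)
a <- rbinom(n, 1, 0.5)            # randomized treatment, so pi(x) = 0.5
tau <- ifelse(x == 1, 2, -1)      # true effect in each subgroup
y <- 1 + x + tau * a + rnorm(n)

# Step 1: nuisance estimates (known propensity, regression for the outcome)
pi_hat <- rep(0.5, n)
fit <- lm(y ~ x * a)
mu1_hat <- predict(fit, newdata = data.frame(x = x, a = 1))
mu0_hat <- predict(fit, newdata = data.frame(x = x, a = 0))
mua_hat <- ifelse(a == 1, mu1_hat, mu0_hat)

# Step 2: pseudo-outcome
psi_hat <- (a - pi_hat) / (pi_hat * (1 - pi_hat)) * (y - mua_hat) +
  mu1_hat - mu0_hat

# Step 3: second-stage regression, here simply subgroup means over x
tau_hat <- tapply(psi_hat, x, mean)
tau_hat
```

The subgroup means should land close to the true effects of -1 and 2 used in the simulation.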

This algorithm is written assuming estimates of nuisance functions are trained on separate data and, therefore, can be treated as fixed. The easiest way to do this is with a sample-splitting procedure, after which results are averaged across splits.

The crucial result, Theorem 2 of Kennedy (2020), shows that the error of the second stage regression will match the error of an oracle regression of the true individual treatment effects on covariates \(X\) up to a factor which depends on the product of the errors of the two estimated nuisance functions (that is, the errors in \(\pi\) and in \(\mu\)).

When the second-stage regression is simple, like subgroup averages within cells defined by \(X\), the DR-Learner inherits the unbiasedness and efficiency of an AIPW estimator (Robins, Rotnitzky, and Zhao 1995; Van der Laan and Robins 2003; Tsiatis 2006; Tsiatis et al. 2008; Chernozhukov et al. 2018).

The approximation results of Kennedy (2020) applied to regressions on this transformed outcome allow for a lot of flexibility in quantities of interest that may be estimated. We will use this fact to allow the estimation of a variety of models as if they were estimated on the individual causal effects themselves: for example, variable importance measures operate off the assumption that various second-stage regressions will be accurate.

Principles

The tidyhte package is premised on the idea of breaking up the analysis of HTEs into a few distinct parts, which can then be mixed together as desired. The first step is to define a configuration object (or recipe) describing at a high level how HTE estimation should be performed. During estimation, the specific variables of interest are indicated. This design allows for repeating very similar analyses multiple times with very little overhead. This is useful, for instance, when the user wishes to explore heterogeneity across a variety of outcomes, or when there are a variety of treatment contrasts of particular interest. Each of these analyses will tend to share common features: similar classes of models will be included in the ensembles for nuisance estimation, for example.

Also important is that the methods provided support two common features of real-world designs: clustered data and population weights.

Clustered data

A common feature of real-world data is that treatment may be clustered. In other words, if one unit receives treatment, there may be other units who are then also more likely to receive treatment. A common example of this sort of design is when villages or townships are assigned to treatment, but measurement occurs at the individual level. As discussed by Abadie et al. (2023), this design implies that standard errors should be clustered at the level at which treatment was assigned. The tidyhte package supports clustering as a first-class citizen, and all resulting estimates receive uncertainty estimates that account for this clustering. In practice, the end-user simply specifies the unit identifier when constructing cross-validation splits (make_splits()); rows that share an identifier are kept in the same split, and all subsequent analyses take this into account.
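
To make the consequence of clustering concrete, the sketch below compares a naive standard error of a mean with a cluster-robust one on simulated data. The helper clustered_se_mean() is hypothetical and written only for this illustration (it uses a CR1-style small-sample correction); it is not a tidyhte function, and the package's internal calculation may differ.

```r
# Hypothetical helper: cluster-robust standard error of a sample mean
clustered_se_mean <- function(x, cluster) {
  n <- length(x)
  g <- length(unique(cluster))
  # sum the centered values within each cluster
  resid_sums <- tapply(x - mean(x), cluster, sum)
  sqrt(g / (g - 1) * sum(resid_sums^2)) / n
}

set.seed(2)
village <- rep(1:50, each = 10)      # 50 clusters of 10 individuals
shock <- rnorm(50)[village]          # component shared within a village
psi <- shock + rnorm(500)            # stand-in for pseudo-outcomes

naive_se <- sd(psi) / sqrt(length(psi))
cluster_se <- clustered_se_mean(psi, village)
c(naive = naive_se, clustered = cluster_se)
```

Because units in the same village share a common component, the clustered standard error is noticeably larger than the naive one. When every unit is its own cluster, the two coincide.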

Population weights

Another common feature of designs used in practice is that they come from samples that do not perfectly represent the larger populations from which they are drawn. The most common solution to this problem is to use weights that make the sample more closely resemble the population. This, too, is supported by tidyhte: weights are specified when models are estimated (produce_plugin_estimates()), and downstream analyses then take them into account appropriately.
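
As a rough illustration of what taking weights into account means for a simple quantity of interest, the sketch below computes weighted subgroup averages of a stand-in pseudo-outcome with base R. The data and variable names are made up for the example, and this is not a description of how tidyhte applies weights internally.

```r
set.seed(3)
n <- 1000
group <- sample(c("a", "b"), n, replace = TRUE)
psi <- ifelse(group == "a", 1, -1) + rnorm(n)   # stand-in pseudo-outcomes
w <- runif(n, 0.5, 2)                           # hypothetical population weights

# Weighted average of the pseudo-outcome within each group
weighted_mcate <- sapply(split(seq_len(n), group), function(idx) {
  weighted.mean(psi[idx], w[idx])
})
weighted_mcate
```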

Recipe API

In order to build up definitions around how HTE should be estimated, tidyhte provides an API to progressively build up a configuration object. The basic_config() function creates a bare-bones configuration consisting of only linear models for the respective nuisance functions and a number of diagnostics.

cfg <- basic_config() %>%
    add_known_propensity_score("propensity_score") %>%
    add_outcome_model("SL.glmnet", alpha = c(0.0, 1.0)) %>%
    add_moderator("Stratified", species, island, sex, year) %>%
    add_moderator("KernelSmooth", bill_length_mm, bill_depth_mm, flipper_length_mm) %>%
    add_vimp(sample_splitting = FALSE)

Since the subject of interest is an experimental intervention, the propensity score is known. Using this known propensity score provides unbiasedness for many quantities of interest (although this may leave some efficiency on the table). In this case, in addition to the (default) linear model included in the SuperLearner ensemble, we add an elastic-net regression (Zou and Hastie 2005). We pass two values of the mixing parameter, alpha = 0 (ridge) and alpha = 1 (LASSO), and each resulting model is added to the ensemble; a finer grid of values between the two can be supplied in the same way. This means that SuperLearner will perform model selection and averaging to identify the best hyper-parameter values (Van der Laan, Polley, and Hubbard 2007). Furthermore, we define all of the moderators of interest and how their results should be collected and displayed. Discrete moderators will just take stratified averages at each level of the moderator, while continuous moderators will use local polynomial regression (Fan and Gijbels 2018; Calonico, Cattaneo, and Farrell 2019). Finally, we add a variable importance measure from Williamson et al. (2021).
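
For reference, the role of the mixing parameter can be read off the elastic-net objective. The form below follows the parameterization used by glmnet for a Gaussian outcome; the symbols \(\beta_0\), \(\beta\), \(\lambda\), and \(N\) are notation introduced here for illustration.

\[ \min_{\beta_0, \beta} \; \frac{1}{2N} \sum_{i=1}^{N} \left( y_i - \beta_0 - x_i^\top \beta \right)^2 + \lambda \left[ \frac{1 - \alpha}{2} \lVert \beta \rVert_2^2 + \alpha \lVert \beta \rVert_1 \right] \]

Setting \(\alpha = 0\) leaves only the ridge penalty, and \(\alpha = 1\) leaves only the LASSO penalty.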

After the configuration is completed, it can be attached to the dataset.

penguins %<>% attach_config(cfg)

Cross-validation

The first step in an analysis of heterogeneous treatment effects following this procedure is to define how to construct splits to be used for cross-validation. tidyhte accomplishes this by using blocking methods to construct lower-variance splits than a purely randomized splitting procedure would entail. Existing methods for generating splits often provide options for stratifying based on a binary outcome to ensure there is variance in the outcome in all splits, even when the outcome is very sparse. For instance, SuperLearner::SuperLearner.CV.control provides such an option.

The appropriate function in tidyhte is make_splits. This function takes in a dataframe and determines a set of splits which accord with the provided unit identifier and number of splits. If any covariates are provided to this function, it will include them in a blocking design for constructing randomized splits using the methods of Higgins, Sävje, and Sekhon (2016) as implemented in the quickblock package.

There are a few relevant methodological notes about how this blocking for cross-validation strata works. In short, blocks are constructed which are sized to be at least as large as the number of splits to use. These blocks are constructed to minimize the within-block distance between units (where distances are Euclidean based on the provided covariates). When more than one row shares an identifier, blocking is performed on the average covariate value within each identifier. Rows with the same identifier are then assigned to the same split. Precise details on the construction of blocks may be found in Higgins, Sävje, and Sekhon (2016). Within each block, a vector of split IDs is constructed which has a marginal distribution as close to the uniform distribution over splits as possible (up to integer division errors). This set of split IDs is then randomly permuted within blocks.
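
The within-block assignment step can be sketched as follows. This is an illustration only: assign_splits() is a hypothetical helper, and the blocks are formed by a crude stand-in (grouping neighbours on one sorted covariate) rather than by the quickblock algorithm.

```r
# Hypothetical helper: assign split IDs within pre-computed blocks
assign_splits <- function(block, num_splits) {
  split_id <- integer(length(block))
  for (b in unique(block)) {
    idx <- which(block == b)
    # near-uniform vector of split IDs for this block
    ids <- rep(seq_len(num_splits), length.out = length(idx))
    # random permutation of those IDs within the block
    split_id[idx] <- ids[sample.int(length(ids))]
  }
  split_id
}

set.seed(4)
x <- rnorm(22)
# Stand-in for a blocking algorithm: neighbours on the sorted covariate form
# blocks of at least 4 units; the last block absorbs the remainder
block <- integer(22)
block[order(x)] <- pmin((seq_len(22) - 1) %/% 4 + 1, 5)

split_id <- assign_splits(block, num_splits = 4)
table(block, split_id)
```

Within every block, the number of units assigned to each split differs by at most one, so similar units are spread evenly across splits.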

Consistent with tidy semantics, the original dataframe is returned, but with the addition of a column .split_id representing these newly constructed splits, and a few attributes for bookkeeping (e.g. the column name of the identifier). Since the object returned is the same as what was passed in, this makes for easy chaining of commands using dplyr.

penguins %<>% make_splits(id, species, sex, flipper_length_mm, .num_splits = 3)
## `num_splits` must be even if VIMP is requested as a QoI. Rounding up.
## Dropped 11 of 344 rows (3.2%) through listwise deletion.

Note that tidyhte gracefully handles missing data via listwise deletion. More advanced imputation methods are not yet supported.

Nuisance function estimation

Estimation of nuisance functions such as the propensity score and the outcome regression is typically handled by the SuperLearner library. Specifying a full array of models with diverse hyperparameters is much simplified through the tidyhte API. Specifying a cross-validated learner using SuperLearner syntax requires substantially more boilerplate:

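library(SuperLearner)

# Build one glmnet learner per value of the mixing parameter alpha
learners <- create.Learner(
    "SL.glmnet",
    tune = list(
        alpha = c(0.05, 0.15, 0.2, 0.25, 0.5, 0.75)
    ),
    detailed_names = TRUE,
    name_prefix = "SLglmnet"
)

# label and covariates stand for the outcome vector and the covariate data frame
CV.SuperLearner(label, covariates, SL.library = learners$names)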

In contrast, the tidyhte Recipe API requires only the following one line:

add_outcome_model(cfg, "SL.glmnet", alpha = c(0.0, 0.25, 0.5, 0.75, 1.0))

Nuisance functions are then estimated by passing the outcome, the treatment, and the covariates to produce_plugin_estimates():

penguins %<>% produce_plugin_estimates(
  # outcome
  food_consumed_g,
  # treatment
  treatment,
  # covariates
  species, island, sex, year, bill_length_mm, bill_depth_mm, flipper_length_mm
)

Pseudo-outcome construction

Once nuisance functions are estimated, it is simply a matter of combining these results together into the appropriate pseudo-outcome. For typical HTE estimation, the pseudo-outcome of interest is the one analyzed by Kennedy (2020): the uncentered influence function of the average treatment effect (Robins, Rotnitzky, and Zhao 1995).

penguins %<>% construct_pseudo_outcomes(food_consumed_g, treatment)
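
One way to see why this pseudo-outcome is a sensible regression target is to check its conditional mean when the nuisance functions are known exactly. The derivation below is a sketch under that idealized assumption, writing \(\psi\) for the pseudo-outcome built from the true \(\pi\) and \(\mu_a\) rather than their estimates. It uses the identity

\[ \frac{A - \pi(X)}{\pi(X)(1 - \pi(X))} = \frac{A}{\pi(X)} - \frac{1 - A}{1 - \pi(X)} \]

so that

\[ \begin{aligned} \mathbb{E}[\psi(Z) \mid X = x] &= \mathbb{E}\left[ \frac{A}{\pi(x)} \left( Y - \mu_1(x) \right) \,\middle|\, X = x \right] - \mathbb{E}\left[ \frac{1 - A}{1 - \pi(x)} \left( Y - \mu_0(x) \right) \,\middle|\, X = x \right] + \mu_1(x) - \mu_0(x) \\ &= 0 - 0 + \tau(x) \end{aligned} \]

Each correction term vanishes because, for example, \(\mathbb{E}[A (Y - \mu_1(x)) \mid X = x] = \pi(x) \, \mathbb{E}[Y - \mu_1(x) \mid X = x, A = 1] = 0\).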

Quantities of interest

Finally comes the most glamorous part of the analysis, when effects are estimated and put into charts.

The design of tidyhte chooses to leave the charting to the end-user, but merely returns a tidy tibble with all of the requested quantities of interest. The package focuses on a few types of quantities of interest:

  - Marginal Conditional Average Treatment Effects (MCATEs): the standard Conditional Average Treatment Effects (CATEs) in the literature, in which all covariates except one are marginalized over, providing one average effect for one level of a single covariate. This is in contrast to a “Partial” CATE, in which other variables are “controlled for” in some way. This does not provide a satisfying causal interpretation without assumptions of joint randomization of treatment and covariates.

  - Variable Importance (VIMP): Using Williamson et al. (2021), tidyhte calculates how much each moderator contributes to the overall reduction in mean-squared-error in a joint model of the heterogeneous effects.

  - Diagnostics: A wide variety of diagnostics are provided for all models fit as part of the tidyhte estimation process. These include single-number summaries like mean-squared-error or AUC, as well as entire receiver operating characteristic curves and coefficients in the SuperLearner ensembles.
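
For a continuous moderator, an MCATE curve amounts to smoothing the pseudo-outcome against that moderator. The sketch below illustrates the idea on simulated data, using stats::loess as a base-R stand-in for the local polynomial estimator; the moderator, effect curve, and noise level are all hypothetical, and the bandwidth selection and inference used by the package are not reproduced here.

```r
set.seed(5)
n <- 1000
bill <- runif(n, 35, 50)                 # hypothetical continuous moderator
tau_true <- 10 * (bill - 42.5)           # hypothetical effect curve
psi <- tau_true + rnorm(n, sd = 20)      # stand-in pseudo-outcomes

# Local linear regression of the pseudo-outcome on the moderator
fit <- loess(psi ~ bill, degree = 1, span = 0.5)

# Evaluate the smoothed curve, with standard errors, at a few values
grid <- data.frame(bill = c(38, 42.5, 47))
pred <- predict(fit, newdata = grid, se = TRUE)
mcate <- data.frame(
  value = grid$bill,
  estimate = pred$fit,
  std_error = pred$se.fit
)
mcate
```

The estimates should track the simulated effect curve, which equals -45, 0, and 45 at the three evaluation points.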

penguins %>%
  estimate_QoI(species, island, sex, year, bill_length_mm, bill_depth_mm, flipper_length_mm) ->
  results

The resulting tibble looks like the following:

results
estimand term value level estimate std_error
MCATE species NA Adelie -97.9415143 3.0561070
MCATE species NA Gentoo 100.5181448 3.8728078
MCATE species NA Chinstrap -7.7032675 5.3660617
MCATE island NA Torgersen -94.2397942 6.1442484
MCATE island NA Biscoe 47.1404445 7.5864118
MCATE island NA Dream -49.7255787 5.5443517
MCATE sex NA male -7.3545592 7.7672963
MCATE sex NA female -9.8548976 7.2661299
MCATE year 2007.00000 NA -13.0834311 9.3562626
MCATE year 2008.00000 NA 0.8093520 9.4122659
MCATE year 2009.00000 NA -13.7221148 8.8616791
MCATE bill_length_mm 35.70000 NA -102.0063564 4.2473998
MCATE bill_length_mm 35.90000 NA -103.0133270 4.2558385
MCATE bill_length_mm 36.00000 NA -103.5934897 4.2506636
MCATE bill_length_mm 36.20000 NA -104.9633407 4.2339816
MCATE bill_length_mm 36.36727 NA -106.3137652 4.2111705
MCATE bill_length_mm 36.50000 NA -107.3229154 4.1970871
MCATE bill_length_mm 36.67091 NA -108.5268490 4.1815606
MCATE bill_length_mm 36.87273 NA -109.7166849 4.1546229
MCATE bill_length_mm 37.14909 NA -110.9699714 4.1511282
MCATE bill_length_mm 37.30000 NA -111.6099034 4.1611356
MCATE bill_length_mm 37.57818 NA -112.3456749 4.2187917
MCATE bill_length_mm 37.68000 NA -112.4731974 4.2329803
MCATE bill_length_mm 37.78182 NA -112.5202435 4.2375906
MCATE bill_length_mm 37.88364 NA -112.4249706 4.2149106
MCATE bill_length_mm 38.10000 NA -111.5618580 4.1497950
MCATE bill_length_mm 38.18727 NA -110.9773686 4.1340858
MCATE bill_length_mm 38.47818 NA -107.9735054 4.1680952
MCATE bill_length_mm 38.60000 NA -106.4803410 4.2290051
MCATE bill_length_mm 38.80000 NA -103.6687512 4.3771672
MCATE bill_length_mm 38.90000 NA -102.1140621 4.4680725
MCATE bill_length_mm 39.00000 NA -100.5401521 4.5694343
MCATE bill_length_mm 39.20000 NA -97.1480875 4.7961557
MCATE bill_length_mm 39.50000 NA -91.4679168 5.1801863
MCATE bill_length_mm 39.60000 NA -89.4995905 5.3095643
MCATE bill_length_mm 39.60000 NA -89.4995905 5.3095643
MCATE bill_length_mm 39.70000 NA -87.5134859 5.4404094
MCATE bill_length_mm 39.82182 NA -85.1082728 5.6007930
MCATE bill_length_mm 40.20000 NA -77.5678555 6.0967366
MCATE bill_length_mm 40.32182 NA -74.9691696 6.2540350
MCATE bill_length_mm 40.60000 NA -68.9180993 6.6023480
MCATE bill_length_mm 40.61455 NA -68.6003029 6.6200244
MCATE bill_length_mm 40.81636 NA -64.0660653 6.8637675
MCATE bill_length_mm 40.90000 NA -62.0838843 6.9595031
MCATE bill_length_mm 41.10000 NA -57.2707111 7.1781470
MCATE bill_length_mm 41.10000 NA -57.2707111 7.1781470
MCATE bill_length_mm 41.14727 NA -56.1011732 7.2282529
MCATE bill_length_mm 41.40000 NA -49.4796438 7.4792577
MCATE bill_length_mm 41.52727 NA -46.0923173 7.5943502
MCATE bill_length_mm 41.85818 NA -36.7077670 7.8809443
MCATE bill_length_mm 42.13091 NA -28.2729562 8.0530918
MCATE bill_length_mm 42.33273 NA -21.8632391 8.1613610
MCATE bill_length_mm 42.50000 NA -16.5311070 8.2344696
MCATE bill_length_mm 42.70000 NA -10.2308829 8.3004666
MCATE bill_length_mm 42.83818 NA -5.8732914 8.3352531
MCATE bill_length_mm 43.14000 NA 3.6058927 8.3851607
MCATE bill_length_mm 43.20000 NA 5.4631935 8.3851087
MCATE bill_length_mm 43.34364 NA 9.8352911 8.3909698
MCATE bill_length_mm 43.50000 NA 14.3300407 8.3733437
MCATE bill_length_mm 43.89455 NA 25.0915633 8.2963429
MCATE bill_length_mm 44.24727 NA 33.8632160 8.1743461
MCATE bill_length_mm 44.90000 NA 48.5259865 7.9052453
MCATE bill_length_mm 45.10000 NA 52.4517246 7.8027190
MCATE bill_length_mm 45.20000 NA 54.2015689 7.7477754
MCATE bill_length_mm 45.20000 NA 54.2015689 7.7477754
MCATE bill_length_mm 45.30000 NA 55.8043478 7.6903881
MCATE bill_length_mm 45.46000 NA 58.2673599 7.5976196
MCATE bill_length_mm 45.50000 NA 58.8031406 7.5714864
MCATE bill_length_mm 45.60000 NA 60.1190039 7.5072942
MCATE bill_length_mm 45.70000 NA 61.3901685 7.4437672
MCATE bill_length_mm 45.80000 NA 62.6201884 7.3822482
MCATE bill_length_mm 46.00000 NA 64.9274895 7.2651164
MCATE bill_length_mm 46.10000 NA 65.9037198 7.2045195
MCATE bill_length_mm 46.20000 NA 66.7591937 7.1405865
MCATE bill_length_mm 46.37455 NA 67.9944719 7.0284638
MCATE bill_length_mm 46.40000 NA 68.1552837 7.0121197
MCATE bill_length_mm 46.50000 NA 68.7479865 6.9500947
MCATE bill_length_mm 46.58000 NA 69.1746966 6.9021877
MCATE bill_length_mm 46.70000 NA 69.7393389 6.8313676
MCATE bill_length_mm 46.80000 NA 70.1420457 6.7770635
MCATE bill_length_mm 46.90000 NA 70.4570442 6.7239461
MCATE bill_length_mm 47.20000 NA 70.5596580 6.5347320
MCATE bill_length_mm 47.48909 NA 69.9812380 6.3788342
MCATE bill_length_mm 47.50000 NA 69.9458504 6.3740239
MCATE bill_length_mm 47.69273 NA 69.1767438 6.2960900
MCATE bill_length_mm 48.10000 NA 66.2362301 6.1928739
MCATE bill_length_mm 48.39273 NA 63.4575917 6.1970380
MCATE bill_length_mm 48.49818 NA 62.3832565 6.1962552
MCATE bill_length_mm 48.60000 NA 61.3996046 6.1766237
MCATE bill_length_mm 48.70182 NA 60.3681478 6.1667966
MCATE bill_length_mm 49.00000 NA 57.8697070 6.1627220
MCATE bill_length_mm 49.10000 NA 57.1095662 6.1748445
MCATE bill_length_mm 49.20727 NA 56.2182715 6.1966145
MCATE bill_length_mm 49.40909 NA 54.3987265 6.2687400
MCATE bill_length_mm 49.51091 NA 53.4325972 6.3156484
MCATE bill_length_mm 49.61273 NA 52.5339040 6.3677542
MCATE bill_length_mm 49.80000 NA 51.0363197 6.4814252
MCATE bill_length_mm 50.00000 NA 49.6405255 6.6244105
MCATE bill_length_mm 50.00000 NA 49.6405255 6.6244105
MCATE bill_length_mm 50.12000 NA 48.8424508 6.7218738
MCATE bill_length_mm 50.22182 NA 48.2750364 6.8087798
MCATE bill_length_mm 50.42364 NA 47.2867964 6.9932255
MCATE bill_length_mm 50.50000 NA 46.9641062 7.0719410
MCATE bill_length_mm 50.62727 NA 46.6132903 7.2012879
MCATE bill_length_mm 50.80000 NA 46.2752339 7.3698673
MCATE bill_length_mm 50.83091 NA 46.2072159 7.4010443
MCATE bill_length_mm 51.03273 NA 45.6060840 7.5961889
MCATE bill_length_mm 51.30000 NA 44.6420272 7.8855223
MCATE bill_length_mm 51.33636 NA 44.5047439 7.9289759
MCATE bill_length_mm 51.57636 NA 43.6182835 8.2313378
MCATE bill_length_mm 52.00000 NA 41.9449584 8.8908317
MCATE bill_depth_mm 13.90000 NA 101.8944742 6.8786009
MCATE bill_depth_mm 13.96182 NA 100.7094090 6.4222216
MCATE bill_depth_mm 14.10000 NA 98.7261231 5.5878417
MCATE bill_depth_mm 14.20000 NA 98.2643515 5.2220581
MCATE bill_depth_mm 14.20000 NA 98.2643515 5.2220581
MCATE bill_depth_mm 14.30000 NA 98.5851283 5.0524548
MCATE bill_depth_mm 14.40000 NA 98.5075465 5.0402494
MCATE bill_depth_mm 14.47273 NA 97.9910093 5.0851201
MCATE bill_depth_mm 14.50000 NA 97.8590587 5.1098947
MCATE bill_depth_mm 14.50000 NA 97.8590587 5.1098947
MCATE bill_depth_mm 14.60000 NA 97.3989844 5.2710786
MCATE bill_depth_mm 14.60000 NA 97.3989844 5.2710786
MCATE bill_depth_mm 14.78182 NA 97.3932627 5.6956425
MCATE bill_depth_mm 14.88364 NA 97.2322531 6.0457691
MCATE bill_depth_mm 15.00000 NA 96.8744104 6.5357173
MCATE bill_depth_mm 15.00000 NA 96.8744104 6.5357173
MCATE bill_depth_mm 15.00000 NA 96.8744104 6.5357173
MCATE bill_depth_mm 15.10000 NA 95.9502885 6.9842116
MCATE bill_depth_mm 15.20000 NA 94.0838663 7.4646466
MCATE bill_depth_mm 15.29455 NA 91.8512634 7.9469113
MCATE bill_depth_mm 15.30000 NA 91.7155893 7.9741461
MCATE bill_depth_mm 15.49818 NA 83.2732152 8.9322189
MCATE bill_depth_mm 15.60000 NA 77.7675920 9.4454746
MCATE bill_depth_mm 15.70000 NA 71.7233349 9.8333575
MCATE bill_depth_mm 15.70364 NA 71.4938061 9.8452205
MCATE bill_depth_mm 15.80000 NA 65.1681931 10.1147923
MCATE bill_depth_mm 15.90000 NA 57.7745146 10.2630621
MCATE bill_depth_mm 15.90909 NA 57.0046706 10.2637416
MCATE bill_depth_mm 16.00000 NA 49.0479209 10.2512130
MCATE bill_depth_mm 16.10000 NA 39.8082575 10.2201013
MCATE bill_depth_mm 16.10000 NA 39.8082575 10.2201013
MCATE bill_depth_mm 16.20000 NA 30.6729072 10.1865164
MCATE bill_depth_mm 16.30000 NA 20.9779160 10.1512840
MCATE bill_depth_mm 16.40000 NA 11.2208825 10.1567964
MCATE bill_depth_mm 16.50000 NA 2.0758450 10.1252254
MCATE bill_depth_mm 16.60000 NA -6.2483116 9.9799859
MCATE bill_depth_mm 16.60000 NA -6.2483116 9.9799859
MCATE bill_depth_mm 16.70000 NA -14.0326562 9.7334410
MCATE bill_depth_mm 16.80000 NA -21.4557542 9.4763748
MCATE bill_depth_mm 16.90000 NA -29.2083413 9.2823956
MCATE bill_depth_mm 17.00000 NA -36.6550664 9.0627659
MCATE bill_depth_mm 17.00000 NA -36.6550664 9.0627659
MCATE bill_depth_mm 17.00000 NA -36.6550664 9.0627659
MCATE bill_depth_mm 17.00000 NA -36.6550664 9.0627659
MCATE bill_depth_mm 17.10000 NA -43.4992777 8.7962840
MCATE bill_depth_mm 17.10000 NA -43.4992777 8.7962840
MCATE bill_depth_mm 17.20000 NA -49.7703286 8.5159315
MCATE bill_depth_mm 17.20000 NA -49.7703286 8.5159315
MCATE bill_depth_mm 17.30000 NA -55.1542093 8.2394844
MCATE bill_depth_mm 17.30000 NA -55.1542093 8.2394844
MCATE bill_depth_mm 17.30000 NA -55.1542093 8.2394844
MCATE bill_depth_mm 17.50000 NA -63.6499321 7.7016376
MCATE bill_depth_mm 17.50000 NA -63.6499321 7.7016376
MCATE bill_depth_mm 17.60000 NA -66.9080906 7.3931727
MCATE bill_depth_mm 17.65818 NA -68.3953470 7.1978093
MCATE bill_depth_mm 17.80000 NA -71.2297098 6.7525862
MCATE bill_depth_mm 17.80000 NA -71.2297098 6.7525862
MCATE bill_depth_mm 17.80000 NA -71.2297098 6.7525862
MCATE bill_depth_mm 17.90000 NA -73.0680394 6.4917060
MCATE bill_depth_mm 17.90000 NA -73.0680394 6.4917060
MCATE bill_depth_mm 17.90000 NA -73.0680394 6.4917060
MCATE bill_depth_mm 17.97091 NA -74.1294986 6.3392145
MCATE bill_depth_mm 18.00000 NA -74.5465801 6.2804741
MCATE bill_depth_mm 18.10000 NA -75.2374764 6.1081316
MCATE bill_depth_mm 18.10000 NA -75.2374764 6.1081316
MCATE bill_depth_mm 18.10000 NA -75.2374764 6.1081316
MCATE bill_depth_mm 18.20000 NA -74.9364048 5.9963511
MCATE bill_depth_mm 18.28182 NA -74.8094101 5.9710127
MCATE bill_depth_mm 18.30000 NA -74.8038134 5.9730130
MCATE bill_depth_mm 18.40000 NA -74.5950645 6.0100068
MCATE bill_depth_mm 18.48727 NA -74.1710918 6.0577994
MCATE bill_depth_mm 18.50000 NA -74.1044372 6.0689742
MCATE bill_depth_mm 18.50000 NA -74.1044372 6.0689742
MCATE bill_depth_mm 18.50000 NA -74.1044372 6.0689742
MCATE bill_depth_mm 18.60000 NA -73.4470790 6.1282152
MCATE bill_depth_mm 18.60000 NA -73.4470790 6.1282152
MCATE bill_depth_mm 18.60000 NA -73.4470790 6.1282152
MCATE bill_depth_mm 18.70000 NA -72.4522092 6.1929807
MCATE bill_depth_mm 18.70000 NA -72.4522092 6.1929807
MCATE bill_depth_mm 18.80000 NA -71.0839765 6.2801105
MCATE bill_depth_mm 18.80000 NA -71.0839765 6.2801105
MCATE bill_depth_mm 18.90000 NA -69.2611528 6.4516576
MCATE bill_depth_mm 18.90000 NA -69.2611528 6.4516576
MCATE bill_depth_mm 18.90000 NA -69.2611528 6.4516576
MCATE bill_depth_mm 19.00000 NA -67.3656976 6.6862296
MCATE bill_depth_mm 19.00000 NA -67.3656976 6.6862296
MCATE bill_depth_mm 19.00000 NA -67.3656976 6.6862296
MCATE bill_depth_mm 19.10000 NA -65.0548168 6.9403152
MCATE bill_depth_mm 19.12000 NA -64.5145960 7.0031799
MCATE bill_depth_mm 19.20000 NA -61.5414510 7.2275968
MCATE bill_depth_mm 19.32364 NA -58.0670547 7.6110918
MCATE bill_depth_mm 19.40000 NA -56.4011066 7.8925412
MCATE bill_depth_mm 19.50000 NA -54.5813374 8.2723813
MCATE bill_depth_mm 19.50000 NA -54.5813374 8.2723813
MCATE bill_depth_mm 19.60000 NA -53.0095797 8.6510007
MCATE bill_depth_mm 19.70000 NA -51.2247562 9.1187680
MCATE bill_depth_mm 19.80000 NA -49.4577651 9.6430671
MCATE bill_depth_mm 19.90000 NA -48.6114966 10.0790566
MCATE bill_depth_mm 20.00000 NA -47.9638961 10.4599444
MCATE bill_depth_mm 20.00000 NA -47.9638961 10.4599444
MCATE flipper_length_mm 181.00000 NA -84.9946355 5.1659616
MCATE flipper_length_mm 182.00000 NA -85.3571987 4.9708868
MCATE flipper_length_mm 183.63636 NA -86.6855062 4.8657833
MCATE flipper_length_mm 184.00000 NA -87.1121465 4.8582223
MCATE flipper_length_mm 184.00000 NA -87.1121465 4.8582223
MCATE flipper_length_mm 185.00000 NA -87.5561876 4.8490199
MCATE flipper_length_mm 185.00000 NA -87.5561876 4.8490199
MCATE flipper_length_mm 185.00000 NA -87.5561876 4.8490199
MCATE flipper_length_mm 186.00000 NA -87.5896421 4.8533031
MCATE flipper_length_mm 186.00000 NA -87.5896421 4.8533031
MCATE flipper_length_mm 187.00000 NA -87.3088562 4.8424872
MCATE flipper_length_mm 187.00000 NA -87.3088562 4.8424872
MCATE flipper_length_mm 187.00000 NA -87.3088562 4.8424872
MCATE flipper_length_mm 187.00000 NA -87.3088562 4.8424872
MCATE flipper_length_mm 187.00000 NA -87.3088562 4.8424872
MCATE flipper_length_mm 188.00000 NA -86.3779400 4.8287744
MCATE flipper_length_mm 188.00000 NA -86.3779400 4.8287744
MCATE flipper_length_mm 189.00000 NA -85.2210578 4.8397490
MCATE flipper_length_mm 189.00000 NA -85.2210578 4.8397490
MCATE flipper_length_mm 189.94545 NA -83.7839687 4.8795255
MCATE flipper_length_mm 190.00000 NA -83.6872517 4.8828360
MCATE flipper_length_mm 190.00000 NA -83.6872517 4.8828360
MCATE flipper_length_mm 190.00000 NA -83.6872517 4.8828360
MCATE flipper_length_mm 190.00000 NA -83.6872517 4.8828360
MCATE flipper_length_mm 190.00000 NA -83.6872517 4.8828360
MCATE flipper_length_mm 190.00000 NA -83.6872517 4.8828360
MCATE flipper_length_mm 191.00000 NA -81.5496504 4.9110017
MCATE flipper_length_mm 191.00000 NA -81.5496504 4.9110017
MCATE flipper_length_mm 191.00000 NA -81.5496504 4.9110017
MCATE flipper_length_mm 191.00000 NA -81.5496504 4.9110017
MCATE flipper_length_mm 191.14545 NA -81.1819124 4.9149708
MCATE flipper_length_mm 192.00000 NA -78.7620952 4.9592044
MCATE flipper_length_mm 192.00000 NA -78.7620952 4.9592044
MCATE flipper_length_mm 193.00000 NA -75.5889519 5.0574950
MCATE flipper_length_mm 193.00000 NA -75.5889519 5.0574950
MCATE flipper_length_mm 193.00000 NA -75.5889519 5.0574950
MCATE flipper_length_mm 193.00000 NA -75.5889519 5.0574950
MCATE flipper_length_mm 193.27273 NA -74.6401159 5.0881638
MCATE flipper_length_mm 194.00000 NA -72.2817319 5.1882399
MCATE flipper_length_mm 195.00000 NA -68.3516057 5.5364042
MCATE flipper_length_mm 195.00000 NA -68.3516057 5.5364042
MCATE flipper_length_mm 195.00000 NA -68.3516057 5.5364042
MCATE flipper_length_mm 195.00000 NA -68.3516057 5.5364042
MCATE flipper_length_mm 195.00000 NA -68.3516057 5.5364042
MCATE flipper_length_mm 195.00000 NA -68.3516057 5.5364042
MCATE flipper_length_mm 196.00000 NA -63.0852968 6.0006517
MCATE flipper_length_mm 196.00000 NA -63.0852968 6.0006517
MCATE flipper_length_mm 196.00000 NA -63.0852968 6.0006517
MCATE flipper_length_mm 197.00000 NA -57.2403004 6.4668317
MCATE flipper_length_mm 197.00000 NA -57.2403004 6.4668317
MCATE flipper_length_mm 197.00000 NA -57.2403004 6.4668317
MCATE flipper_length_mm 197.52727 NA -53.8838406 6.7176556
MCATE flipper_length_mm 198.00000 NA -50.5593677 6.9507256
MCATE flipper_length_mm 198.00000 NA -50.5593677 6.9507256
MCATE flipper_length_mm 199.00000 NA -42.9372452 7.4906069
MCATE flipper_length_mm 199.00000 NA -42.9372452 7.4906069
MCATE flipper_length_mm 200.00000 NA -34.3025992 8.0183269
MCATE flipper_length_mm 200.63636 NA -28.1004811 8.3239127
MCATE flipper_length_mm 201.00000 NA -24.2083752 8.4693529
MCATE flipper_length_mm 201.67273 NA -16.2736609 8.8306797
MCATE flipper_length_mm 202.00000 NA -12.8925036 8.8970822
MCATE flipper_length_mm 203.00000 NA -1.8264975 9.1106423
MCATE flipper_length_mm 204.45455 NA 15.2130631 9.4308429
MCATE flipper_length_mm 205.74545 NA 30.4231450 9.3477516
MCATE flipper_length_mm 207.76364 NA 52.3789952 8.4625422
MCATE flipper_length_mm 208.00000 NA 54.6595290 8.3247952
MCATE flipper_length_mm 208.00000 NA 54.6595290 8.3247952
MCATE flipper_length_mm 209.00000 NA 63.8615860 7.7963293
MCATE flipper_length_mm 209.00000 NA 63.8615860 7.7963293
MCATE flipper_length_mm 210.00000 NA 72.8128141 7.2960595
MCATE flipper_length_mm 210.00000 NA 72.8128141 7.2960595
MCATE flipper_length_mm 210.00000 NA 72.8128141 7.2960595
MCATE flipper_length_mm 210.00000 NA 72.8128141 7.2960595
MCATE flipper_length_mm 210.92727 NA 80.3978438 6.8204528
MCATE flipper_length_mm 212.00000 NA 88.0751382 6.3665168
MCATE flipper_length_mm 212.00000 NA 88.0751382 6.3665168
MCATE flipper_length_mm 212.98182 NA 94.0113883 6.0779102
MCATE flipper_length_mm 213.00000 NA 94.0995499 6.0731756
MCATE flipper_length_mm 214.00000 NA 98.6134911 5.7907552
MCATE flipper_length_mm 214.00000 NA 98.6134911 5.7907552
MCATE flipper_length_mm 215.00000 NA 102.2300003 5.5819154
MCATE flipper_length_mm 215.00000 NA 102.2300003 5.5819154
MCATE flipper_length_mm 215.00000 NA 102.2300003 5.5819154
MCATE flipper_length_mm 215.00000 NA 102.2300003 5.5819154
MCATE flipper_length_mm 216.00000 NA 103.9688202 5.4491784
MCATE flipper_length_mm 216.00000 NA 103.9688202 5.4491784
MCATE flipper_length_mm 217.00000 NA 105.1704096 5.5613375
MCATE flipper_length_mm 217.18182 NA 105.4134539 5.6047295
MCATE flipper_length_mm 218.00000 NA 106.2723356 5.8331711
MCATE flipper_length_mm 219.00000 NA 106.8092732 6.0624841
MCATE flipper_length_mm 219.00000 NA 106.8092732 6.0624841
MCATE flipper_length_mm 220.00000 NA 106.8759950 6.3217678
MCATE flipper_length_mm 220.00000 NA 106.8759950 6.3217678
MCATE flipper_length_mm 220.29091 NA 106.9145253 6.3756698
MCATE flipper_length_mm 221.00000 NA 106.9439369 6.5051028
MCATE flipper_length_mm 222.00000 NA 106.8105011 6.6978634
MCATE flipper_length_mm 222.00000 NA 106.8105011 6.6978634
MCATE flipper_length_mm 223.00000 NA 106.1799319 6.9200665
MCATE flipper_length_mm 224.00000 NA 104.4884651 7.0570991
MCATE flipper_length_mm 225.00000 NA 101.7541570 7.2303154
VIMP species NA NA 0.0691612 0.0142278
VIMP island NA NA 0.0000000 0.0016084
VIMP sex NA NA 0.0000000 0.0004364
VIMP year NA NA 0.0000000 0.0008215
VIMP bill_length_mm NA NA 0.0001690 0.0021862
VIMP bill_depth_mm NA NA 0.0000000 0.0012392
VIMP flipper_length_mm NA NA 0.0000000 0.0018511
MSE food_consumed_g NA Control Response 28.2706200 2.4594294
MSE food_consumed_g NA Treatment Response 789.3154926 92.0759064
SL risk SL.glm_All NA Control Response 29.7902204 0.7916014
SL risk custom_SL.glmnet_0_All NA Control Response 27.8852636 1.1472004
SL risk custom_SL.glmnet_1_All NA Control Response 28.5618137 1.5827518
SL risk SL.glm_All NA Treatment Response 883.5822027 48.8110298
SL risk custom_SL.glmnet_0_All NA Treatment Response 906.1724175 47.3910093
SL risk custom_SL.glmnet_1_All NA Treatment Response 804.0924028 46.6406733
SL coefficient SL.glm_All NA Control Response 0.2130465 0.1618604
SL coefficient custom_SL.glmnet_0_All NA Control Response 0.2079046 0.2079046
SL coefficient custom_SL.glmnet_1_All NA Control Response 0.5790489 0.2514606
SL coefficient SL.glm_All NA Treatment Response 0.0080235 0.0080235
SL coefficient custom_SL.glmnet_0_All NA Treatment Response 0.0230536 0.0230536
SL coefficient custom_SL.glmnet_1_All NA Treatment Response 0.9689229 0.0217377
SATE NA NA NA -8.5934656 5.3139105

The estimand column denotes the class of Quantity of Interest being estimated and takes values such as “MCATE” and “VIMP”. The term column denotes the covariate to which a result refers (if relevant); for example, each MCATE is computed for a particular covariate, which is named in this column. The value and level columns give a more granular division within a particular term: the former holds a numeric value, the latter a categorical one. For example, in the case of the MCATE, if the covariate is discrete, the point at which the MCATE was calculated appears in the level column, while if it is continuous, it appears in the value column. Each of these columns is therefore type-stable. The final two columns, estimate and std_error, are self-explanatory; the details of their calculation depend on the particular Quantity of Interest requested.
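To make the column layout concrete, the following sketch rebuilds four rows copied from the output above into a tibble and filters them in the same way as the plotting code. The column names follow the description in the preceding paragraph; the object names `results_demo` and `continuous_mcate` are ours and are not created by tidyhte.

```r
library(dplyr)
library(tibble)

# Four rows copied from the printed results above. `results_demo` is an
# illustrative name, not an object produced by tidyhte.
results_demo <- tribble(
  ~estimand, ~term,               ~value,   ~level,             ~estimate,  ~std_error,
  "MCATE",   "flipper_length_mm", 208,      NA_character_,      54.6595290, 8.3247952,
  "VIMP",    "species",           NA_real_, NA_character_,      0.0691612,  0.0142278,
  "MSE",     "food_consumed_g",   NA_real_, "Control Response", 28.2706200, 2.4594294,
  "SATE",    NA_character_,       NA_real_, NA_character_,      -8.5934656, 5.3139105
)

# `value` is always numeric and `level` is always character, so rows from
# different estimands can share one table without type coercion.
stopifnot(is.numeric(results_demo$value), is.character(results_demo$level))

# MCATE rows for a continuous covariate carry `value` and leave `level` missing.
continuous_mcate <- filter(results_demo, estimand == "MCATE", is.na(level))
print(continuous_mcate)
```

Swapping `is.na(level)` for `is.na(value)` selects the MCATE rows for discrete covariates instead.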

Plotting MCATEs

It’s then a simple matter to plot results using code like the following:

filter(results, estimand == "MCATE", is.na(value)) %>%
  ggplot(aes(level, estimate)) +
  geom_point() +
  geom_linerange(
    aes(ymin = estimate - 1.96 * std_error, ymax = estimate + 1.96 * std_error)
  ) +
  geom_hline(yintercept = 0, linetype = "dashed") +
  coord_flip() +
  facet_wrap(~term, scales = "free_y")

filter(results, estimand == "MCATE", is.na(level)) %>%
  ggplot(aes(value, estimate)) +
  geom_line() +
  geom_ribbon(
    aes(ymin = estimate - 1.96 * std_error, ymax = estimate + 1.96 * std_error),
    alpha = 0.5
  ) +
  geom_hline(yintercept = 0, linetype = "dashed") +
  scale_x_continuous("Covariate value") +
  scale_y_continuous("CATE") +
  coord_flip() +
  facet_wrap(~term, scales = "free_y")
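The plotting code repeatedly computes the same normal-approximation interval, estimate ± 1.96 × std_error. As a minimal sketch, a small helper can add the bounds as columns once. The names `add_wald_ci`, `conf_low` and `conf_high` are introduced here for illustration and are not provided by tidyhte; the two data rows are copied from the results table above.

```r
library(dplyr)
library(tibble)

# Hypothetical helper: append normal-approximation (Wald) interval bounds.
add_wald_ci <- function(results, conf_level = 0.95) {
  z <- qnorm(1 - (1 - conf_level) / 2)  # about 1.96 when conf_level = 0.95
  mutate(
    results,
    conf_low = estimate - z * std_error,
    conf_high = estimate + z * std_error
  )
}

# Two MCATE rows copied from the results table above.
mcate_demo <- tibble(
  estimand = "MCATE",
  term = "flipper_length_mm",
  value = c(208, 225),
  level = NA_character_,
  estimate = c(54.6595290, 101.7541570),
  std_error = c(8.3247952, 7.2303154)
)

mcate_ci <- add_wald_ci(mcate_demo)
print(mcate_ci)
```

With these columns in place, the interval layers reduce to `aes(ymin = conf_low, ymax = conf_high)`.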

Plotting diagnostics

Similarly, it’s easy to plot diagnostic information:

filter(results, estimand == "SL risk") %>%
  ggplot(aes(reorder(term, estimate), estimate)) +
  geom_point() +
  geom_linerange(
    aes(ymin = estimate - 1.96 * std_error, ymax = estimate + 1.96 * std_error),
    alpha = 0.5
  ) +
  geom_hline(yintercept = 0, linetype = "dashed") +
  scale_x_discrete("") +
  scale_y_continuous("Risk") +
  facet_wrap(~level) +
  coord_flip()

filter(results, estimand == "SL coefficient") %>%
  ggplot(aes(reorder(term, estimate), estimate)) +
  geom_point() +
  geom_linerange(
    aes(ymin = estimate - 1.96 * std_error, ymax = estimate + 1.96 * std_error),
    alpha = 0.5
  ) +
  geom_hline(yintercept = 0, linetype = "dashed") +
  scale_x_discrete("") +
  scale_y_continuous("Coefficient") +
  facet_wrap(~level) +
  coord_flip()
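The same diagnostics can also be summarised numerically. The sketch below copies the “SL risk” and “SL coefficient” estimates from the results table above, picks the learner with the lowest estimated risk for each nuisance model, and checks that the ensemble weights for each model sum to one. Object names such as `sl_demo` are ours, not tidyhte's.

```r
library(dplyr)
library(tibble)

learners <- c("SL.glm_All", "custom_SL.glmnet_0_All", "custom_SL.glmnet_1_All")

# Estimates copied from the results table above (standard errors omitted).
sl_demo <- tibble(
  estimand = rep(c("SL risk", "SL coefficient"), each = 6),
  term = rep(learners, times = 4),
  level = rep(rep(c("Control Response", "Treatment Response"), each = 3), times = 2),
  estimate = c(
    29.7902204, 27.8852636, 28.5618137,     # risk, control
    883.5822027, 906.1724175, 804.0924028,  # risk, treatment
    0.2130465, 0.2079046, 0.5790489,        # weight, control
    0.0080235, 0.0230536, 0.9689229         # weight, treatment
  )
)

# Learner with the lowest estimated risk within each nuisance model.
best_learner <- sl_demo %>%
  filter(estimand == "SL risk") %>%
  group_by(level) %>%
  slice_min(estimate, n = 1) %>%
  ungroup()

# Total ensemble weight per nuisance model.
weight_totals <- sl_demo %>%
  filter(estimand == "SL coefficient") %>%
  group_by(level) %>%
  summarise(total_weight = sum(estimate), .groups = "drop")

print(best_learner)
print(weight_totals)
```

In these results the lowest-risk learner does not necessarily receive the most weight: for the control response, custom_SL.glmnet_0_All has the lowest risk, yet custom_SL.glmnet_1_All receives the largest coefficient.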

Plotting VIMP

And finally, we can examine the variable importance measure of Williamson et al. (2021) when applied to a joint model of the pseudo-outcome:

filter(results, estimand == "VIMP") %>%
  ggplot(aes(reorder(term, estimate), estimate)) +
  geom_point() +
  geom_linerange(
    aes(ymin = estimate - 1.96 * std_error, ymax = estimate + 1.96 * std_error),
    alpha = 0.5
  ) +
  geom_hline(yintercept = 0, linetype = "dashed") +
  scale_x_discrete("") +
  scale_y_continuous("Reduction in R²") +
  coord_flip()
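As a rough numerical companion to this plot, one can compute the ratio of each importance estimate to its standard error. The sketch below uses the VIMP rows copied from the results table above. The column name `z` and the 1.96 threshold are illustrative choices on our part, not a procedure prescribed by tidyhte or by Williamson et al. (2021).

```r
library(dplyr)
library(tibble)

# VIMP rows copied from the results table above.
vimp_demo <- tibble(
  term = c("species", "island", "sex", "year",
           "bill_length_mm", "bill_depth_mm", "flipper_length_mm"),
  estimate = c(0.0691612, 0, 0, 0, 0.0001690, 0, 0),
  std_error = c(0.0142278, 0.0016084, 0.0004364, 0.0008215,
                0.0021862, 0.0012392, 0.0018511)
)

# Ratio of estimate to standard error, sorted from largest to smallest.
vimp_screen <- vimp_demo %>%
  mutate(z = estimate / std_error) %>%
  arrange(desc(z))

# Assumed screening rule: keep terms whose ratio exceeds 1.96.
clearly_nonzero <- filter(vimp_screen, z > 1.96)
print(vimp_screen)
```

Under this rule only species stands out, which matches the table: every other covariate has an estimate at or near zero.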

Replicating analyses

One of the biggest benefits of tidyhte is how easily similar analyses can be repeated.

For example, performing the same analysis on a different outcome requires only the following code:

# Use %>% rather than %<>% so that `penguins` itself is left unmodified;
# only the final table of estimates is stored, in `results_mass`.
penguins %>%
  produce_plugin_estimates(
    # outcome
    body_mass_g,
    # treatment
    treatment,
    # covariates
    species, island, sex, year, bill_length_mm, bill_depth_mm, flipper_length_mm
  ) %>%
  construct_pseudo_outcomes(body_mass_g, treatment) %>%
  estimate_QoI(
    species, island, sex, year, bill_length_mm, bill_depth_mm, flipper_length_mm
  ) -> results_mass

All the same quantities can be easily plotted for this outcome as well, and results may be joined together conveniently.

results_all <- bind_rows(
  results %>% mutate(outcome = "food_consumed_g"),
  results_mass %>% mutate(outcome = "body_mass_g")
)

By allowing the user to flexibly compose HTE estimators, tidyhte drastically reduces the amount of work needed for a typical HTE analysis, which by nature tends to involve multiple moderators, models and outcomes.

Conclusion

This paper has introduced the concepts underlying the tidyhte package and given examples of how the package can be used. The package is written in a sufficiently general way that some features can be added with relatively little work. For instance, adding a new plugin model requires only some configuration information and standardized train / predict methods. The same is true of new ways to summarize CATEs for plotting, which are handled in much the same way. This makes it relatively easy for methodologists to plug in their preferred fitting methods and thereby extend tidyhte’s functionality.

Acknowledgements

We gratefully acknowledge our collaborators on the US 2020 Facebook and Instagram Election Project, which served as a testbed for the initial versions of tidyhte, particularly Pablo Barberá at Meta. We received no financial support for this software project.

References

Abadie, Alberto, Susan Athey, Guido W Imbens, and Jeffrey M Wooldridge. 2023. “When Should You Adjust Standard Errors for Clustering?” The Quarterly Journal of Economics 138 (1): 1–35.
Calonico, Sebastian, Matias D Cattaneo, and Max H Farrell. 2019. “nprobust: Nonparametric Kernel-Based Estimation and Robust Bias-Corrected Inference.” arXiv Preprint arXiv:1906.00198.
Chernozhukov, Victor, Denis Chetverikov, Mert Demirer, Esther Duflo, Christian Hansen, Whitney Newey, and James Robins. 2018. “Double/Debiased Machine Learning for Treatment and Structural Parameters.” The Econometrics Journal 21 (1): C1–68.
Fan, Jianqing, and Irene Gijbels. 2018. Local Polynomial Modelling and Its Applications: Monographs on Statistics and Applied Probability 66. Routledge.
Higgins, Michael J, Fredrik Sävje, and Jasjeet S Sekhon. 2016. “Improving Massive Experiments with Threshold Blocking.” Proceedings of the National Academy of Sciences 113 (27): 7369–76.
Horst, Allison M, Alison Presmanes Hill, and Kristen B Gorman. 2020. “allisonhorst/palmerpenguins: v0.1.0.” Zenodo. https://doi.org/10.5281/zenodo.3960218.
Kennedy, Edward H. 2020. “Optimal Doubly Robust Estimation of Heterogeneous Causal Effects.” arXiv Preprint arXiv:2004.14497.
Robins, James M, Andrea Rotnitzky, and Lue Ping Zhao. 1995. “Analysis of Semiparametric Regression Models for Repeated Outcomes in the Presence of Missing Data.” Journal of the American Statistical Association 90 (429): 106–21.
Tsiatis, Anastasios A. 2006. “Semiparametric Theory and Missing Data.”
Tsiatis, Anastasios A, Marie Davidian, Min Zhang, and Xiaomin Lu. 2008. “Covariate Adjustment for Two-Sample Treatment Comparisons in Randomized Clinical Trials: A Principled yet Flexible Approach.” Statistics in Medicine 27 (23): 4658–77.
Van der Laan, Mark J, and James M Robins. 2003. Unified Methods for Censored Longitudinal Data and Causality. Springer Science & Business Media.
Van der Laan, Mark J, Eric C Polley, and Alan E Hubbard. 2007. “Super Learner.” Statistical Applications in Genetics and Molecular Biology 6 (1).
Williamson, Brian D, Peter B Gilbert, Marco Carone, and Noah Simon. 2021. “Nonparametric Variable Importance Assessment Using Machine Learning Techniques.” Biometrics 77 (1): 9–22.
Zou, Hui, and Trevor Hastie. 2005. “Regularization and Variable Selection via the Elastic Net.” Journal of the Royal Statistical Society Series B: Statistical Methodology 67 (2): 301–20.