To cite GenericML
in publications, please use:
Welz M., Alfons, A., Demirer, M. and Chernozhukov, V. (2022).
GenericML
: Generic Machine Learning Inference.R
package version 0.2.3. URL: https://CRAN.R-project.org/package=GenericML.
R
implementation of Generic Machine Learning Inference on heterogeneous treatment effects in randomized experiments as proposed in Chernozhukov, Demirer, Duflo and Fernández-Val (2020). This package's workhorse is the mlr3
framework of Lang et al. (2019), which enables the specification of a wide variety of machine learners. The main functionality, GenericML()
, runs Algorithm 1 in Chernozhukov, Demirer, Duflo and Fernández-Val (2020) for a suite of user-specified machine learners. All steps in the algorithm are customizable via setup functions. Methods for printing and plotting are available for objects returned by GenericML()
. Parallel computing is supported.
The package GenericML
is on the CRAN (The Comprehensive R Archive Network), hence the latest release can be easily installed from the R
command line via
install.packages("GenericML")
To install the latest (possibly unstable) development version from GitHub, you can pull this repository and install it from the R
command line via
install.packages("devtools")
devtools::install_github("mwelz/GenericML")
If you already have the package devtools
installed, you can skip the first line.
If you experience any bugs or issues or if you have any suggestions for additional features, please submit an issue via the Issues tab of this repository. Please have a look at existing issues first to see if your problem or feature request has already been discussed.
If you want to contribute to the package, you can fork this repository and create a pull request after implementing the desired functionality.
If you need help using the package, or if you are interested in collaborations related to this project, please get in touch with the package maintainer.
We generate n = 1000
samples from a randomized experiment which exhibit substantial treatment effect heterogeneity. Hence, Generic ML should indicate the existence of treatment effect heterogeneity and also capture the heterogeneity patterns.
library(GenericML)
### 1. Data Generation ----
set.seed(31684591)
n <- 1000 # number of observations
p <- 3 # number of covariates
D <- rbinom(n, 1, 0.5) # random treatment assignment
Z <- matrix(runif(n*p), n, p) # design matrix
colnames(Z) <- paste0("z", 1:p) # column names
Y0 <- as.numeric(Z %*% rexp(p)) # potential outcome without treatment
### simulate heterogeneous treatment effect
# treatment effect increases with Z1, has a level shift along Z2 (at 0.5),
# and has no pattern along Z3
HTE <- 2 * Z[,1] + ifelse(Z[,2] >= 0.5, 1, -1)
ATE <- mean(HTE) # average treatment effect (about 0.979 here)
Y1 <- HTE + Y0 # potential outcome under treatment
Y <- ifelse(D == 1, Y1, Y0) # observed outcome
### 2. Prepare the arguments for GenericML() ----
## quantile cutoffs for the GATES grouping of the estimated CATEs
quantile_cutoffs <- c(0.25, 0.5, 0.75) # 25%, 50%, and 75% quantiles
## specify the learner of the propensity score (non-penalized logistic regression here).
# Propensity scores can also directly be supplied.
learner_propensity_score <- "mlr3::lrn('glmnet', lambda = 0, alpha = 1)"
# specify the considered learners of the BCA and the CATE (here: lasso, random forest, and SVM)
learners_GenericML <- c("lasso",
"mlr3::lrn('ranger', num.trees = 100)",
"mlr3::lrn('svm')")
# specify the data that shall be used for the CLAN
# here, we use all variables of Z and uniformly distributed random noise
Z_CLAN <- cbind(Z, random = runif(n))
# specify the number of splits (many to rule out seed-dependence of results)
num_splits <- 1000
# specify if a HT transformation shall be used when estimating BLP and GATES
HT <- FALSE
## A list controlling the variables that shall be used in the
# matrix X1 for the BLP and GATES regressions.
X1_BLP <- setup_X1()
X1_GATES <- setup_X1()
# consider differences between group K (most affected) with group 1
diff_GATES <- setup_diff(subtract_from = "most",
subtracted = 1)
diff_CLAN <- setup_diff(subtract_from = "most",
subtracted = 1)
# specify the significance level
significance_level <- 0.05
## specify minimum variation of predictions before
# Gaussian noise with variance var(Y)/20 is added.
min_variation <- 1e-05
## specify which estimator of the error covariance matrix shall be used
# in BLP and GATES (standard OLS covariance matrix estimator here)
vcov_BLP <- setup_vcov()
vcov_GATES <- setup_vcov()
# ensure that GATES parameters are monotonically increasing (required by theory)
monotonize <- TRUE
# the package allows for stratified sampling (don't use it here by keeping the arguments empty)
stratify <- setup_stratify()
# specify that there are no external weights to be used in the analysis
external_weights <- NULL
# specify the proportion of samples that shall be selected in the auxiliary set
prop_aux <- 0.5
# specify sampling strategy (possibly stratified). Here ordinary random sampling is used.
stratify <- setup_stratify()
# specify whether or not the splits and auxiliary results of the learners shall be stored
store_splits <- TRUE
store_learners <- FALSE # to save memory
# parallelization options
parallel <- TRUE
num_cores <- 8 # 8 cores
seed <- 123456
## NB: Note that the number of cores as well as your type of operating system
# (Unix vs. Windows) influences the random number stream.
# Thus, different choices of `num_cores` may lead to different results.
# Results of parallel processes are reproducible across all Unix systems,
# but might deviate on Windows systems.
### 3. Run the GenericML() function with these arguments ----
## runtime: ~40 seconds with R version 4.4.1 on a Dell XPS 13 9340
# (CPU: Intel Core Ultra 7 165H × 22, RAM: 32GB), running on Ubuntu 24.04 LTS.
# Returns a GenericML object.
genML <- GenericML(Z = Z, D = D, Y = Y,
learner_propensity_score = learner_propensity_score,
learners_GenericML = learners_GenericML,
num_splits = num_splits,
Z_CLAN = Z_CLAN,
HT = HT,
X1_BLP = X1_BLP,
X1_GATES = X1_GATES,
vcov_BLP = vcov_BLP,
vcov_GATES = vcov_GATES,
quantile_cutoffs = quantile_cutoffs,
diff_GATES = diff_GATES,
diff_CLAN = diff_CLAN,
prop_aux = prop_aux,
stratify = stratify,
significance_level = significance_level,
min_variation = min_variation,
parallel = parallel,
num_cores = num_cores,
seed = seed,
store_splits = store_splits,
store_learners = store_learners)
### 4. General results ----
## print
genML
# GenericML object with the following specifications:
# * Propensity score learner: mlr3::lrn('glmnet', lambda = 0, alpha = 1)
# * Generic ML learners: lasso, mlr3::lrn('ranger', num.trees = 100), mlr3::lrn('svm')
# * S = 1000 splits are used
# * No HT transformation is used
#
# The 90 % confidence intervals of the best BLP estimates are given by
# beta.1: ( 0.894 , 1.056 ) beta.2: ( 0.944 , 1.095 )
# The best learner for the BLP is mlr3::lrn('svm') (lambda of 1.172)
# The best learner for the GATES and CLAN is mlr3::lrn('ranger', num.trees = 100) (lambda.bar of 2.0818)
## get the medians of the estimated \Lambda and \bar{\Lambda} to find best learners
get_best(genML)
# lambda lambda.bar
# lasso 1.119 2.005
# mlr3::lrn('ranger', num.trees = 100) 1.162 2.082
# mlr3::lrn('svm') 1.172 2.063
# ---
# The best learner for BLP is mlr3::lrn('svm') with lambda = 1.172.
# The best learner for GATES and CLAN is mlr3::lrn('ranger', num.trees = 100) with lambda.bar = 2.0818.
We emphasize that the number of cores and your type of operating system affect the random number stream. Thus, different choices of num_cores
may lead to different results. Moreover, results of parallel processes are reproducible across all Unix systems, but might deviate on Windows systems. Consequently, the results below are only reproducible on Unix systems for num_cores = 8
.
We use the get_BLP()
acceessor function to extract the results of the BLP analysis. We can see from the print and plot that the true ATE of about 0.979 is contained in the 90% confidence interval of beta.1
. Moreover, we reject the null of no significance of beta.2
at any reasonable level, which is expected since there is substantial treatment effect heterogeneity.
results_BLP <- get_BLP(genML)
results_BLP # print method
# BLP generic targets
# ---
# Estimate CI lower CI upper p value
# beta.1 0.9750 0.8941 1.056 <2e-16 ***
# beta.2 1.0198 0.9442 1.095 <2e-16 ***
# ---
# Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
# ---
# Confidence level of confidence interval [CI lower, CI upper]: 90 %
plot(results_BLP) # plot method
There is treatment effect heterogeneity in the data generating process, so we expect a trend in the GATES per-group estimates as well as significance of group differences. This is indeed the case.
results_GATES <- get_GATES(genML)
results_GATES # print method
# GATES generic targets
# ---
# Estimate CI lower CI upper p value
# gamma.1 -0.4518 -0.6519 -0.256 8.16e-06 ***
# gamma.2 0.4986 0.2974 0.697 1.07e-06 ***
# gamma.3 1.4294 1.2280 1.631 < 2e-16 ***
# gamma.4 2.4103 2.2066 2.612 < 2e-16 ***
# gamma.4-gamma.1 2.8587 2.5754 3.142 < 2e-16 ***
# ---
# Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
# ---
# Confidence level of confidence interval [CI lower, CI upper]: 90 %
plot(results_GATES) # plot method
We first inspect the true treatment effect heterogeneity along the first variable:
plot(y = Z_CLAN[, "z1"], x = HTE, xlab = "True HTE", ylab = "Value of z1")
The heterogeneity exhibits a jump pattern along the first variable. Grouping the variable by treatment effect strength, we thus expect this pattern to be reflected in the variable grouping. Denoting by Gk
the variable’s within-group averages in group k
, we specifically expect that G1 < G2
and G3 < G4
, whereas the pair (G2,G4)
should have similar values and so should the pair (G1, G3)
, albeit at a higher level. Let’s see what CLAN suggests:
results_CLAN_z1 <- get_CLAN(genML, variable = "z1")
plot(results_CLAN_z1)
CLAN indeed captured the correct pattern and that G1 < G2
, G3 < G4
, while the individual pairs (G2,G4)
and (G1,G3)
exhibit similar values, with the latter pair at a higher level than the former pair.
We inspect the true treatment effect heterogeneity along the second variable:
plot(y = Z_CLAN[, "z2"], x = HTE, xlab = "True HTE", ylab = "Value of z2")
We clearly see the level shift at (1, 0.5). Thus, we expect that the two most affected groups should have a much stronger value of z2
than the two least affected groups. Moreover, the two groups G1
and G2
should have the same value of z2
and the two groups G3
and G4
should also have the same value. CLAN indeed captures this pattern:
results_CLAN_z2 <- get_CLAN(genML, variable = "z2")
plot(results_CLAN_z2)
We inspect the true treatment effect heterogeneity along the third variable:
plot(y = Z_CLAN[, "z3"], x = HTE, xlab = "True HTE", ylab = "Value of z3")
There is no heterogeneity pattern along z3
, so all CLAN groups should have roughly the same value. This is indeed the case:
results_CLAN_z3 <- get_CLAN(genML, variable = "z3")
plot(results_CLAN_z3)
We inspect the true treatment effect heterogeneity along the fourth variable which is just random noise:
plot(y = Z_CLAN[, "random"], x = HTE, xlab = "True HTE", ylab = "Value of 'random'")
There is no heterogeneity along random
, so all CLAN groups should have roughly the same value. This is indeed the case:
results_CLAN_random <- get_CLAN(genML, variable = "random")
plot(results_CLAN_random)
Max Welz ([email protected]), Andreas Alfons ([email protected]), Mert Demirer ([email protected]), and Victor Chernozhukov ([email protected]).