Estimating conditional means • grf

library(grf)

Conditional means and average treatment estimates

grf computes average treatments effects based on a double robust correction (giving rise to an augmented inverse-propensity weighted average treatment effect). For a causal forest with binary treatment the exact expression is (equation (8) of Athey and Wager, 2019):

\(\hat \Gamma_i = \hat \tau^{(-i)}(X_i) + \frac{W_i - \hat e^{(-i)}(X_i)}{\hat e^{(-i)}(X_i)[1 - \hat e^{(-i)}(X_i)]} [Y_i - \hat \mu^{(-i)}(X_i, W_i)]\)

where:

\(\hat \tau^{(-i)}(X_i)\) is the treatment effect estimate (returned by grf’s predict method). Using the potential outcomes notation this is by definition \(E[Y_i(1) - Y_i(0) | X_i] = \mu(X_i, 0) - \mu(X_i, 1)\).
\(W_i\) is the binary treatment indicator for subject \(i\).
\(\hat e^{(-i)}(X_i)\) is the propensity score for subject \(i\) (this is by default estimated by a regression forest on the treatment assignment).
\(Y_i\) is the realized outcome for subject \(i\).
\(\hat \mu^{(-i)}(X_i, W_i) = \hat m^{(-i)}(X_i) + [W_i - \hat e^{(-i)}(X_i)] \hat \tau^{(-i)}(X_i)\) is an estimate of the realized conditional mean for subject \(i\), where \(m(X_i) = E[Y | X_i]\) (this is by default estimated using a regression forest, marginalizing over treatment).

The superscript \(-i\) indicates cross-fitting, i.e. that the estimate is computed by leaving observation \(i\) out. This holds by construction for out-of-bag forest estimates.

To see how the last expression arises, note that we have:

\(m(X_i)\)

\(= E[Y | X_i]\)

\(= E[Y_i(0) | X_i] + E[W_i [Y_i(1) - Y_i(0)] | X_i]\)

\(= \mu(X_i, 0) + e(X_i) \tau(X_i)\)

where there last line is due to unconfoundedness (conditioning on the set of covariates \(X_i\), the potential outcomes are independent of the treatment).

We then obtain the following expressions for the counterfactual response surfaces:

\(\mu(X_i, 0) = m(X_i) - e(X_i) \tau(X_i)\)
\(\mu(X_i, 1) = \tau(X_i) + \mu(X_i, 0) = m(X_i) + [1 - e(X_i)] \tau(X_i)\)

These objects are all computed by the built-in ATE functions. The following code snippet illustrates how to manually obtain estimates of the conditional means \(E[Y | X_i, W_i]\):

n <- 250
p <- 5
X <- matrix(rnorm(n * p), n, p)
W <- rbinom(n, 1, 0.5)
Y <- pmax(X[, 1], 0) * W + X[, 2] + pmin(X[, 3], 0) + rnorm(n)

# These are estimates of m(X) = E[Y | X]
forest.Y <- regression_forest(X, Y)
Y.hat <- predict(forest.Y)$predictions

# These are estimates of the propensity score E[W | X]
forest.W <- regression_forest(X, W)
W.hat <- predict(forest.W)$predictions

c.forest <- causal_forest(X, Y, W, Y.hat, W.hat)
tau.hat <- predict(c.forest)$predictions

# E[Y | X, W = 0]
mu.hat.0 <- Y.hat - W.hat * tau.hat
# E[Y | X, W = 1]
mu.hat.1 <- Y.hat + (1 - W.hat) * tau.hat

References

Athey, Susan and Stefan Wager. Estimating Treatment Effects with Causal Forests: An Application. Observational Studies, 5, 2019. (arxiv)