R/forest_summary.R
test_calibration.Rd
Test calibration of the forest. Computes the best linear fit of the target
estimand using the forest prediction (on held-out data) as well as the mean
forest prediction as the sole two regressors. A coefficient of 1 for
`mean.forest.prediction` suggests that the mean forest prediction is correct,
whereas a coefficient of 1 for `differential.forest.prediction` additionally suggests
that the heterogeneity estimates from the forest are well calibrated.
The p-value of the `differential.forest.prediction` coefficient
also acts as an omnibus test for the presence of heterogeneity: if the coefficient
is significantly greater than 0, then we can reject the null of
no heterogeneity. For another class of omnibus tests, see `rank_average_treatment_effect`.
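To make the construction concrete, the following is a minimal sketch of the regression the test fits for a causal forest, with HC3 standard errors computed via the sandwich and lmtest packages. It illustrates the idea rather than reproducing the package's exact implementation, which additionally handles sample weights and clusters and reports one-sided p-values.

# Illustrative reconstruction of the calibration regression; not grf's exact code.
library(grf)
library(sandwich)
library(lmtest)

n <- 800; p <- 5
X <- matrix(rnorm(n * p), n, p)
W <- rbinom(n, 1, 0.25 + 0.5 * (X[, 1] > 0))
Y <- pmax(X[, 1], 0) * W + X[, 2] + pmin(X[, 3], 0) + rnorm(n)
forest <- causal_forest(X, Y, W)

tau.hat <- predict(forest)$predictions                      # out-of-bag CATE estimates
mean.pred <- mean(tau.hat)
DF <- data.frame(
  target = Y - forest$Y.hat,                                # residualized outcome
  mean.forest.prediction = (W - forest$W.hat) * mean.pred,  # regressor for the average effect
  differential.forest.prediction =
    (W - forest$W.hat) * (tau.hat - mean.pred)              # regressor for heterogeneity
)
fit <- lm(target ~ mean.forest.prediction + differential.forest.prediction + 0, data = DF)
coeftest(fit, vcov = vcovHC(fit, type = "HC3"))             # two-sided p-values, unlike test_calibration()

Coefficients close to 1 on both regressors, as in the example at the bottom of this page, indicate that both the average prediction and the heterogeneity estimates are well calibrated.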
test_calibration(forest, vcov.type = "HC3")
Argument | Description
---|---
forest | The trained forest.
vcov.type | Optional covariance type for standard errors. The possible options are HC0, ..., HC3. The default is "HC3", which is recommended in small samples and corresponds to the "shortcut formula" for the jackknife (see MacKinnon & White for more discussion, and Cameron & Miller for a review). For large data sets with clusters, "HC0" or "HC1" are significantly faster to compute.
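For example, on a large clustered data set one might prefer the lighter "HC0" type over the default jackknife-style "HC3". A minimal sketch, where the cluster assignment `cluster.id` is a made-up illustration rather than anything supplied by the package:

# Hypothetical clustered design where the faster "HC0" covariance type is used.
library(grf)
n <- 2000; p <- 5
X <- matrix(rnorm(n * p), n, p)
cluster.id <- sample(1:100, n, replace = TRUE)   # made-up cluster assignment
W <- rbinom(n, 1, 0.5)
Y <- X[, 1] * W + rnorm(n)
clustered.forest <- causal_forest(X, Y, W, clusters = cluster.id)
test_calibration(clustered.forest, vcov.type = "HC0")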
A heteroskedasticity-consistent test of calibration.
Cameron, A. Colin, and Douglas L. Miller. "A practitioner's guide to cluster-robust inference." Journal of Human Resources 50, no. 2 (2015): 317-372.
Chernozhukov, Victor, Mert Demirer, Esther Duflo, and Ivan Fernandez-Val. "Generic Machine Learning Inference on Heterogenous Treatment Effects in Randomized Experiments." arXiv preprint arXiv:1712.04802 (2017).
MacKinnon, James G., and Halbert White. "Some heteroskedasticity-consistent covariance matrix estimators with improved finite sample properties." Journal of Econometrics 29, no. 3 (1985): 305-325.
# \donttest{
n <- 800
p <- 5
X <- matrix(rnorm(n * p), n, p)
W <- rbinom(n, 1, 0.25 + 0.5 * (X[, 1] > 0))
Y <- pmax(X[, 1], 0) * W + X[, 2] + pmin(X[, 3], 0) + rnorm(n)
forest <- causal_forest(X, Y, W)
test_calibration(forest)
#>
#> Best linear fit using forest predictions (on held-out data)
#> as well as the mean forest prediction as regressors, along
#> with one-sided heteroskedasticity-robust (HC3) SEs:
#>
#>                                Estimate Std. Error t value    Pr(>t)
#> mean.forest.prediction          0.98229    0.22400  4.3852 6.571e-06 ***
#> differential.forest.prediction  1.18599    0.21121  5.6151 1.356e-08 ***
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
# }