R/forest_summary.R
test_calibration.Rd
Test calibration of the forest. Computes the best linear fit of the target
estimand using the forest prediction (on held-out data) as well as the mean
forest prediction as the sole two regressors. A coefficient of 1 for
`mean.forest.prediction` suggests that the mean forest prediction is correct,
whereas a coefficient of 1 for `differential.forest.prediction` additionally suggests
that the heterogeneity estimates from the forest are well calibrated.
The p-value of the `differential.forest.prediction` coefficient
also acts as an omnibus test for the presence of heterogeneity: if the coefficient
is significantly greater than 0, then we can reject the null of
no heterogeneity. For another class of omnibus tests, see `rank_average_treatment_effect`.
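To make the construction concrete, the following is a minimal sketch of the regression the test fits for a causal forest, with HC3 standard errors computed via the sandwich and lmtest packages. It illustrates the idea rather than reproducing the package's exact implementation, which additionally handles sample weights and clusters and reports one-sided p-values.

# Illustrative reconstruction of the calibration regression; not grf's exact code.
library(grf)
library(sandwich)
library(lmtest)

n <- 800; p <- 5
X <- matrix(rnorm(n * p), n, p)
W <- rbinom(n, 1, 0.25 + 0.5 * (X[, 1] > 0))
Y <- pmax(X[, 1], 0) * W + X[, 2] + pmin(X[, 3], 0) + rnorm(n)
forest <- causal_forest(X, Y, W)

tau.hat <- predict(forest)$predictions                      # out-of-bag CATE estimates
mean.pred <- mean(tau.hat)
DF <- data.frame(
  target = Y - forest$Y.hat,                                # residualized outcome
  mean.forest.prediction = (W - forest$W.hat) * mean.pred,  # regressor for the average effect
  differential.forest.prediction =
    (W - forest$W.hat) * (tau.hat - mean.pred)              # regressor for heterogeneity
)
fit <- lm(target ~ mean.forest.prediction + differential.forest.prediction + 0, data = DF)
coeftest(fit, vcov = vcovHC(fit, type = "HC3"))             # two-sided p-values, unlike test_calibration()

Coefficients close to 1 on both regressors, as in the example at the bottom of this page, indicate that both the average prediction and the heterogeneity estimates are well calibrated.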
test_calibration(forest, vcov.type = "HC3")
Argument | Description
---|---
forest | The trained forest.
vcov.type | Optional covariance type for standard errors. The possible options are HC0, ..., HC3. The default is "HC3", which is recommended in small samples and corresponds to the "shortcut formula" for the jackknife (see MacKinnon & White for more discussion, and Cameron & Miller for a review). For large data sets with clusters, "HC0" or "HC1" are significantly faster to compute.
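For example, on a large clustered data set one might prefer the lighter "HC0" type over the default jackknife-style "HC3". A minimal sketch, where the cluster assignment `cluster.id` is a made-up illustration rather than anything supplied by the package:

# Hypothetical clustered design where the faster "HC0" covariance type is used.
library(grf)
n <- 2000; p <- 5
X <- matrix(rnorm(n * p), n, p)
cluster.id <- sample(1:100, n, replace = TRUE)   # made-up cluster assignment
W <- rbinom(n, 1, 0.5)
Y <- X[, 1] * W + rnorm(n)
clustered.forest <- causal_forest(X, Y, W, clusters = cluster.id)
test_calibration(clustered.forest, vcov.type = "HC0")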
A heteroskedasticity-consistent test of calibration.
Cameron, A. Colin, and Douglas L. Miller. "A practitioner's guide to cluster-robust inference." Journal of Human Resources 50, no. 2 (2015): 317-372.
Chernozhukov, Victor, Mert Demirer, Esther Duflo, and Ivan Fernandez-Val. "Generic Machine Learning Inference on Heterogenous Treatment Effects in Randomized Experiments." arXiv preprint arXiv:1712.04802 (2017).
MacKinnon, James G., and Halbert White. "Some heteroskedasticity-consistent covariance matrix estimators with improved finite sample properties." Journal of Econometrics 29, no. 3 (1985): 305-325.
# \donttest{
n <- 800
p <- 5
X <- matrix(rnorm(n * p), n, p)
W <- rbinom(n, 1, 0.25 + 0.5 * (X[, 1] > 0))
Y <- pmax(X[, 1], 0) * W + X[, 2] + pmin(X[, 3], 0) + rnorm(n)
forest <- causal_forest(X, Y, W)
test_calibration(forest)
#>
#> Best linear fit using forest predictions (on held-out data)
#> as well as the mean forest prediction as regressors, along
#> with one-sided heteroskedasticity-robust (HC3) SEs:
#>
#>                                Estimate Std. Error t value    Pr(>t)
#> mean.forest.prediction          0.98229    0.22400  4.3852 6.571e-06 ***
#> differential.forest.prediction  1.18599    0.21121  5.6151 1.356e-08 ***
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
# }