Gets estimates of tau(x) using a trained causal forest.
    # S3 method for causal_forest
    predict(
      object,
      newdata = NULL,
      linear.correction.variables = NULL,
      ll.lambda = NULL,
      ll.weight.penalty = FALSE,
      num.threads = NULL,
      estimate.variance = FALSE,
      ...
    )
Argument | Description |
---|---|
object | The trained forest. |
newdata | Points at which predictions should be made. If NULL, makes out-of-bag predictions on the training set instead (i.e., provides predictions at Xi using only trees that did not use the i-th training example). Note that this matrix should have the same number of columns as the training matrix, and that the columns must appear in the same order. |
linear.correction.variables | Optional subset of indexes for variables to be used in local linear prediction. If NULL, standard GRF prediction is used. Otherwise, we run a locally weighted linear regression on the included variables. Please note that this is a beta feature still in development, and may slow down prediction considerably. Defaults to NULL. |
ll.lambda | Ridge penalty for local linear predictions. Defaults to NULL, in which case the penalty is chosen by cross-validation. |
ll.weight.penalty | Option to standardize ridge penalty by covariance (TRUE), or penalize all covariates equally (FALSE). Penalizes equally by default. |
num.threads | Number of threads used in prediction. If set to NULL, the software automatically selects an appropriate number. |
estimate.variance | Whether variance estimates for the estimated tau(x) are desired (for confidence intervals). |
... | Additional arguments (currently ignored). |
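The column-order requirement on newdata is easy to violate when test data is assembled separately from the training data. The following is a minimal sketch of one way to guard against that, assuming the training matrix carries column names. The helper `align_newdata` and the column names are hypothetical and are not part of grf.

```r
# Hypothetical helper (not part of grf): reorder the columns of `newdata`
# to match the training matrix, failing loudly if any column is missing.
# Columns of `newdata` that were not used in training are dropped.
align_newdata <- function(newdata, train.cols) {
  newdata <- as.matrix(newdata)
  missing.cols <- setdiff(train.cols, colnames(newdata))
  if (length(missing.cols) > 0) {
    stop("newdata is missing columns: ", paste(missing.cols, collapse = ", "))
  }
  newdata[, train.cols, drop = FALSE]
}

# Illustrative data with made-up column names.
X.train <- matrix(rnorm(20), 5, 4,
                  dimnames = list(NULL, c("age", "income", "bmi", "dose")))
X.new <- X.train[1:2, c("dose", "age", "bmi", "income")]  # same columns, shuffled

X.aligned <- align_newdata(X.new, colnames(X.train))
```

The aligned matrix can then be passed as newdata to predict on a forest trained with `X.train`.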
Vector of predictions, along with estimates of the error and (optionally) variance estimates. Column 'predictions' contains estimates of the conditional average treatment effect (CATE). The square root of column 'variance.estimates' is the standard error of the CATE estimate. For out-of-bag estimates, we also output the following error measures. First, column 'debiased.error' contains estimates of the 'R-loss' criterion (see Nie and Wager, 2017, for a justification). Second, column 'excess.error' contains jackknife estimates of the Monte Carlo error (Wager, Hastie, and Efron, 2014), a measure of how unstable estimates are if we grow forests of the same size on the same data set. The sum of 'debiased.error' and 'excess.error' is the raw error attained by the current forest, and 'debiased.error' alone is an estimate of the error attained by a forest with an infinite number of trees. We recommend that users grow enough trees to make the 'excess.error' negligible.
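To make the relationship between the error columns concrete, here is a rough sketch that averages them over the training set and reports their sum. The helper `summarize_oob_error` is hypothetical (not a grf function), and the toy data frame holds hand-made numbers standing in for the output of an out-of-bag `predict(c.forest)` call.

```r
# Hypothetical helper (not part of grf): average the out-of-bag error
# columns and report their sum, which the text above calls the raw error.
summarize_oob_error <- function(oob.pred) {
  stopifnot(all(c("debiased.error", "excess.error") %in% names(oob.pred)))
  debiased <- mean(oob.pred$debiased.error, na.rm = TRUE)
  excess <- mean(oob.pred$excess.error, na.rm = TRUE)
  c(debiased = debiased, excess = excess, raw = debiased + excess)
}

# Hand-made stand-in values, for illustration only.
toy.oob <- data.frame(
  predictions    = c(0.2, 0.5, 0.1),
  debiased.error = c(1.0, 0.8, 1.2),
  excess.error   = c(0.02, 0.01, 0.03)
)

err <- summarize_oob_error(toy.oob)
```

If the averaged 'excess' entry is not small relative to 'debiased', that suggests retraining with a larger num.trees.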
    # \donttest{
    # Train a causal forest.
    n <- 100
    p <- 10
    X <- matrix(rnorm(n * p), n, p)
    W <- rbinom(n, 1, 0.5)
    Y <- pmax(X[, 1], 0) * W + X[, 2] + pmin(X[, 3], 0) + rnorm(n)
    c.forest <- causal_forest(X, Y, W)

    # Predict using the forest.
    X.test <- matrix(0, 101, p)
    X.test[, 1] <- seq(-2, 2, length.out = 101)
    c.pred <- predict(c.forest, X.test)

    # Predict on out-of-bag training samples.
    c.pred <- predict(c.forest)

    # Predict with confidence intervals; growing more trees is now recommended.
    c.forest <- causal_forest(X, Y, W, num.trees = 500)
    c.pred <- predict(c.forest, X.test, estimate.variance = TRUE)
    # }
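The last prediction call returns variance estimates but stops short of forming intervals. One common way to turn them into pointwise confidence intervals is a normal approximation, sketched below. The helper `cate_confint` is hypothetical (not part of grf), the normal approximation is an assumption of this sketch, and the toy data frame holds hand-made values standing in for `c.pred`.

```r
# Hypothetical helper (not part of grf): pointwise intervals of the form
# estimate +/- z * standard error, assuming approximate normality.
cate_confint <- function(pred, level = 0.95) {
  stopifnot(all(c("predictions", "variance.estimates") %in% names(pred)))
  z <- qnorm(1 - (1 - level) / 2)
  se <- sqrt(pred$variance.estimates)  # standard error of each CATE estimate
  data.frame(
    estimate = pred$predictions,
    std.err  = se,
    lower    = pred$predictions - z * se,
    upper    = pred$predictions + z * se
  )
}

# Hand-made stand-in for predict(c.forest, X.test, estimate.variance = TRUE).
toy.pred <- data.frame(
  predictions        = c(0.1, 0.8),
  variance.estimates = c(0.04, 0.09)
)

ci <- cate_confint(toy.pred)
```

With a real forest, `cate_confint(c.pred)` would apply the same calculation to the prediction output from the example above.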