Interface

class bartz.BART.gbart(x_train, y_train, *, x_test=None, usequants=False, sigest=None, sigdf=3, sigquant=0.9, k=2, power=2, base=0.95, maxdepth=6, lamda=None, offset=None, ntree=200, numcut=255, ndpost=1000, nskip=100, keepevery=1, printevery=100, seed=0, initkw={})

Nonparametric regression with Bayesian Additive Regression Trees (BART).

Regress y_train on x_train with a latent mean function represented as a sum of decision trees. Inference is carried out by sampling the posterior distribution of the tree ensemble with a Markov chain Monte Carlo (MCMC) algorithm.

Parameters:
x_train : array (p, n) or DataFrame

The training predictors.

y_train : array (n,) or Series

The training responses.

x_test : array (p, m) or DataFrame, optional

The test predictors.

usequants : bool, default False

Whether to bin predictors using their quantiles instead of a uniform grid.

sigest : float, optional

An estimate of the residual standard deviation on y_train, used to set lamda. If not specified, it is estimated by linear regression. If y_train has fewer than two elements, it is set to 1. If n <= p, it is set to the standard deviation of y_train. Ignored if lamda is specified.

sigdf : int, default 3

The degrees of freedom of the scaled inverse chi-squared prior on the noise variance.

sigquant : float, default 0.9

The quantile of the prior on the noise variance that is matched to sigest to set the scale of the prior. Ignored if lamda is specified.

k : float, default 2

The inverse scale of the prior standard deviation on the latent mean function, relative to half the observed range of y_train. If y_train has fewer than two elements, k is ignored and the scale is set to 1.
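As a minimal sketch of how k enters, assuming the prior standard deviation is exactly half the observed range of y_train divided by k (the helper name prior_scale is hypothetical and not part of bartz):

```python
def prior_scale(y_train, k=2.0):
    """Prior standard deviation of the latent mean, following the text above.

    Assumption: scale = (half the observed range of y_train) / k, and
    scale = 1 when y_train has fewer than two elements.
    """
    if len(y_train) < 2:
        return 1.0
    half_range = (max(y_train) - min(y_train)) / 2
    return half_range / k


y = [1.0, 4.0, 2.5, 7.0]
print(prior_scale(y))       # half range is 3.0, so k=2 gives 1.5
print(prior_scale(y, k=4))  # a larger k gives a tighter prior: 0.75
```

A larger k therefore pulls the latent mean function more strongly toward its prior mean.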

power : float, default 2
base : float, default 0.95

Parameters of the prior on tree node generation. The probability that a node at depth d (0-based) is non-terminal is base / (1 + d) ** power.
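The formula can be evaluated directly to see how quickly the prior discourages deep trees. The function name p_nonterminal below is only illustrative:

```python
def p_nonterminal(depth, base=0.95, power=2.0):
    """Prior probability that a node at the given 0-based depth is non-terminal."""
    return base / (1 + depth) ** power


# Print the split probability for the first few depths with the defaults.
for d in range(4):
    print(d, round(p_nonterminal(d), 4))
```

With the defaults, the root is split with probability 0.95, while a node at depth 1 is split with probability 0.2375.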

maxdepth : int, default 6

The maximum depth of the trees. This is 1-based, so with the default maxdepth=6, the depths of the levels range from 0 to 5.

lamda : float, optional

The scale of the prior on the noise variance. If lamda==1, the prior is an inverse chi-squared scaled to have harmonic mean 1. If not specified, it is set based on sigest and sigquant.
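How lamda could be derived from sigest and sigquant can be sketched as follows. This is an illustration, not bartz's actual code. It assumes the parameterization sigma^2 = sigdf * lamda / X with X chi-squared distributed with sigdf degrees of freedom (which gives harmonic mean lamda), and it picks lamda so that the prior probability of sigma^2 <= sigest^2 equals sigquant. All helper names are hypothetical, and the chi-squared CDF is hand-rolled only to keep the sketch dependency-free.

```python
import math


def chi2_cdf(x, df):
    """Chi-squared CDF via the series for the regularized lower incomplete gamma."""
    if x <= 0:
        return 0.0
    a, z = df / 2, x / 2
    term = total = 1.0 / a
    k = 0
    while abs(term) > 1e-16 * abs(total):
        k += 1
        term *= z / (a + k)
        total += term
    return total * math.exp(a * math.log(z) - z - math.lgamma(a))


def chi2_ppf(q, df):
    """Chi-squared quantile function, found by bisection on the CDF."""
    lo, hi = 0.0, 1.0
    while chi2_cdf(hi, df) < q:
        hi *= 2
    for _ in range(200):
        mid = (lo + hi) / 2
        if chi2_cdf(mid, df) < q:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def lamda_from_sigest(sigest, sigdf=3, sigquant=0.9):
    """Assumed rule: P(sigma^2 <= sigest^2) = sigquant under the prior."""
    return sigest**2 * chi2_ppf(1 - sigquant, sigdf) / sigdf


lamda = lamda_from_sigest(1.0)
# Check: the prior mass below sigest^2 should come back as sigquant.
print(lamda, 1 - chi2_cdf(3 * lamda / 1.0**2, 3))
```

In practice a library routine such as a chi-squared quantile function from a statistics package would replace the two hand-written helpers.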

offset : float, optional

The prior mean of the latent mean function. If not specified, it is set to the mean of y_train. If y_train is empty, it is set to 0.

ntree : int, default 200

The number of trees used to represent the latent mean function.

numcut : int, default 255

If usequants is False: the exact number of cutpoints used to bin the predictors, ranging between the minimum and maximum observed values (excluded).

If usequants is True: the maximum number of cutpoints to use for binning the predictors. Each predictor is binned such that its distribution in x_train is approximately uniform across bins. The number of bins is at most the smaller of numcut + 1 and the number of unique values appearing in x_train.

Before running the algorithm, the predictors are compressed to the smallest integer type that fits the bin indices, so numcut is best set to the maximum value of an unsigned integer type (for example, 255 for an 8-bit type).
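The uniform-grid case and the choice of integer width can be sketched as below. This assumes the cutpoints are evenly spaced and that bin indices run from 0 to numcut. Both function names are hypothetical, and bartz's internal binning may differ in detail.

```python
def uniform_cutpoints(x, numcut=255):
    """numcut evenly spaced cutpoints strictly between min(x) and max(x)."""
    lo, hi = min(x), max(x)
    step = (hi - lo) / (numcut + 1)
    return [lo + step * (i + 1) for i in range(numcut)]


def index_bits(numcut):
    """Width of the smallest unsigned integer type holding indices 0..numcut."""
    for bits in (8, 16, 32, 64):
        if numcut <= 2**bits - 1:
            return bits
    raise ValueError("numcut does not fit in 64 bits")


print(uniform_cutpoints([0.0, 1.0], numcut=3))  # [0.25, 0.5, 0.75]
print(index_bits(255), index_bits(256))         # 8 16
```

Under these assumptions, going from numcut=255 to numcut=256 doubles the storage per predictor value, which is why the maximum of an unsigned type is a natural choice.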

ndpost : int, default 1000

The number of MCMC samples to save, after burn-in.

nskip : int, default 100

The number of initial MCMC samples to discard as burn-in.

keepevery : int, default 1

The thinning factor for the MCMC samples, after burn-in.

printevery : int, default 100

The number of iterations (including skipped ones) between consecutive log messages.
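Taken together, ndpost, nskip and keepevery determine how long the chain runs. A rough sketch, assuming one sample is saved every keepevery iterations once burn-in is over (the function name is illustrative only):

```python
def total_iterations(ndpost=1000, nskip=100, keepevery=1):
    """Assumed total: burn-in plus keepevery iterations per saved sample."""
    return nskip + ndpost * keepevery


print(total_iterations())             # 1100 with the defaults
print(total_iterations(keepevery=5))  # 5100
```

Under this assumption, thinning multiplies the cost of the post-burn-in phase without changing the number of saved samples.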

seed : int or jax random key, default 0

The seed for the random number generator.

Notes

This interface imitates the function gbart from the R package BART, but with these differences:

  • If x_train and x_test are matrices, they have one predictor per row instead of per column.

  • If usequants=False, R BART switches to quantiles anyway if there are fewer distinct predictor values than the required number of bins, while bartz always follows the specification.

  • The error variance parameter is called lamda instead of lambda.

  • rm_const is always False.

  • The default numcut is 255 instead of 100.

  • A lot of functionality is missing (variable selection, discrete response).

  • There are some additional attributes, and some missing.

The linear regression used to set sigest adds an intercept.
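Because predictors are laid out one per row, data stored in the more common one-observation-per-row layout has to be transposed before being passed as x_train or x_test. The snippet below only illustrates the layout with plain Python lists. With NumPy arrays, the .T attribute performs the same transposition.

```python
# Three observations of two predictors, one observation per row: shape (n, p).
rows = [
    [0.1, 10.0],
    [0.2, 20.0],
    [0.3, 30.0],
]

# One predictor per row: shape (p, n), the layout described for x_train.
x_train = [list(column) for column in zip(*rows)]
print(x_train)  # [[0.1, 0.2, 0.3], [10.0, 20.0, 30.0]]
```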

Attributes:
yhat_train : array (ndpost, n)

The conditional posterior mean at x_train for each MCMC iteration.

yhat_train_mean : array (n,)

The marginal posterior mean at x_train.

yhat_test : array (ndpost, m)

The conditional posterior mean at x_test for each MCMC iteration.

yhat_test_mean : array (m,)

The marginal posterior mean at x_test.
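The relation between the per-iteration arrays and their _mean counterparts can be sketched as an average over the MCMC axis. This assumes the marginal posterior mean is estimated by the plain sample average, the usual Monte Carlo estimate. The numbers are made up purely to show the shapes.

```python
from statistics import fmean

# Toy stand-in for yhat_test with ndpost=3 iterations and m=2 test points.
yhat_test = [
    [1.0, 4.0],
    [2.0, 5.0],
    [3.0, 6.0],
]

# Average over iterations (the first axis) to get one value per test point.
yhat_test_mean = [fmean(column) for column in zip(*yhat_test)]
print(yhat_test_mean)  # [2.0, 5.0]
```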

sigma : array (ndpost,)

The standard deviation of the error.

first_sigma : array (nskip,)

The standard deviation of the error in the burn-in phase.

offset : float

The prior mean of the latent mean function.

scale : float

The prior standard deviation of the latent mean function.

lamda : float

The prior harmonic mean of the error variance.

sigest : float or None

The estimated standard deviation of the error used to set lamda.

ntree : int

The number of trees.

maxdepth : int

The maximum depth of the trees.

initkw : dict

Additional arguments passed to mcmcstep.init.

Methods

predict(x_test)

Compute the posterior mean at x_test for each MCMC iteration.

Parameters:
x_test : array (p, m) or DataFrame

The test predictors.

Returns:
yhat_test : array (ndpost, m)

The conditional posterior mean at x_test for each MCMC iteration.