Changelog¶
0.7.0 Every woman knows the pain of deciding which predictors to throw away when her design matrix is full to the brim and the last season brought out new lagged features. Our 100% money-back guarantee Bayesian variable selection Dirichlet prior will pick out the best predictors for you automatically while the MCMC is running! (2025-07-07)¶
The highlight of this release is the implementation of variable selection.
Changes apparent through the gbart interface:
Parameters sparse, theta, rho, a, b to activate and configure variable selection
The MCMC logging shows a fill metric, which is how much the tree arrays are filled, to check the trees are not being constrained by the maximum depth limit
Parameter xinfo to manually set the grid cutpoints for decision rules
Fixed a stochastic bug with binary regression that would become likely with >1000 datapoints
Parameter rm_const to decide how to handle “blocked” predictors that have no possible decision rules
The defaults of parameters ntree and keepevery now depend on whether the regression is continuous or binary, as in the R package BART
New attributes of gbart objects, matching those of R’s BART::gbart:
prob_train, prob_test, prob_train_mean, prob_test_mean (for binary regression)
sigma includes burn-in samples, and first_sigma is gone
sigma_mean (the mean is only over kept samples)
varcount, varcount_mean
varprob, varprob_mean
The bartz.debug submodule is now officially public, the main functionality is:
The class debug_gbart that adds some debugging methods to gbart
trees_BART_to_bartz to read trees in the format of R’s BART package
sample_prior to sample from the BART prior
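To give a rough idea of what a Dirichlet variable selection prior does during the MCMC, here is a minimal sketch in plain Python. It is not bartz code: the function name update_split_probs, the example counts, and the use of theta / p as prior concentration are illustrative assumptions. The probabilities of picking each predictor for a decision rule are redrawn from a Dirichlet distribution whose parameters are the prior concentration plus the number of decision rules currently using each predictor.

```python
import random


def update_split_probs(varcount, theta, rng):
    """Draw predictor probabilities from Dirichlet(theta / p + varcount).

    Hypothetical helper for illustration, not part of bartz.
    """
    p = len(varcount)
    # A Dirichlet draw is a vector of independent gamma draws, normalized.
    gammas = [rng.gammavariate(theta / p + count, 1.0) for count in varcount]
    total = sum(gammas)
    return [g / total for g in gammas]


rng = random.Random(0)
# Made-up counts of decision rules using each of 5 predictors.
varcount = [40, 0, 0, 25, 1]
probs = update_split_probs(varcount, theta=1.0, rng=rng)
```

Predictors that already appear in many decision rules get a high probability of being picked again, while unused ones are pushed towards zero, which is what lets the sampler discard irrelevant predictors.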
Changes to internals:
More typing in general
Changes to run_mcmc:
MCMC traces are dataclasses instead of dictionaries
Switch back to using only one callback instead of two
I realized that jax.lax.cond makes the additional callback pointless. I previously had a cached heuristic to not use lax.cond because it’s not efficiently vmappable, but for all practical uses in the MCMC it would be.
The callback accepts a jax random key, useful to implement additional MCMC steps
Simplified main/burn-in distinction for custom trace extractors
Changes to the MCMC internals:
min_points_per_decision_node intended as replacement for min_points_per_leaf
min_points_per_leaf is still available to allow the gbart interface to mimic R’s BART
This different constraint is easier to take into account exactly in the Metropolis-Hastings ratio, while min_points_per_leaf leads to a deviation from the stated distribution
The tree structure MH should now match exactly the distributions written on paper, if min_points_per_leaf is not set
The tree structure MH never proposes zero-probability states, if min_points_per_leaf is not set
Valid usage should not produce infs or nans internally any more, so jax.debug_infs and jax.debug_nans can be used
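The difference between the two constraints on datapoint counts can be sketched as follows. This is a simplified illustration in plain Python, not the bartz implementation, and the function names are made up. The assumption is that min_points_per_leaf restricts the two children produced by a grow move, while min_points_per_decision_node restricts only the node being split.

```python
def allowed_by_leaf_rule(n_left, n_right, min_points_per_leaf):
    """Growing is allowed only if both new leaves hold enough datapoints.

    This depends on the proposed decision rule, through n_left and n_right.
    """
    return min(n_left, n_right) >= min_points_per_leaf


def allowed_by_node_rule(n_node, min_points_per_decision_node):
    """Growing is allowed if the node to be split holds enough datapoints.

    This depends only on the current tree, not on the proposed rule.
    """
    return n_node >= min_points_per_decision_node


# Made-up example: a leaf with 10 datapoints, split into 9 and 1.
leaf_ok = allowed_by_leaf_rule(9, 1, min_points_per_leaf=5)
node_ok = allowed_by_node_rule(10, min_points_per_decision_node=10)
```

Under these assumptions, the node-based check can be evaluated before choosing the decision rule, so the set of leaves that can be grown is known in advance of the proposal.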
0.6.0 bruv bernoulli got gauss beat any time (2025-05-29)¶
binary regression with probit link
allow interrupting the MCMC with ^C
logging shows how much the tree heaps are filled; if it’s above 50% you should definitely increase the number of trees and/or the maximum depth
BART.gbart(..., run_mcmc_kw=dict(...)) allows passing additional arguments to mcmcloop.run_mcmc
option to disable logging
refactor internals
set offset in the MCMC state to avoid centering the responses
set leaf variance in the MCMC state to avoid rescaling responses
immutable dataclasses instead of dicts
improvements to mcmcloop.run_mcmc
simpler-to-use signature with only three required parameters
the callback is allowed to carry a state and to modify the chain state (opt-in)
custom functions to change what is extracted from the state and put into the traces
two distinct callbacks, one invoked under jit and one out of it
more sophisticated default logging callback
complete type hints
improved documentation
jaxext.split is a less error-prone alternative to jax.random.split
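For reference, the probit link used for binary regression maps the latent sum of trees to a probability through the standard normal CDF. The snippet below is a self-contained sketch in plain Python, not bartz code. The function name probit_prob and the latent values are made up for illustration.

```python
import math


def probit_prob(latent):
    """Standard normal CDF evaluated at the latent value."""
    return 0.5 * (1.0 + math.erf(latent / math.sqrt(2.0)))


# Made-up posterior draws of the latent sum of trees at a single point.
latent_samples = [-0.3, 0.1, 0.4]
# Probability of a positive outcome for each posterior draw.
probs = [probit_prob(f) for f in latent_samples]
# Posterior mean probability, averaging over the draws.
prob_mean = sum(probs) / len(probs)
```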
0.5.0 Our work promotes diversity in error variances by following heteroscedasticity best-practices such as multiplying the variance parameter by different factors (2025-05-16)¶
Heteroskedasticity with fixed weights.
The internal MCMC functions now follow the jax convention of having the key parameter first in the signature.
Fixed a bug where the MCMC callback would hang indefinitely.
The library is now routinely tested with the least recent supported dependencies.
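As an illustration of heteroskedasticity with fixed weights, the sketch below assumes that each datapoint has a known weight that rescales its error standard deviation, which is the convention of the w argument in R’s BART::gbart. It is plain Python, not bartz code, and the function name weighted_rss is hypothetical. Under that assumption, the residuals enter the error variance update divided by their weights.

```python
def weighted_rss(residuals, weights):
    """Residual sum of squares with each residual divided by its weight.

    Assumes the error standard deviation of datapoint i is weights[i] * sigma.
    """
    return sum((r / w) ** 2 for r, w in zip(residuals, weights))


# Made-up residuals; the second datapoint is three times as noisy.
rss = weighted_rss([2.0, 3.0], [1.0, 3.0])
```

With all weights equal to 1 this reduces to the ordinary residual sum of squares of the homoskedastic model.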
0.4.1 Told it was, nigh the end of times, version numbers of dependencies all would rise (2025-04-23)¶
Somehow 1 year went by before I had some time to spend on this software.
Somehow 1 year was sufficient to break most of my dependency specifications, despite having only 4 dependencies.
This release provides no new features, it’s only a quick fix to make bartz go well with the latest jax and numpy versions.
0.4.0 The real treasure was the Markov chain samples we made along the way (2024-04-16)¶
2x faster on GPU, due to better parallelization of the tree sampling step.
Uses less memory, can now do \(n=100\,000\) with \(10\,000\) trees on a V100. The excess memory usage was mostly due to an excessively large batch size for counting datapoints per leaf.
The Metropolis-Hastings ratio is saved only for the proposed move.
The grow and prune moves are merged into one object.
0.3.0¶
2-3x faster on CPU.
Uses less memory.
Add initkw argument to BART.gbart for advanced configuration of the MCMC initialization.
Modified the automatic determination of sigest in BART.gbart to match the one of the R package.
Add usequants=False option to BART.gbart, which is now the default.
New function prepcovars.uniform_splits_from_matrix.
Add sum_trees=False option to grove.evaluate_forest to evaluate separately each tree.
Support non-batched arguments in jaxext.autobatch.
Fix a bug with empty arrays in jaxext.autobatch.
New option in mcmcstep.init to save the acceptance ratios.
Separate batching options for residuals and counts.
Sweeping changes to the tree move internals, more computations are parallel across trees.
Added support for dbarts in the unit tests.
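To illustrate the idea behind usequants=False, the sketch below places the candidate cutpoints of a predictor on an evenly spaced grid between its minimum and maximum, instead of at quantiles of the observed values. This follows my understanding of the behavior of R’s BART and is not the code of prepcovars.uniform_splits_from_matrix. The function name uniform_cutpoints and its signature are hypothetical.

```python
def uniform_cutpoints(column, numcut):
    """Return numcut evenly spaced cutpoints strictly inside the data range."""
    lo, hi = min(column), max(column)
    step = (hi - lo) / (numcut + 1)
    return [lo + (j + 1) * step for j in range(numcut)]


# Made-up predictor values; only the range matters for a uniform grid.
cutpoints = uniform_cutpoints([0.0, 4.0, 1.0], numcut=3)
```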
0.2.1¶
Fix a bug that prevented using bart in a compiled function.
0.2.0¶
Rename bartz.BART to bartz.BART.gbart.
Expose submodule bartz.jaxext with auxiliary functions for jax.
Shorter compilation time if no burnin or saved samples are drawn. This is useful when using the interface only to create the initial MCMC state, or when saving all samples to inspect the warm-up phase.
20x faster on GPU.
2x faster on CPU.
Use less temporary memory to quantilize covariates, avoiding out-of-memory errors on GPU.
0.1.0¶
Optimize the MCMC step to only traverse each tree once.
Now bartz runs at the same speed as the R package BART (tested at \(p=10\), \(n=100\ldots 10000\)).
The MCMC functions are heavily changed, but the interface is the same.
0.0.1¶
BART has attributes maxdepth, sigest.
Fix errors with scaling of noise variance prior.
Fix iteration report when keepevery is not 1.
Lower required versions of dependencies to allow running on Colab.
0.0¶
First release.