Data processing

bartz.prepcovars.quantilized_splits_from_matrix(X, max_bins)

Determine bins that make the distribution of each predictor uniform.

Parameters:

X : array (p, n)

A matrix with p predictors and n observations.

max_bins : int

The maximum number of bins to produce.

Returns:

splits : array (p, m)

A matrix containing, for each predictor, the boundaries between bins. m is min(max_bins, n) - 1, which is an upper bound on the number of splits. Each predictor may have a different number of splits; unused values at the end of each row are filled with the maximum value representable in the type of X.

max_split : array (p,)

The number of values actually used in each row of splits.
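The padded layout of splits and max_split can be illustrated with a minimal pure-Python sketch. It does not reproduce how the library chooses the split points; the helper pad_splits and the sample split values are hypothetical, and sys.float_info.max stands in for "the maximum value representable in the type of X" under the assumption that X holds double-precision floats.

```python
import sys

def pad_splits(ragged, max_bins, n, fill=sys.float_info.max):
    """Pack per-predictor split lists into a rectangular (p, m) layout.

    Hypothetical helper: it only illustrates the output format described
    above, not the rule used to pick the split points.
    """
    m = min(max_bins, n) - 1  # upper bound on the number of splits per row
    splits, max_split = [], []
    for row in ragged:
        if len(row) > m:
            raise ValueError("a row has more than m splits")
        # Unused trailing slots are filled with the largest representable value.
        splits.append(list(row) + [fill] * (m - len(row)))
        max_split.append(len(row))
    return splits, max_split

# Made-up split points for p = 2 predictors with n = 5 observations.
ragged = [[0.5, 1.5, 2.5], [10.0]]
splits, max_split = pad_splits(ragged, max_bins=4, n=5)

# The values actually in use are recovered by slicing each row with max_split.
used = [row[:k] for row, k in zip(splits, max_split)]
```

Because the fill value is the largest one the type can hold, every padded row stays sorted, which is what a binary search over the row requires.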

bartz.prepcovars.uniform_splits_from_matrix(X, num_bins)

Make an evenly spaced binning grid.

Parameters:

X : array (p, n)

A matrix with p predictors and n observations.

num_bins : int

The number of bins to produce.

Returns:

splits : array (p, num_bins - 1)

A matrix containing, for each predictor, the boundaries between bins. The endpoints of the grid, which are the minimum and maximum values in each row of X, are excluded.

max_split : array (p,)

The number of cutpoints in each row of splits, i.e., num_bins - 1.
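As a rough illustration of the grid described above, the sketch below assumes that "evenly spaced" means equal-width intervals between the minimum and maximum of each row, with both endpoints dropped. The helper uniform_splits is hypothetical and written in pure Python; the library's floating-point details may differ.

```python
def uniform_splits(row, num_bins):
    """Interior points of an equal-width grid over [min(row), max(row)].

    Hypothetical helper illustrating the documented output shape:
    num_bins bins are separated by num_bins - 1 cutpoints.
    """
    lo, hi = min(row), max(row)
    step = (hi - lo) / num_bins
    # k = 0 and k = num_bins would give the excluded endpoints lo and hi.
    return [lo + step * k for k in range(1, num_bins)]

# p = 2 predictors, n = 3 observations each.
X = [[0.0, 2.0, 8.0], [-1.0, 0.0, 1.0]]
num_bins = 4
splits = [uniform_splits(row, num_bins) for row in X]
max_split = [len(row) for row in splits]
```

Here every row has the same number of cutpoints, so no padding is needed and max_split is constant.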

bartz.prepcovars.bin_predictors(X, splits, **kw)

Bin the predictors according to the given splits.

A value x is mapped to bin i iff splits[i - 1] < x <= splits[i].

Parameters:

X : array (p, n)

A matrix with p predictors and n observations.

splits : array (p, m)

A matrix containing, for each predictor, the boundaries between bins. m is the maximum number of splits; each row may have shorter actual length, marked by padding unused locations at the end of the row with the maximum value allowed by the type.

**kw

Additional arguments are passed to jax.numpy.searchsorted.

Returns:

X_binned : int array (p, n)

A matrix with p predictors and n observations, where each predictor has been replaced by the index of the bin it falls into.
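The binning rule can be checked by hand with a small sketch. It uses bisect_left from the Python standard library as a stand-in for a left-sided sorted search, on the assumption that no extra arguments are passed and the search keeps its default left side. The helper bin_row and the sample data are hypothetical.

```python
from bisect import bisect_left
import sys

def bin_row(x_row, split_row):
    """Map each value to the index i with split_row[i - 1] < x <= split_row[i].

    bisect_left returns the leftmost insertion point, so a value equal to
    a boundary falls into the bin below that boundary.
    """
    return [bisect_left(split_row, x) for x in x_row]

# Padding value for unused slots, assuming double-precision floats.
PAD = sys.float_info.max

# p = 2 predictors; the second row uses only one of its three slots.
splits = [[1.0, 2.0, 3.0], [10.0, PAD, PAD]]
X = [[0.5, 1.0, 2.5, 7.0], [3.0, 10.0, 11.0, 10.5]]

X_binned = [bin_row(x, s) for x, s in zip(X, splits)]
```

In the second row, values above the single real boundary all land in bin 1: the padded slots sit at the top of the range, so they never create additional bins for finite inputs.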