bartz.prepcovars.GivenSplitsBinner

class bartz.prepcovars.GivenSplitsBinner(X, *, xinfo, key=None)

Binner with cutpoints supplied directly in R BART xinfo format.

The cutpoints are taken verbatim from xinfo, a (p, m) matrix in which each row holds the sorted cutpoints for one predictor. A row with fewer than m cutpoints is padded on the right with NaN to fill the unused capacity. Internally, the NaNs are replaced by the maximum representable value in the dtype of xinfo, and max_split is set to the count of non-NaN entries in each row, so binning behaves as if each row had been declared with only its non-NaN cutpoints.
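The padding treatment described above can be sketched in a few lines. This is an illustrative NumPy version under stated assumptions, not the library's implementation: the example matrix and the variable names `is_pad` and `splits` are made up here, and the `uint32` dtype is chosen only for the illustration.

```python
import numpy as np

# Hypothetical (p=2, m=3) xinfo: predictor 0 has three cutpoints,
# predictor 1 has one cutpoint followed by NaN padding.
xinfo = np.array(
    [[0.1, 0.5, 0.9],
     [2.0, np.nan, np.nan]],
    dtype=np.float32,
)

# Mark the padded (unused) entries.
is_pad = np.isnan(xinfo)

# Number of real cutpoints per predictor, analogous to max_split.
max_split = (~is_pad).sum(axis=1).astype(np.uint32)

# Replace the padding with the largest finite value of xinfo's dtype,
# so every row stays sorted and contains no NaN.
splits = np.where(is_pad, np.finfo(xinfo.dtype).max, xinfo)

print(max_split)  # per-row cutpoint counts
print(splits)
```

Because the replacement value is the largest finite float of the dtype, each row remains sorted and the padded slots sit at or above every finite observation.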

Parameters:
  • X (Real[Array, 'p n']) – Training predictors. Used only to validate the shape of xinfo.

  • xinfo (Float[Array, 'p m']) – A (p, m) matrix of cutpoints. Each row holds a sorted list of cutpoints for one predictor, optionally padded on the right with NaN.

  • key (Key[Array, ''] | None, default: None) – Accepted for protocol uniformity; unused.

Raises:

ValueError – If xinfo is not 2D, or if its first dimension does not match X.shape[0].
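The two documented failure conditions amount to a shape check like the one below. This is a minimal sketch: `check_xinfo` is a hypothetical helper written for illustration, and its error messages are not the ones the library emits.

```python
import numpy as np


def check_xinfo(X, xinfo):
    """Hypothetical helper mirroring the documented shape checks."""
    if xinfo.ndim != 2:
        raise ValueError(f"xinfo must be 2D, got {xinfo.ndim} dimension(s)")
    if xinfo.shape[0] != X.shape[0]:
        raise ValueError(
            f"xinfo has {xinfo.shape[0]} rows but X has {X.shape[0]} predictors"
        )


X = np.zeros((2, 5))                # p=2 predictors, n=5 observations
check_xinfo(X, np.zeros((2, 3)))    # passes: one row of cutpoints per predictor

try:
    check_xinfo(X, np.zeros((3, 3)))  # first dimension does not match p
except ValueError as exc:
    print("rejected:", exc)
```

Note that predictors are stored along the first axis of X, so the row count of xinfo is compared against X.shape[0], not X.shape[1].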

max_split: UInt[Array, 'p']

The number of cutpoints actually used for each of the p predictors.

bin(X)

Map predictors to bin indices using the cutpoints chosen at construction.

Parameters:

X (Real[Array, 'p n']) – A matrix with p predictors and n observations. Must have the same number of predictors as the training matrix passed to the constructor.

Returns:

UInt[Array, 'p n'] – Quantized X: the bin index of each value, stored in a minimal unsigned integer data type.
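To make the mapping from values to bin indices concrete, here is a hedged NumPy sketch of row-wise binning against padded cutpoints. Several things are assumptions rather than documented behavior: the function name `bin_rows`, the use of `np.searchsorted` with `side="left"` (the library may treat values equal to a cutpoint differently), and the way the output dtype is picked.

```python
import numpy as np


def bin_rows(X, splits):
    """Illustrative binning: one sorted search per predictor row."""
    idx = np.stack(
        [np.searchsorted(s, x, side="left") for s, x in zip(splits, X)]
    )
    # Downcast to the smallest unsigned integer dtype that holds the largest index.
    return idx.astype(np.min_scalar_type(int(idx.max())))


big = np.finfo(np.float32).max  # stand-in for the NaN padding
splits = np.array(
    [[0.1, 0.5, 0.9],
     [2.0, big, big]],
    dtype=np.float32,
)
X = np.array(
    [[0.0, 0.3, 1.0],
     [1.0, 3.0, 5.0]],
    dtype=np.float32,
)

binned = bin_rows(X, splits)
print(binned)        # [[0 1 3], [0 1 1]]
print(binned.dtype)  # uint8
```

In the second row, the values 3.0 and 5.0 both land in bin 1 because the padded slots are never exceeded by a finite input, so the largest index in that row equals its count of real cutpoints.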