bartz.SparseConfig

class bartz.SparseConfig(theta=None, a=0.5, b=1.0, rho=None, augment=True, enabled=True)

Configuration of a sparsity-inducing variable selection prior.

This is the prior of [1]. Pass an instance to the sparse argument of Bart to activate variable selection on the predictors. The prior on the choice of predictor for each decision rule is

\[(s_1, \ldots, s_p) \sim \operatorname{Dirichlet}(\mathtt{theta}/p, \ldots, \mathtt{theta}/p).\]

If theta is not specified, it’s a priori distributed according to

\[\frac{\mathtt{theta}}{\mathtt{theta} + \mathtt{rho}} \sim \operatorname{Beta}(\mathtt{a}, \mathtt{b}).\]
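To make the two-level prior concrete, here is a minimal sketch that draws theta and the selection probabilities s using only the Python standard library. The helper `sample_sparsity_prior` is hypothetical and is not part of bartz. It only mirrors the formulas above, inverting u = theta / (theta + rho) to get theta = rho * u / (1 - u), and it says nothing about how bartz samples internally.

```python
import random


def sample_sparsity_prior(p, a=0.5, b=1.0, rho=None, theta=None, rng=None):
    """Draw (theta, s) from the prior described above.

    Hypothetical helper for illustration only; not part of bartz.
    """
    rng = rng or random.Random()
    if rho is None:
        rho = float(p)  # documented default: rho equals the predictor count
    if theta is None:
        # u = theta / (theta + rho) ~ Beta(a, b), hence theta = rho * u / (1 - u).
        u = min(rng.betavariate(a, b), 1.0 - 1e-12)  # clamp to avoid dividing by zero
        theta = rho * u / (1.0 - u)
    # Dirichlet(theta/p, ..., theta/p) as normalized Gamma(theta/p, 1) draws.
    alpha = theta / p
    gammas = [rng.gammavariate(alpha, 1.0) if alpha > 0 else 0.0 for _ in range(p)]
    total = sum(gammas)
    if total == 0.0:
        # Tiny concentrations can underflow to zero; in that limit the
        # Dirichlet draw puts all its mass on a single predictor.
        hot = rng.randrange(p)
        return theta, [1.0 if i == hot else 0.0 for i in range(p)]
    return theta, [g / total for g in gammas]


rng = random.Random(0)
theta, s = sample_sparsity_prior(p=10, rng=rng)
print(f"theta = {theta:.3f}")
print("largest selection probabilities:", sorted(s, reverse=True)[:3])
```

Small draws of theta tend to concentrate s on a few predictors, which is how the prior encodes sparsity.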

References

theta: float | Float[Array, ''] | Float[ndarray, ''] | None = None

Concentration of the Dirichlet prior. If not specified, it is sampled from a Beta prior parametrized by a, b and rho. If set directly, it should be on the order of the predictor count p or lower.

a: float | Float[Array, ''] | Float[ndarray, ''] = 0.5

First shape parameter of the Beta prior on theta / (theta + rho).

b: float | Float[Array, ''] | Float[ndarray, ''] = 1.0

Second shape parameter of the Beta prior on theta / (theta + rho).

rho: float | Float[Array, ''] | Float[ndarray, ''] | None = None

Scale of theta in the Beta prior on theta / (theta + rho). If not specified, it is set to the number of predictors p. Lower values favor more sparsity.
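As a rough illustration of how rho scales theta, the sketch below computes prior quantiles of theta in closed form. It assumes b = 1, the documented default, in which case the Beta(a, 1) CDF is u ** a and can be inverted directly. The helper `theta_prior_quantile` is hypothetical and is not part of bartz.

```python
def theta_prior_quantile(q, rho, a=0.5):
    """Quantile of theta when theta / (theta + rho) ~ Beta(a, 1).

    Hypothetical helper; valid only for b = 1 and 0 <= q < 1.
    """
    u = q ** (1.0 / a)          # invert the Beta(a, 1) CDF, which is u ** a
    return rho * u / (1.0 - u)  # map u back to theta


p = 30  # assumed predictor count for the illustration
for rho in (p, p / 3):
    median = theta_prior_quantile(0.5, rho)
    print(f"rho = {rho:g}: prior median of theta = {median:g}")
```

With a = 0.5 and b = 1, the median of u is 0.25, so the prior median of theta is rho / 3. Because theta is proportional to rho for a fixed u, shrinking rho shrinks theta by the same factor.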

augment: bool = True

Whether to account exactly for the decision rules forbidden by the ancestors of each node when updating the variable selection probabilities, using data augmentation. On by default. Setting it to False ignores the forbidden rules, which is faster but only approximate. The difference matters most when there are few predictors, each with few cutpoints, because a predictor whose cutpoints are exhausted cannot be reused further down a branch.

enabled: bool = True

Whether variable selection is active.