bartz.SparseConfig
- class bartz.SparseConfig(theta=None, a=0.5, b=1.0, rho=None, augment=True, enabled=True)
Configuration of a sparsity-inducing variable selection prior.
This is the prior of [1]. Pass an instance to the sparse argument of Bart to activate variable selection on the predictors. The prior on the choice of predictor for each decision rule is
\[(s_1, \ldots, s_p) \sim \operatorname{Dirichlet}(\mathtt{theta}/p, \ldots, \mathtt{theta}/p).\]
If theta is not specified, it is a priori distributed according to
\[\frac{\mathtt{theta}}{\mathtt{theta} + \mathtt{rho}} \sim \operatorname{Beta}(\mathtt{a}, \mathtt{b}).\]
References
- theta: float | Float[Array, ''] | Float[ndarray, ''] | None = None
Concentration of the Dirichlet prior. If not specified, it is sampled from a Beta prior parametrized by a, b and rho. If set directly, it should be in the ballpark of the predictor count p or lower.
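To get a feel for how theta controls sparsity, the following standard-library sketch draws the splitting probabilities (s_1, ..., s_p) from the symmetric Dirichlet prior stated above by normalizing Gamma variates. It does not call bartz and is not how bartz samples internally; the helper names sample_split_probs, effective_count and mean_effective_count are hypothetical, introduced only for illustration.

```python
import random


def sample_split_probs(theta, p, rng):
    """Draw (s_1, ..., s_p) ~ Dirichlet(theta/p, ..., theta/p).

    Uses the standard construction: normalize independent
    Gamma(theta/p, 1) variates so that they sum to one.
    """
    draws = [rng.gammavariate(theta / p, 1.0) for _ in range(p)]
    total = sum(draws)
    return [g / total for g in draws]


def effective_count(probs):
    """Inverse Simpson index: roughly how many predictors carry the mass."""
    return 1.0 / sum(s * s for s in probs)


def mean_effective_count(theta, p, n_draws=200, seed=0):
    """Average the effective predictor count over repeated prior draws."""
    rng = random.Random(seed)
    counts = [
        effective_count(sample_split_probs(theta, p, rng))
        for _ in range(n_draws)
    ]
    return sum(counts) / n_draws


if __name__ == "__main__":
    p = 20  # illustrative predictor count
    for theta in (1.0, 20.0, 200.0):
        print(theta, round(mean_effective_count(theta, p), 2))
```

In this sketch, a theta much smaller than p concentrates the probability on a handful of predictors, while a theta much larger than p spreads it almost evenly, so the prior stops favoring sparsity.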
- a: float | Float[Array, ''] | Float[ndarray, ''] = 0.5
First shape parameter of the Beta(a, b) prior on theta / (theta + rho).
- b: float | Float[Array, ''] | Float[ndarray, ''] = 1.0
Second shape parameter of the Beta(a, b) prior on theta / (theta + rho).
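As a worked example of what the Beta prior implies for theta (a derivation from the formula stated above, not taken from the bartz source), write U for theta / (theta + rho). Solving for theta gives
\[\mathtt{theta} = \mathtt{rho} \, \frac{U}{1 - U}, \qquad U \sim \operatorname{Beta}(\mathtt{a}, \mathtt{b}).\]
With the defaults a = 0.5 and b = 1.0, the distribution function of U is
\[P(U \le u) = \sqrt{u}, \qquad 0 \le u \le 1,\]
so the median of U is 1/4 and the prior median of theta is rho / 3.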
- rho: float | Float[Array, ''] | Float[ndarray, ''] | None = None
Scale of the Beta prior on theta. If not specified, set to the number of predictors p. Lower values prefer more sparsity.
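The effect of rho can be checked numerically with a small standard-library sketch that samples theta from the prior stated above, by drawing theta / (theta + rho) from a Beta(a, b) distribution and inverting. This is only an illustration under that stated prior, not bartz code; sample_theta and prior_median_theta are hypothetical helper names.

```python
import random
import statistics


def sample_theta(a, b, rho, rng):
    """Draw theta such that theta / (theta + rho) ~ Beta(a, b)."""
    u = rng.betavariate(a, b)
    # Invert u = theta / (theta + rho) to recover theta.
    return rho * u / (1.0 - u)


def prior_median_theta(rho, a=0.5, b=1.0, n_draws=10_000, seed=0):
    """Monte Carlo estimate of the prior median of theta."""
    rng = random.Random(seed)
    return statistics.median(
        sample_theta(a, b, rho, rng) for _ in range(n_draws)
    )


if __name__ == "__main__":
    # Illustrative values: rho well below p, and rho = p for p = 50.
    for rho in (5.0, 50.0):
        print(rho, prior_median_theta(rho))
```

Because theta scales linearly with rho in this construction, lowering rho shifts the whole prior on theta toward smaller values, which in turn makes the Dirichlet prior on the predictor probabilities sparser.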
- augment: bool = True
Whether to account exactly for the decision rules forbidden by the ancestors of each node when updating the variable selection probabilities, using data augmentation. On by default. Setting it to False ignores the forbidden rules, which is faster but only approximate. This matters most when there are few predictors, each with few cutpoints, so that the same predictor often cannot be re-used down a branch.