bartz.testing

Testing utilities to generate synthetic datasets.

Data generation

gen_data(key, *, n, p[, k, lambda_, offset, ...])

Generate data from a quadratic multivariate DGP.

gen_params(key, *, p, k, q[, lambda_, ...])

Sample DGP coefficients and parameters (no dependence on n).

gen_data_from_params(key, params, *, n)

Sample predictors and outcomes given fixed params.

Params(partition, beta_shared, ...[, het_shape])

All quantities of the data-generating process that do not depend on n.

DGP(x, y, z, mulin_shared, mulin_separate, ...)

Output of gen_data / gen_data_from_params: sampled data and parameters.

QuantizedData(x, y, max_split)

Output of DGP.quantize: data in the format expected by bartz.mcmcstep.init.

Distributions

Standardized distributions (mean 0, variance 1) for predictors and coefficients.

Distr()

Family of standardized distributions: mean 0, variance 1.

Normal()

Standard Normal distribution.

Uniform()

Continuous uniform distribution, standardized: U(-sqrt(3), sqrt(3)).
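
The endpoints ±sqrt(3) follow from the standard moments of a continuous uniform distribution. The sketch below checks this with the textbook formulas. It does not call bartz; the helper name `uniform_moments` is hypothetical and used for illustration only.

```python
import math


def uniform_moments(a: float, b: float) -> tuple[float, float]:
    """Return the (mean, variance) of the continuous uniform distribution U(a, b)."""
    mean = (a + b) / 2
    variance = (b - a) ** 2 / 12
    return mean, variance


half_width = math.sqrt(3)
mean, variance = uniform_moments(-half_width, half_width)
print(mean, variance)  # mean 0 and variance 1, up to floating-point rounding
```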

DiscreteUniform(m)

Uniform distribution on m equispaced levels, standardized.
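
With equal weights, the requirements of mean 0 and variance 1 pin down a set of m equispaced levels uniquely. The sketch below constructs that set from first principles. It is an independent illustration, not the bartz implementation; the helper name `standardized_levels` is hypothetical.

```python
import math


def standardized_levels(m: int) -> list[float]:
    """Return m equispaced levels with mean 0 and variance 1 under equal weights."""
    if m < 2:
        raise ValueError("at least 2 levels are needed to reach unit variance")
    # The integer levels 0, 1, ..., m-1 have mean (m-1)/2 and variance (m^2-1)/12,
    # so centering and dividing by the standard deviation standardizes them.
    center = (m - 1) / 2
    scale = math.sqrt((m * m - 1) / 12)
    return [(k - center) / scale for k in range(m)]


print(standardized_levels(2))  # [-1.0, 1.0]

levels = standardized_levels(4)
mean = sum(levels) / len(levels)
variance = sum(x * x for x in levels) / len(levels) - mean ** 2
print(mean, variance)  # mean 0 and variance 1, up to floating-point rounding
```

As m grows, the extreme levels approach ±sqrt(3), the endpoints of the standardized continuous uniform distribution.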

Scale distributions

Distributions of the per-predictor importance scales (unit mean square).

ScaleDistr()

Family of nonnegative scale distributions, normalized to E[s ** 2] = 1.

Constant()

Scales concentrated at 1 (all predictors equally important).

Gamma(alpha)

Gamma(alpha) scales, rescaled to E[s ** 2] = 1.
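
The rescaling can be worked out by hand under one assumption that the summary does not state: that Gamma(alpha) means shape alpha with unit scale. In that case E[g] = alpha and Var[g] = alpha, so E[g ** 2] = alpha * (alpha + 1), and dividing by its square root gives unit mean square. The sketch below checks this by simulation using only the Python standard library. The helper name `gamma_rescale_factor` is hypothetical, and bartz may parametrize the distribution differently.

```python
import math
import random


def gamma_rescale_factor(alpha: float) -> float:
    """Factor c such that s = c * g has E[s**2] = 1 for g ~ Gamma(shape=alpha, scale=1)."""
    # Assumption: unit-scale Gamma, for which E[g**2] = alpha * (alpha + 1).
    return 1.0 / math.sqrt(alpha * (alpha + 1.0))


rng = random.Random(0)  # fixed seed so the check is reproducible
alpha = 2.0
c = gamma_rescale_factor(alpha)
samples = [c * rng.gammavariate(alpha, 1.0) for _ in range(200_000)]
mean_square = sum(s * s for s in samples) / len(samples)
print(mean_square)  # close to 1, up to Monte Carlo error
```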

SpikeSlab(pi)

Two-point distribution over the scales 0 and 1/sqrt(pi), with probability pi on the nonzero scale.
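
For a two-point distribution on 0 and 1/sqrt(pi) to satisfy E[s ** 2] = 1, the nonzero value must carry probability pi. The sketch below computes the first two moments under that reading. It does not call bartz; the helper name `spike_slab_moments` is hypothetical.

```python
import math


def spike_slab_moments(pi: float) -> tuple[float, float]:
    """Return (E[s], E[s**2]) for s = 1/sqrt(pi) with probability pi, else s = 0."""
    if not 0.0 < pi <= 1.0:
        raise ValueError("pi must lie in (0, 1]")
    slab = 1.0 / math.sqrt(pi)
    mean = pi * slab  # equals sqrt(pi)
    mean_square = pi * slab ** 2  # equals 1 for every valid pi
    return mean, mean_square


print(spike_slab_moments(0.25))  # (0.5, 1.0)
```

A smaller pi leaves fewer predictors with a nonzero scale, each with a larger one. At pi = 1 every scale equals 1, which matches Constant().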