bartz.testing.Params

class bartz.testing.Params(partition, beta_shared, beta_separate, A_shared, A_separate, s, x_distr, beta_distr, A_distr, gamma_distr, error_distr, s_distr, q, lambda_, sigma2_lin, sigma2_quad, sigma2_eps, offset, sigma2_mean, sigma2_pop, sigma2_pri, gamma_shared, gamma_separate, het_strength, var_v, error_chol, outcome_type, het_shape=None)

All quantities of the data-generating process that do not depend on n.

The data follows a multivariate quadratic model whose ingredients are drawn i.i.d. from configurable families: standardized ones (mean 0, variance 1, see Distr) for the predictors \(X\) and the coefficient draws \(b, a, g\), and a scale family (\(E[s^2] = 1\), see ScaleDistr) for the per-predictor importances \(s\). With observations \(i = 1, \ldots, n\), predictors \(j, j' = 1, \ldots, p\) and outcome components \(c = 1, \ldots, k\):

\begin{align}
X_{ij} &\overset{\mathrm{i.i.d.}}\sim \mathtt{x\_distr}, \quad \kappa_X = E[X_{ij}^4], \\
s_j &\overset{\mathrm{i.i.d.}}\sim \mathtt{s\_distr}, \quad \mu_4 = E[s_j^4], \\
\{S_c\}_{c=1}^k &= \text{a random partition of } \{1, \ldots, p\}, \quad \lfloor p/k \rfloor \le |S_c| \le \lceil p/k \rceil, \\
\beta^{\mathrm{sh}}_j &= s_j \sqrt{\sigma^2_{\mathrm{lin}} / p}\; b_j, \quad b_j \overset{\mathrm{i.i.d.}}\sim \mathtt{beta\_distr}, \\
\beta^{\mathrm{sep}}_{cj} &= s_j\, \mathbb 1[j \in S_c] \sqrt{\sigma^2_{\mathrm{lin}} / (p/k)}\; b'_{cj}, \quad b'_{cj} \overset{\mathrm{i.i.d.}}\sim \mathtt{beta\_distr}, \\
P^{\mathrm{sh}}_{jj'} &= \mathbb 1[\min(|j - j'|,\, p - |j - j'|) \le q / 2], \quad q \bmod 2 = 0, \quad q < p, \\
P^{\mathrm{sep}}_{cjj'} &= \text{the same circular band within each } S_c \text{ by rank}, \quad q < \lfloor p/k \rfloor, \\
A^{\mathrm{sh}}_{jj'} &= s_j s_{j'}\, P^{\mathrm{sh}}_{jj'}\, \sigma_A\, a_{jj'}, \quad a_{jj'} \overset{\mathrm{i.i.d.}}\sim \mathtt{A\_distr}, \quad \sigma^2_A = \frac{\sigma^2_{\mathrm{quad}}} {p\, ((\kappa_X - 1)\mu_4 + q)}, \\
A^{\mathrm{sep}}_{cjj'} &= s_j s_{j'}\, P^{\mathrm{sep}}_{cjj'}\, \sigma'_A\, a'_{cjj'}, \quad a'_{cjj'} \overset{\mathrm{i.i.d.}}\sim \mathtt{A\_distr}, \quad \sigma'^2_A = \frac{\sigma^2_{\mathrm{quad}}} {(p/k)\, ((\kappa_X - 1)\mu_4 + q)}, \\
\mu^{\mathrm L}_{ci} &= \sqrt\lambda \textstyle\sum_j \beta^{\mathrm{sh}}_j X_{ij} + \sqrt{1 - \lambda} \textstyle\sum_j \beta^{\mathrm{sep}}_{cj} X_{ij}, \quad \lambda \in [0, 1], \\
\mu^{\mathrm Q}_{ci} &= \sqrt\lambda \textstyle\sum_{jj'} A^{\mathrm{sh}}_{jj'} X_{ij} X_{ij'} + \sqrt{1 - \lambda} \textstyle\sum_{jj'} A^{\mathrm{sep}}_{cjj'} X_{ij} X_{ij'}, \\
\gamma^{\mathrm{sh}}_j &= s_j\, g_j / \sqrt p, \quad g_j \overset{\mathrm{i.i.d.}}\sim \mathtt{gamma\_distr}, \quad \kappa_\gamma = E[g_j^4], \\
\gamma^{\mathrm{sep}}_{cj} &= s_j\, \mathbb 1[j \in S_c]\, g'_{cj} \big/ \sqrt{p/k}, \quad g'_{cj} \overset{\mathrm{i.i.d.}}\sim \mathtt{gamma\_distr}, \\
\eta_{ci} &= \begin{cases} \textstyle\sum_j \gamma^{\mathrm{sh}}_j X_{ij} & W \text{ scalar (same for every } c\text{)}, \\ \sqrt\lambda \textstyle\sum_j \gamma^{\mathrm{sh}}_j X_{ij} + \sqrt{1 - \lambda} \textstyle\sum_j \gamma^{\mathrm{sep}}_{cj} X_{ij} & W \text{ vector}, \end{cases} \\
W_{ci}^2 &= (1 - \rho) + \rho\, \eta_{ci}^2, \quad \rho \in [0, 1], \\
\mu_{ci} &= o_c + \mu^{\mathrm L}_{ci} + \mu^{\mathrm Q}_{ci}, \quad o_c \in \mathbb R, \\
U_{\cdot i} &\overset{\mathrm{i.i.d.}}\sim N(0, R), \quad R = \operatorname{corr}(\mathtt{error\_corr}), \quad R = I \text{ by default}, \\
\varepsilon_{ci} &= F^{-1}_{\mathtt{error\_distr}}(\Phi(U_{ci})) \quad (\text{Gaussian copula; } \varepsilon_{ci} = U_{ci} \text{ for the default } N), \\
Z_{ci} &= \mu_{ci} + \sigma_{\mathrm{eps}}\, W_{ci}\, \varepsilon_{ci}, \\
Y_{ci} &= \begin{cases} Z_{ci} & c \text{ continuous}, \\ \mathbb 1[Z_{ci} > 0] & c \text{ binary}. \end{cases}
\end{align}
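As a rough illustration of how the ingredients above combine, the following NumPy sketch simulates the simplest configuration: univariate (k = 1, lambda = 1), homoskedastic, with standard Normal families for the predictors, coefficients and errors, and constant scales s_j = 1. The function simulate_univariate and its arguments are hypothetical names chosen for this sketch; they are not part of bartz, and the library's own generator (gen_data) may differ in detail.

```python
import numpy as np

def simulate_univariate(n, p, q, sigma2_lin, sigma2_quad, sigma2_eps,
                        offset=0.0, seed=0):
    """Sketch of the k = 1, lambda = 1, homoskedastic quadratic model."""
    if q % 2 != 0 or not 0 <= q < p:
        raise ValueError("q must be even and satisfy 0 <= q < p")
    rng = np.random.default_rng(seed)
    # Assumed families: Normal predictors (E[X^4] = 3), constant scales (E[s^4] = 1).
    kappa_x, mu4 = 3.0, 1.0

    x = rng.standard_normal((n, p))                          # X_ij
    beta = np.sqrt(sigma2_lin / p) * rng.standard_normal(p)  # beta^sh_j

    # Circular band mask P^sh: q / 2 neighbours on each side plus the diagonal.
    j = np.arange(p)
    dist = np.abs(j[:, None] - j[None, :])
    band = np.minimum(dist, p - dist) <= q // 2

    # Quadratic coefficients A^sh, scaled so the quadratic budget is sigma2_quad.
    sigma2_a = sigma2_quad / (p * ((kappa_x - 1.0) * mu4 + q))
    a = band * np.sqrt(sigma2_a) * rng.standard_normal((p, p))

    # mu_i = offset + sum_j beta_j X_ij + sum_{jj'} A_jj' X_ij X_ij'
    mu = offset + x @ beta + np.einsum('ij,jk,ik->i', x, a, x)
    z = mu + np.sqrt(sigma2_eps) * rng.standard_normal(n)    # latent Z_i
    return x, z

x, z = simulate_univariate(n=50, p=8, q=2, sigma2_lin=1.0,
                           sigma2_quad=1.0, sigma2_eps=1.0)
```

The multivariate and heteroskedastic cases add the separate coefficients, the partition and the noise multiplier W on top of this skeleton.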

A binary component thresholds its own latent \(Z_{ci}\) at zero, so, conditionally on the predictors and the coefficients, its success probability is \(F(\mu_{ci} / (\sigma_{\mathrm{eps}} W_{ci}))\), where \(F\) is the CDF of error_distr (symmetric, and the Normal \(\Phi\) by default). The latent shares \(\sigma^2_{\mathrm{eps}}\) with the continuous components, unlike the unit-variance probit convention of bartz.mcmcstep.init. Predictor families with \(\kappa_X = 1\) (binary predictors, DiscreteUniform with m=2) have constant squares, so they require \(q \ge 2\) to keep the quadratic budget well defined: with \(q = 0\) the denominator \((\kappa_X - 1)\mu_4 + q\) of \(\sigma^2_A\) vanishes. Univariate outcomes are the \(k = 1\), \(\lambda = 1\) special case with the component axis dropped, and only the scalar \(W\) is available.
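The success probability of a binary component can be checked numerically. The sketch below assumes the default Normal error family, so that F is the standard Normal CDF; success_probability is a hypothetical helper written for this illustration, not a bartz function.

```python
from statistics import NormalDist

def success_probability(mu, sigma2_eps, error_scale=1.0):
    """P(Z > 0) for Z = mu + sqrt(sigma2_eps) * error_scale * eps, eps ~ N(0, 1)."""
    # By symmetry of the Normal, P(eps > -t) = Phi(t) with t = mu / (sigma_eps * W).
    return NormalDist().cdf(mu / (sigma2_eps ** 0.5 * error_scale))

p_shared = success_probability(2.0, sigma2_eps=4.0)  # latent noise variance 4
p_probit = success_probability(2.0, sigma2_eps=1.0)  # unit-variance probit convention
```

For the same latent mean, a larger sigma2_eps or error_scale pulls the probability toward 1/2, which is why the two conventions give different probabilities.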

Write \(\theta\) for all the sampled coefficients, and \(E[\,\cdot \mid \theta]\), \(\operatorname{Var}[\,\cdot \mid \theta]\) for the population mean and variance of one dataset (over \(X\) and noise at fixed \(\theta\)); moments without conditioning also average over \(\theta\). The derived expectations and variances hold for every \(\lambda\) unless a restriction is stated:

\begin{align}
E[Z_{ci}] &= o_c, \\
\operatorname{Cov}[\mu^{\mathrm L}_{ci}, \mu^{\mathrm Q}_{ci} \mid \theta] &= \operatorname{Cov}[\mu^{\mathrm L}_{ci}, \mu^{\mathrm Q}_{ci}] = 0, \\
\operatorname{Cov}[\mu_{ci}, \mu_{c'i} \mid \theta] &= \operatorname{Cov}[\mu_{ci}, \mu_{c'i}] = 0 \quad (c \ne c',\ \lambda = 0), \\
\operatorname{Cov}[Z_{ci}, Z_{c'i} \mid \theta, X] &= \sigma^2_{\mathrm{eps}}\, W_{ci} W_{c'i}\, \operatorname{Cov}[\varepsilon_{ci}, \varepsilon_{c'i}], \quad |\operatorname{Cov}[\varepsilon_{ci}, \varepsilon_{c'i}]| \le |R_{cc'}|\ (\text{equality for } N), \\
E[\operatorname{Var}[Z_{ci} \mid \theta]] &= \sigma^2_{\mathrm{lin}} + \sigma^2_{\mathrm{quad}} + \sigma^2_{\mathrm{eps}} \quad \text{(expected population variance)}, \\
\operatorname{Var}[E[Z_{ci} \mid \theta]] &= \frac{\sigma^2_{\mathrm{quad}}\, \mu_4}{(\kappa_X - 1)\mu_4 + q} \quad \text{(variance of the expected mean)}, \\
\operatorname{Var}[Z_{ci}] &= E[\operatorname{Var}[Z_{ci} \mid \theta]] + \operatorname{Var}[E[Z_{ci} \mid \theta]] \quad \text{(prior variance)}, \\
E[W_{ci}^2] &= 1, \qquad \operatorname{Var}[W_{ci}^2] = \rho^2 (2 + e), \\
e &= \begin{cases} \dfrac{\kappa_\gamma \kappa_X \mu_4 - 3}{p} & W \text{ scalar}, \\[2ex] \dfrac{(\lambda^2 + (1 - \lambda)^2 k) (\kappa_\gamma \kappa_X \mu_4 - 3) + 6 \lambda (1 - \lambda) (\kappa_X \mu_4 - 1)}{p} \\ \quad + \dfrac{3 (1 - \lambda)^2\, r (k - r)}{p^2}, \quad r = p \bmod k & W \text{ vector}. \end{cases}
\end{align}
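The three variance summaries follow from the hyperparameters by plain arithmetic. The sketch below evaluates the formulas above; derived_variances is a hypothetical helper for illustration, and its return values correspond to the sigma2_mean, sigma2_pop and sigma2_pri attributes.

```python
def derived_variances(sigma2_lin, sigma2_quad, sigma2_eps, kappa_x, mu4, q):
    """Return (sigma2_mean, sigma2_pop, sigma2_pri) from the closed forms."""
    denom = (kappa_x - 1.0) * mu4 + q
    if denom <= 0:
        # Happens for kappa_x = 1 with q = 0: the quadratic budget is undefined.
        raise ValueError("(kappa_x - 1) * mu4 + q must be positive")
    sigma2_pop = sigma2_lin + sigma2_quad + sigma2_eps  # E[Var[Z | theta]]
    sigma2_mean = sigma2_quad * mu4 / denom             # Var[E[Z | theta]]
    sigma2_pri = sigma2_pop + sigma2_mean               # Var[Z], law of total variance
    return sigma2_mean, sigma2_pop, sigma2_pri

# Normal predictors (kappa_x = 3), constant scales (mu4 = 1), q = 2 interactions.
sigma2_mean, sigma2_pop, sigma2_pri = derived_variances(1.0, 1.0, 1.0, 3.0, 1.0, 2)
```

With these inputs the denominator is 4, so sigma2_mean is 0.25, sigma2_pop is 3.0 and sigma2_pri is 3.25.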

The mathematical symbols and cases map to class attributes and gen_data settings as follows:

Symbol / case	Attribute / setting
\(X_{ij}\)	`DGP.x`
\(\kappa_X\)	`x_distr.kurtosis`
\(s_j\)	`s`
\(\mu_4\)	`s_distr.fourth_moment`
\(\{S_c\}\)	`partition`
\(b_j, b'_{cj}\)	`beta_distr`
\(\beta^{\mathrm{sh}}\)	`beta_shared`
\(\beta^{\mathrm{sep}}\)	`beta_separate`
\(a_{jj'}, a'_{cjj'}\)	`A_distr`
\(A^{\mathrm{sh}}\)	`A_shared`
\(A^{\mathrm{sep}}\)	`A_separate`
\(g_j, g'_{cj}\)	`gamma_distr`
\(\kappa_\gamma\)	`gamma_distr.kurtosis`
\(\gamma^{\mathrm{sh}}\)	`gamma_shared`
\(\gamma^{\mathrm{sep}}\)	`gamma_separate`
\(q\)	`q`
\(\lambda\)	`lambda_`
\(\sigma^2_{\mathrm{lin}}\)	`sigma2_lin`
\(\sigma^2_{\mathrm{quad}}\)	`sigma2_quad`
\(\sigma^2_{\mathrm{eps}}\)	`sigma2_eps`
\(o_c\)	`offset`
\(\varepsilon_{ci}\)	`error_distr` (marginal family)
\(R\)	`error_corr` (normalized to unit diagonal; `error_chol` is its Cholesky factor)
\(\operatorname{Var}[E[Z_{ci} \mid \theta]]\)	`sigma2_mean`
\(E[\operatorname{Var}[Z_{ci} \mid \theta]]\)	`sigma2_pop`
\(\operatorname{Var}[Z_{ci}]\)	`sigma2_pri`
\(\rho\)	`het_strength`
\(\operatorname{Var}[W_{ci}^2]\)	`var_v`
\(W_{ci}\)	`DGP.error_scale`
\(W\) scalar	`het_shape='scalar'` (`DGP.error_scale` of shape `(n,)`)
\(W\) vector	`het_shape='vector'`, multivariate only (`DGP.error_scale` of shape `(k, n)`)
\(W_{ci} \equiv 1\) (\(\rho = 0\))	`het_shape=None` (the \(W\)-related attributes are `None`)
\(Z_{ci}\)	`DGP.z`
\(Y_{ci}\)	`DGP.y`
\(c\) continuous / binary	`outcome_type`
univariate (\(k = 1\), \(\lambda = 1\))	`k=None` in `gen_data` (`partition`, `beta_separate`, `A_separate` and `lambda_` are `None`)

partition: Bool[Array, 'k p'] | None: Predictor-outcome assignment partition of shape (k, p), used only at lambda_ < 1. Row i is the boolean mask of the predictors assigned to component i; rows are disjoint and each has either p // k or p // k + 1 True entries. None in univariate mode (k is None).
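One way to build a mask with these properties is sketched below in NumPy. This is an assumed construction for illustration only (random_partition is a hypothetical name); the way bartz actually draws the partition is not specified here.

```python
import numpy as np

def random_partition(p, k, seed=0):
    """Return a (k, p) boolean mask splitting p predictors into k balanced groups."""
    rng = np.random.default_rng(seed)
    labels = np.arange(p) % k  # group sizes differ by at most one
    rng.shuffle(labels)        # randomize which predictors land in which group
    # Row c is True exactly where the predictor's label equals c.
    return labels[None, :] == np.arange(k)[:, None]

mask = random_partition(p=7, k=3)
```

Each column of mask contains exactly one True, so the rows are disjoint and jointly cover all predictors.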

beta_shared: Float[Array, 'p']: Shared linear coefficients of shape (p,), used at lambda_ > 0.

beta_separate: Float[Array, 'k p'] | None: Separate linear coefficients of shape (k, p), used at lambda_ < 1. Row i is supported on partition[i]. None in univariate mode (k is None).

A_shared: Float[Array, 'p p']: Shared quadratic coefficients of shape (p, p), used at lambda_ > 0. Nonzero on a symmetric circular band of q + 1 entries per row and per column (the diagonal plus q / 2 neighbours on each side, wrapping around).
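The support of A_shared is the circular band defined by the indicator \(P^{\mathrm{sh}}\) in the class description. A minimal NumPy sketch of that mask, with circular_band as a hypothetical helper name:

```python
import numpy as np

def circular_band(p, q):
    """Boolean (p, p) mask: True where the circular distance |j - j'| is <= q / 2."""
    if q % 2 != 0 or not 0 <= q < p:
        raise ValueError("q must be even and satisfy 0 <= q < p")
    j = np.arange(p)
    dist = np.abs(j[:, None] - j[None, :])
    # min(d, p - d) is the distance between indices on a ring of length p.
    return np.minimum(dist, p - dist) <= q // 2

band = circular_band(p=6, q=2)
```

For p = 6 and q = 2 every row has three True entries, and the band wraps around so that predictors 0 and 5 interact.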

A_separate: Float[Array, 'k p p'] | None: Separate quadratic coefficients of shape (k, p, p), used at lambda_ < 1. Slice i is supported on the outer product of partition[i] with itself. None in univariate mode (k is None).

s: Float[Array, 'p']: Per-predictor importance scales of shape (p,), with E[s_j ** 2] = 1. Already folded into beta_*, A_* and gamma_*; equivalent to scaling predictor j by s_j. All ones when s_distr is Constant.
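The equivalence between folding s into the coefficients and rescaling the predictors can be verified directly. The NumPy sketch below uses arbitrary positive scales and random coefficients purely for illustration; none of the variable names come from bartz.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 5, 4
x = rng.standard_normal((n, p))
s = rng.uniform(0.5, 1.5, size=p)  # illustrative scales, not normalized
b = rng.standard_normal(p)         # unscaled linear draws
a = rng.standard_normal((p, p))    # unscaled quadratic draws

# Linear term: fold s into the coefficients, or scale the predictors instead.
lin_folded = x @ (s * b)
lin_scaled = (x * s) @ b

# Quadratic term: s_j s_j' multiplies A_jj', or each predictor is scaled by s_j.
quad_folded = np.einsum('ij,jk,ik->i', x, s[:, None] * a * s[None, :], x)
quad_scaled = np.einsum('ij,jk,ik->i', x * s, a, x * s)
```

Both pairs agree up to floating-point rounding.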

x_distr: Distr: Distribution family of the predictors. Families with kurtosis 1 (binary predictors) require q >= 2 because their squares are constant.

beta_distr: Distr: Distribution family of the linear coefficient draws b.

A_distr: Distr: Distribution family of the quadratic coefficient draws a.

gamma_distr: Distr: Distribution family of the noise projection draws g. Its kurtosis enters var_v.

error_distr: Distr: Marginal distribution family of the additive errors. The errors are sampled through a Gaussian copula (the error_corr dependence), so this sets each component’s marginal while leaving its variance at sigma2_eps. Normal (default) recovers jointly-Normal errors.
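The Gaussian copula step can be sketched in NumPy as follows: draw correlated Normals, map them to uniforms with the Normal CDF, then apply the inverse CDF of the target marginal. The helper names are hypothetical, and the unit-variance uniform marginal is only a convenient stand-in with a closed-form inverse CDF; whether bartz offers such a family is not assumed.

```python
import math
import numpy as np

def gaussian_copula_errors(n, corr, inv_cdf, seed=0):
    """Draw a (k, n) array of errors with Gaussian-copula dependence."""
    corr = np.asarray(corr, dtype=float)
    rng = np.random.default_rng(seed)
    chol = np.linalg.cholesky(corr)                     # corr = chol @ chol.T
    u = chol @ rng.standard_normal((corr.shape[0], n))  # each column ~ N(0, corr)
    erf = np.vectorize(math.erf)
    unif = 0.5 * (1.0 + erf(u / math.sqrt(2.0)))        # Phi(u): uniform marginals
    return inv_cdf(unif)                                # target marginals

def uniform_inv_cdf(v):
    """Inverse CDF of the uniform on [-sqrt(3), sqrt(3)] (mean 0, variance 1)."""
    return math.sqrt(3.0) * (2.0 * v - 1.0)

eps = gaussian_copula_errors(2000, [[1.0, 0.8], [0.8, 1.0]], uniform_inv_cdf)
```

The marginals stay standardized, while the correlation between components is at most the Normal one in absolute value, in line with the bound stated in the class description.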

s_distr: ScaleDistr: Scale family of the importance scales s. More dispersed scales make the dependence on the predictors sparser; Constant is uniform importance (s_j = 1). ScaleDistr.from_peff parametrizes the dispersion by an effective number of active predictors.

q: Integer[Array, '']: Number of quadratic interactions per predictor. Must be even and less than p // k (less than p when univariate).

lambda_: Float[Array, ''] | None: Coupling parameter in [0, 1]. At 0 the components have independent mean functions; at 1 they share the same mean function (up to offset). None if and only if univariate (partition is None), in which case only the shared path contributes to mu.
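The role of lambda_ is easiest to see on the mixing rule itself: the shared term is weighted by sqrt(lambda) and the separate term by sqrt(1 - lambda), so the squared weights always sum to one. A small NumPy sketch, with mix as a hypothetical helper and made-up term values:

```python
import numpy as np

def mix(shared, separate, lam):
    """Combine a shared term of shape (n,) with separate terms of shape (k, n)."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    # Squared weights sum to 1, so the variance budget does not depend on lam.
    return np.sqrt(lam) * shared[None, :] + np.sqrt(1.0 - lam) * separate

shared = np.array([1.0, 2.0, 3.0])
separate = np.array([[1.0, 0.0, -1.0],
                     [0.0, 1.0, 0.0]])

identical = mix(shared, separate, 1.0)    # every row equals the shared term
independent = mix(shared, separate, 0.0)  # only the separate terms remain
```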

sigma2_lin: Float[Array, '']: Prior and expected population variance of the linear term of mu.

sigma2_quad: Float[Array, '']: Expected population variance of the quadratic term of mu.

sigma2_eps: Float[Array, '']: Variance of the additive error.

offset: Float[Array, ''] | Float[Array, 'k']: Constant added to the latent mean mu, shifting E[z] away from 0. Either a scalar (the same shift for every component) or a length-k vector (a per-component shift, multivariate only). Applied after the linear and quadratic terms, so for binary components it shifts the threshold and hence the success probability; defaults to 0.
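The two accepted shapes of offset differ only in how they broadcast against a (k, n) latent mean. A minimal NumPy sketch, where add_offset is a hypothetical helper rather than a bartz function:

```python
import numpy as np

def add_offset(mu, offset):
    """Add a scalar or length-k offset to a (k, n) array of latent means."""
    offset = np.asarray(offset, dtype=float)
    if offset.ndim == 1:
        offset = offset[:, None]  # (k,) -> (k, 1): one shift per component
    return mu + offset

mu = np.zeros((2, 3))
same_shift = add_offset(mu, 1.5)                 # every component shifted by 1.5
per_component = add_offset(mu, [1.0, -1.0])      # row 0 up by 1, row 1 down by 1
```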

sigma2_mean: Float[Array, '']: Variance of the expected mean function.

sigma2_pop: Float[Array, '']: Expected population variance of the latent z.

sigma2_pri: Float[Array, '']: Prior variance of the latent z.

gamma_shared: Float[Array, 'p'] | None: Shared coefficients of shape (p,) of the latent projection eta, drawn like beta_shared with a unit budget that cancels in the normalization. None when homoskedastic (het_shape is None).

gamma_separate: Float[Array, 'k p'] | None: Separate projection coefficients of shape (k, p), used only for vector heteroskedasticity (het_shape == 'vector'). None otherwise.

het_strength: Float[Array, ''] | None: Heteroskedasticity knob rho in [0, 1], the fraction of the (expected) noise variance carried by the heteroskedastic term. 0 is homoskedastic (error_scale == 1), 1 is maximally heterogeneous. None when homoskedastic.
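The noise multiplier interpolates between a constant and the squared projection: W^2 = (1 - rho) + rho * eta^2. The NumPy sketch below uses a hand-picked eta whose squares average to one; error_scale here is a hypothetical stand-alone function named after the attribute it mimics.

```python
import numpy as np

def error_scale(eta, rho):
    """Noise multiplier W with W**2 = (1 - rho) + rho * eta**2."""
    if not 0.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [0, 1]")
    return np.sqrt((1.0 - rho) + rho * eta ** 2)

# Illustrative projection values with mean(eta**2) == 1.
eta = np.array([0.0, np.sqrt(2.0), 0.0, -np.sqrt(2.0)])

w_homo = error_scale(eta, 0.0)  # all ones: homoskedastic
w_mid = error_scale(eta, 0.7)   # mean of w_mid**2 stays at 1
w_full = error_scale(eta, 1.0)  # equals |eta|: maximally heterogeneous
```

Because the weights (1 - rho) and rho sum to one, the average of W^2 is unchanged by rho whenever eta^2 averages to one, which is what keeps the expected noise variance at sigma2_eps.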

var_v: Float[Array, ''] | None: Fully marginal variance Var[error_scale ** 2] of the noise multiplier (the closed form is given in the class description above): a fixed scalar set by the hyperparameters, identical for every component. None when homoskedastic.
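The closed form for Var[W^2] from the class description can be transcribed directly. In this sketch var_v is a hypothetical function named after the attribute; passing k=None selects the scalar-W case, and an integer k selects the vector-W case.

```python
def var_v(rho, p, kappa_gamma, kappa_x, mu4, k=None, lam=1.0):
    """Var[W**2] = rho**2 * (2 + e), with e taken from the closed form."""
    excess = kappa_gamma * kappa_x * mu4 - 3.0
    if k is None:
        # Scalar W: a single projection over all p predictors.
        e = excess / p
    else:
        # Vector W: shared and separate projections mixed by lam.
        r = p % k
        e = ((lam ** 2 + (1.0 - lam) ** 2 * k) * excess
             + 6.0 * lam * (1.0 - lam) * (kappa_x * mu4 - 1.0)) / p
        e += 3.0 * (1.0 - lam) ** 2 * r * (k - r) / p ** 2
    return rho ** 2 * (2.0 + e)

# Normal gamma and x families (kurtosis 3 each), constant scales (mu4 = 1).
v_scalar = var_v(rho=0.5, p=6, kappa_gamma=3.0, kappa_x=3.0, mu4=1.0)
v_vector = var_v(rho=0.5, p=6, kappa_gamma=3.0, kappa_x=3.0, mu4=1.0, k=3, lam=1.0)
```

With these inputs e = 6 / 6 = 1, so both calls give 0.25 * 3 = 0.75; at lam = 1 the vector case reduces to the scalar one.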

error_chol: Float[Array, 'k k'] | None: Lower-triangular Cholesky factor L of the across-component error correlation matrix R = L @ L.T (the error_corr argument normalized to unit diagonal). None when the errors are independent (error_corr was None), including every univariate outcome.
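The normalization and factorization described here can be sketched with NumPy. corr_cholesky is a hypothetical helper, and the input matrix is an arbitrary positive-definite example; how bartz validates error_corr is not assumed.

```python
import numpy as np

def corr_cholesky(error_corr):
    """Normalize a positive-definite matrix to unit diagonal and factor it."""
    cov = np.asarray(error_corr, dtype=float)
    d = np.sqrt(np.diag(cov))
    corr = cov / np.outer(d, d)      # R, with ones on the diagonal
    return np.linalg.cholesky(corr)  # lower-triangular L with R = L @ L.T

chol = corr_cholesky([[4.0, 1.0],
                      [1.0, 1.0]])
corr = chol @ chol.T
```

Here the off-diagonal entry becomes 1 / (2 * 1) = 0.5 after normalization.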

outcome_type: OutcomeType | tuple[OutcomeType, ...]: Per-component outcome type, either a single OutcomeType applied to every row or a tuple of length k for mixed outcomes. For binary components the continuous latent mu + eps * sqrt(sigma2_eps) * error_scale is thresholded at 0, yielding 0.0/1.0 floats. Unlike the standard probit convention used by bartz.mcmcstep.init (which fixes the latent noise variance to 1), here the binary latents share the same sigma2_eps as the continuous ones, so with the default Normal error_distr the success probability at a given mu is Phi(mu / (sqrt(sigma2_eps) * error_scale)) (with error_scale 1 when homoskedastic); other symmetric error families replace Phi with their own CDF.
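Thresholding with mixed outcome types amounts to a row-wise selection. The NumPy sketch below represents the per-component type by plain booleans as a stand-in for OutcomeType, whose members are not listed here; apply_outcome_type is a hypothetical helper.

```python
import numpy as np

def apply_outcome_type(z, is_binary):
    """Threshold the binary rows of a (k, n) latent array at 0, keep the others."""
    z = np.asarray(z, dtype=float)
    is_binary = np.asarray(is_binary, dtype=bool)[:, None]  # (k, 1) for broadcasting
    thresholded = (z > 0).astype(z.dtype)                   # 0.0 / 1.0 floats
    return np.where(is_binary, thresholded, z)

z = np.array([[-1.0, 0.0, 2.0],
              [-1.0, 0.0, 2.0]])
y = apply_outcome_type(z, is_binary=[True, False])  # row 0 binary, row 1 continuous
```

The inequality is strict, so a latent value of exactly 0 maps to 0.0.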

het_shape: Literal['scalar', 'vector'] | None = None: Heteroskedasticity mode. None is homoskedastic; 'scalar' gives one error_scale per datapoint of shape (n,) scaling the whole outcome vector; 'vector' (multivariate only) gives per-component scales of shape (k, n).

bartz 0.11.0
