bartz.testing.Params
- class bartz.testing.Params(partition, beta_shared, beta_separate, A_shared, A_separate, s, x_distr, beta_distr, A_distr, gamma_distr, error_distr, s_distr, q, lambda_, sigma2_lin, sigma2_quad, sigma2_eps, offset, sigma2_mean, sigma2_pop, sigma2_pri, gamma_shared, gamma_separate, het_strength, var_v, error_chol, outcome_type, het_shape=None)
All quantities of the data-generating process that do not depend on n.

The data follows a multivariate quadratic model whose ingredients are drawn i.i.d. from configurable families: standardized ones (mean 0, variance 1, see Distr) for the predictors \(X\) and the coefficient draws \(b, a, g\), and a scale family (\(E[s^2] = 1\), see ScaleDistr) for the per-predictor importances \(s\). With observations \(i = 1, \ldots, n\), predictors \(j, j' = 1, \ldots, p\) and outcome components \(c = 1, \ldots, k\):

\begin{align}
X_{ij} &\overset{\mathrm{i.i.d.}}\sim \mathtt{x\_distr}, \quad \kappa_X = E[X_{ij}^4], \\
s_j &\overset{\mathrm{i.i.d.}}\sim \mathtt{s\_distr}, \quad \mu_4 = E[s_j^4], \\
\{S_c\}_{c=1}^k &= \text{a random partition of } \{1, \ldots, p\}, \quad \lfloor p/k \rfloor \le |S_c| \le \lceil p/k \rceil, \\
\beta^{\mathrm{sh}}_j &= s_j \sqrt{\sigma^2_{\mathrm{lin}} / p}\; b_j, \quad b_j \overset{\mathrm{i.i.d.}}\sim \mathtt{beta\_distr}, \\
\beta^{\mathrm{sep}}_{cj} &= s_j\, \mathbb 1[j \in S_c] \sqrt{\sigma^2_{\mathrm{lin}} / (p/k)}\; b'_{cj}, \quad b'_{cj} \overset{\mathrm{i.i.d.}}\sim \mathtt{beta\_distr}, \\
P^{\mathrm{sh}}_{jj'} &= \mathbb 1[\min(|j - j'|,\, p - |j - j'|) \le q / 2], \quad q \bmod 2 = 0, \quad q < p, \\
P^{\mathrm{sep}}_{cjj'} &= \text{the same circular band within each } S_c \text{ by rank}, \quad q < \lfloor p/k \rfloor, \\
A^{\mathrm{sh}}_{jj'} &= s_j s_{j'}\, P^{\mathrm{sh}}_{jj'}\, \sigma_A\, a_{jj'}, \quad a_{jj'} \overset{\mathrm{i.i.d.}}\sim \mathtt{A\_distr}, \quad \sigma^2_A = \frac{\sigma^2_{\mathrm{quad}}} {p\, ((\kappa_X - 1)\mu_4 + q)}, \\
A^{\mathrm{sep}}_{cjj'} &= s_j s_{j'}\, P^{\mathrm{sep}}_{cjj'}\, \sigma'_A\, a'_{cjj'}, \quad a'_{cjj'} \overset{\mathrm{i.i.d.}}\sim \mathtt{A\_distr}, \quad \sigma'^2_A = \frac{\sigma^2_{\mathrm{quad}}} {(p/k)\, ((\kappa_X - 1)\mu_4 + q)}, \\
\mu^{\mathrm L}_{ci} &= \sqrt\lambda \textstyle\sum_j \beta^{\mathrm{sh}}_j X_{ij} + \sqrt{1 - \lambda} \textstyle\sum_j \beta^{\mathrm{sep}}_{cj} X_{ij}, \quad \lambda \in [0, 1], \\
\mu^{\mathrm Q}_{ci} &= \sqrt\lambda \textstyle\sum_{jj'} A^{\mathrm{sh}}_{jj'} X_{ij} X_{ij'} + \sqrt{1 - \lambda} \textstyle\sum_{jj'} A^{\mathrm{sep}}_{cjj'} X_{ij} X_{ij'}, \\
\gamma^{\mathrm{sh}}_j &= s_j\, g_j / \sqrt p, \quad g_j \overset{\mathrm{i.i.d.}}\sim \mathtt{gamma\_distr}, \quad \kappa_\gamma = E[g_j^4], \\
\gamma^{\mathrm{sep}}_{cj} &= s_j\, \mathbb 1[j \in S_c]\, g'_{cj} \big/ \sqrt{p/k}, \quad g'_{cj} \overset{\mathrm{i.i.d.}}\sim \mathtt{gamma\_distr}, \\
\eta_{ci} &= \begin{cases}
\textstyle\sum_j \gamma^{\mathrm{sh}}_j X_{ij} & W \text{ scalar (same for every } c\text{)}, \\
\sqrt\lambda \textstyle\sum_j \gamma^{\mathrm{sh}}_j X_{ij} + \sqrt{1 - \lambda} \textstyle\sum_j \gamma^{\mathrm{sep}}_{cj} X_{ij} & W \text{ vector},
\end{cases} \\
W_{ci}^2 &= (1 - \rho) + \rho\, \eta_{ci}^2, \quad \rho \in [0, 1], \\
\mu_{ci} &= o_c + \mu^{\mathrm L}_{ci} + \mu^{\mathrm Q}_{ci}, \quad o_c \in \mathbb R, \\
U_{\cdot i} &\overset{\mathrm{i.i.d.}}\sim N(0, R), \quad R = \operatorname{corr}(\mathtt{error\_corr}), \quad R = I \text{ by default}, \\
\varepsilon_{ci} &= F^{-1}_{\mathtt{error\_distr}}(\Phi(U_{ci})) \quad (\text{Gaussian copula; } \varepsilon_{ci} = U_{ci} \text{ for the default } N), \\
Z_{ci} &= \mu_{ci} + \sigma_{\mathrm{eps}}\, W_{ci}\, \varepsilon_{ci}, \\
Y_{ci} &= \begin{cases}
Z_{ci} & c \text{ continuous}, \\
\mathbb 1[Z_{ci} > 0] & c \text{ binary}.
\end{cases}
\end{align}

A binary component thresholds its own latent \(Z_{ci}\) at zero, so its success probability is \(F(\mu_{ci} / (\sigma_{\mathrm{eps}} W_{ci}))\), with \(F\) the (symmetric) error_distr CDF, the Normal \(\Phi\) by default. The latent shares \(\sigma^2_{\mathrm{eps}}\) with the continuous components, unlike the unit-variance probit convention of bartz.mcmcstep.init. Predictor families with \(\kappa_X = 1\) (binary predictors, DiscreteUniform with m=2) have constant squares, so the denominator \((\kappa_X - 1)\mu_4 + q\) reduces to \(q\) and they require \(q \ge 2\) to keep the quadratic budget well defined. Univariate outcomes are the \(k = 1\), \(\lambda = 1\) special case with the component axis dropped, and only the scalar \(W\) is available.

Writing \(\theta\) for all the sampled coefficients and \(E[\,\cdot \mid \theta]\), \(\operatorname{Var}[\,\cdot \mid \theta]\) for the population mean and variance of one dataset (over \(X\) and noise at fixed \(\theta\)), the derived expectations and variances are, for every \(\lambda\):
\begin{align}
E[Z_{ci}] &= o_c, \\
\operatorname{Cov}[\mu^{\mathrm L}_{ci}, \mu^{\mathrm Q}_{ci} \mid \theta] &= \operatorname{Cov}[\mu^{\mathrm L}_{ci}, \mu^{\mathrm Q}_{ci}] = 0, \\
\operatorname{Cov}[\mu_{ci}, \mu_{c'i} \mid \theta] &= \operatorname{Cov}[\mu_{ci}, \mu_{c'i}] = 0 \quad (c \ne c',\ \lambda = 0), \\
\operatorname{Cov}[Z_{ci}, Z_{c'i} \mid \theta, X] &= \sigma^2_{\mathrm{eps}}\, W_{ci} W_{c'i}\, \operatorname{Cov}[\varepsilon_{ci}, \varepsilon_{c'i}], \quad |\operatorname{Cov}[\varepsilon_{ci}, \varepsilon_{c'i}]| \le |R_{cc'}|\ (\text{equality for } N), \\
E[\operatorname{Var}[Z_{ci} \mid \theta]] &= \sigma^2_{\mathrm{lin}} + \sigma^2_{\mathrm{quad}} + \sigma^2_{\mathrm{eps}} \quad \text{(expected population variance)}, \\
\operatorname{Var}[E[Z_{ci} \mid \theta]] &= \frac{\sigma^2_{\mathrm{quad}}\, \mu_4}{(\kappa_X - 1)\mu_4 + q} \quad \text{(variance of the expected mean)}, \\
\operatorname{Var}[Z_{ci}] &= E[\operatorname{Var}[Z_{ci} \mid \theta]] + \operatorname{Var}[E[Z_{ci} \mid \theta]] \quad \text{(prior variance)}, \\
E[W_{ci}^2] &= 1, \qquad \operatorname{Var}[W_{ci}^2] = \rho^2 (2 + e), \\
e &= \begin{cases}
\dfrac{\kappa_\gamma \kappa_X \mu_4 - 3}{p} & W \text{ scalar}, \\[2ex]
\dfrac{(\lambda^2 + (1 - \lambda)^2 k) (\kappa_\gamma \kappa_X \mu_4 - 3) + 6 \lambda (1 - \lambda) (\kappa_X \mu_4 - 1)}{p} \\
\quad + \dfrac{3 (1 - \lambda)^2\, r (k - r)}{p^2}, \quad r = p \bmod k & W \text{ vector}.
\end{cases}
\end{align}

The mathematical symbols and cases map to class attributes and
gen_data settings as follows:

| Symbol / case | Attribute / setting |
|---|---|
| \(X_{ij}\) | |
| \(\kappa_X\) | x_distr.kurtosis |
| \(s_j\) | s |
| \(\mu_4\) | s_distr.fourth_moment |
| \(\{S_c\}\) | partition |
| \(b_j, b'_{cj}\) | |
| \(\beta^{\mathrm{sh}}\) | beta_shared |
| \(\beta^{\mathrm{sep}}\) | beta_separate |
| \(a_{jj'}, a'_{cjj'}\) | |
| \(A^{\mathrm{sh}}\) | A_shared |
| \(A^{\mathrm{sep}}\) | A_separate |
| \(g_j, g'_{cj}\) | |
| \(\kappa_\gamma\) | gamma_distr.kurtosis |
| \(\gamma^{\mathrm{sh}}\) | gamma_shared |
| \(\gamma^{\mathrm{sep}}\) | gamma_separate |
| \(q\) | q |
| \(\lambda\) | lambda_ |
| \(\sigma^2_{\mathrm{lin}}\) | sigma2_lin |
| \(\sigma^2_{\mathrm{quad}}\) | sigma2_quad |
| \(\sigma^2_{\mathrm{eps}}\) | sigma2_eps |
| \(o_c\) | offset |
| \(\varepsilon_{ci}\) | error_distr (marginal family) |
| \(R\) | error_corr (normalized to unit diagonal; error_chol is its Cholesky factor) |
| \(\operatorname{Var}[E[Z_{ci} \mid \theta]]\) | sigma2_mean |
| \(E[\operatorname{Var}[Z_{ci} \mid \theta]]\) | sigma2_pop |
| \(\operatorname{Var}[Z_{ci}]\) | sigma2_pri |
| \(\rho\) | het_strength |
| \(\operatorname{Var}[W_{ci}^2]\) | var_v |
| \(W_{ci}\) | |
| \(W\) scalar | het_shape='scalar' (DGP.error_scale of shape (n,)) |
| \(W\) vector | het_shape='vector', multivariate only (DGP.error_scale of shape (k, n)) |
| \(W_{ci} \equiv 1\) (\(\rho = 0\)) | het_shape=None (the \(W\)-related attributes are None) |
| \(Z_{ci}\) | |
| \(Y_{ci}\) | |
| \(c\) continuous / binary | outcome_type |
| univariate (\(k = 1\), \(\lambda = 1\)) | k=None in gen_data (partition, beta_separate, A_separate and lambda_ are None) |

- partition: Bool[Array, 'k p'] | None
Predictor-outcome assignment partition of shape (k, p), used only at lambda_ < 1. Row i is the binary mask of predictors assigned to component i; rows are disjoint and each has either p // k or p // k + 1 entries. None in univariate mode (k is None).
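The balanced-partition property described above can be illustrated with a minimal standard-library sketch. The helper `random_partition_mask` and its round-robin assignment are assumptions made for illustration; they are not part of bartz, and the library may build its partition differently.

```python
import random


def random_partition_mask(p, k, seed=0):
    """Hypothetical helper: k-by-p boolean mask splitting range(p) into balanced groups."""
    rng = random.Random(seed)
    order = list(range(p))
    rng.shuffle(order)  # random order of the predictors
    mask = [[False] * p for _ in range(k)]
    for position, j in enumerate(order):
        # Round-robin assignment keeps group sizes within 1 of each other.
        mask[position % k][j] = True
    return mask


mask = random_partition_mask(p=7, k=3)
sizes = [sum(row) for row in mask]  # each size is p // k or p // k + 1
```

Each column of `mask` contains exactly one `True`, so the rows are disjoint and cover every predictor.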
- beta_shared: Float[Array, 'p']
Shared linear coefficients of shape (p,), used at lambda_ > 0.
- beta_separate: Float[Array, 'k p'] | None
Separate linear coefficients of shape (k, p), used at lambda_ < 1. Row i is supported on partition[i]. None in univariate mode (k is None).
- A_shared: Float[Array, 'p p']
Shared quadratic coefficients of shape (p, p), used at lambda_ > 0. Nonzero on a symmetric band of q + 1 entries per row and per column.
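As a sketch of the band structure, the following standard-library code builds the circular band indicator \(P^{\mathrm{sh}}\) and the per-entry variance \(\sigma^2_A\) from the formulas in the class description. The function names `circular_band_mask` and `sigma2_A` are hypothetical and chosen only for this example.

```python
def circular_band_mask(p, q):
    """Indicator of min(|j - j'|, p - |j - j'|) <= q / 2, for even q < p."""
    if q % 2 != 0 or not 0 <= q < p:
        raise ValueError("q must be even and satisfy 0 <= q < p")
    half = q // 2
    return [
        [min(abs(j - m), p - abs(j - m)) <= half for m in range(p)]
        for j in range(p)
    ]


def sigma2_A(sigma2_quad, p, q, kurt_x, mu4):
    """Variance of one shared quadratic coefficient before the s_j s_j' scaling."""
    return sigma2_quad / (p * ((kurt_x - 1.0) * mu4 + q))


band = circular_band_mask(p=6, q=2)
entries_per_row = [sum(row) for row in band]  # q + 1 nonzero entries in each row
per_entry_variance = sigma2_A(sigma2_quad=1.0, p=10, q=2, kurt_x=3.0, mu4=1.0)
```

The band wraps around the ends of the index range, which is why every row, including the first and last, has the same number of nonzero entries.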
- A_separate: Float[Array, 'k p p'] | None
Separate quadratic coefficients of shape (k, p, p), used at lambda_ < 1. Slice i is supported on the outer product of partition[i] with itself. None in univariate mode (k is None).
- s: Float[Array, 'p']
Per-predictor importance scales of shape (p,), with E[s_j ** 2] = 1. Already folded into beta_*, A_* and gamma_*; equivalent to scaling predictor j by s_j. All ones when s_distr is Constant.
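The equivalence between folding `s` into the coefficients and scaling the predictors can be checked on the shared linear term. This is a minimal sketch with made-up numbers; `shared_linear_coefficients` is a hypothetical helper that applies \(\beta^{\mathrm{sh}}_j = s_j \sqrt{\sigma^2_{\mathrm{lin}} / p}\, b_j\), and the values of `s` are illustrative rather than drawn from any s_distr.

```python
import math


def shared_linear_coefficients(s, b, sigma2_lin):
    """beta_j = s_j * sqrt(sigma2_lin / p) * b_j, following the model definition."""
    p = len(s)
    scale = math.sqrt(sigma2_lin / p)
    return [s_j * scale * b_j for s_j, b_j in zip(s, b)]


s = [2.0, 0.5, 1.0]   # illustrative importance scales
b = [1.0, -1.0, 0.5]  # illustrative standardized draws
x = [0.3, -1.2, 2.0]  # one row of predictors
sigma2_lin = 3.0

# Scales folded into the coefficients.
beta = shared_linear_coefficients(s, b, sigma2_lin)
with_folded_scales = sum(beta_j * x_j for beta_j, x_j in zip(beta, x))

# Unit scales in the coefficients, predictors multiplied by s_j instead.
unit = shared_linear_coefficients([1.0] * len(s), b, sigma2_lin)
with_scaled_predictors = sum(u_j * s_j * x_j for u_j, s_j, x_j in zip(unit, s, x))
```

Both sums give the same linear term, which is the sense in which `s` acts as a per-predictor rescaling.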
- x_distr: Distr
Distribution family of the predictors. Families with kurtosis 1 (binary predictors) require q >= 2 because their squares are constant.
- error_distr: Distr
Marginal distribution family of the additive errors. The errors are sampled through a Gaussian copula (the error_corr dependence), so this sets each component’s marginal while leaving its variance at sigma2_eps. Normal (default) recovers jointly-Normal errors.
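The copula step \(\varepsilon = F^{-1}(\Phi(U))\) can be sketched for a single component with the standard library. The unit-variance Laplace marginal below is an assumption chosen because its inverse CDF has a closed form; whether bartz ships such a family is not stated here, and the function names are hypothetical.

```python
import math


def normal_cdf(u):
    """Standard Normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))


def laplace_unit_variance_ppf(prob):
    """Inverse CDF of a Laplace distribution with mean 0 and variance 1."""
    scale = 1.0 / math.sqrt(2.0)  # Laplace(0, b) has variance 2 * b**2
    if prob < 0.5:
        return scale * math.log(2.0 * prob)
    return -scale * math.log(2.0 * (1.0 - prob))


def copula_error(u):
    """Map a standard Normal draw u to a Laplace-distributed error.

    Only intended for moderate |u|: for extreme values the probability
    rounds to 0 or 1 and the logarithm is undefined.
    """
    return laplace_unit_variance_ppf(normal_cdf(u))


errors = [copula_error(u) for u in (-1.0, 0.0, 1.0)]
```

The map is monotone and odd, so the dependence structure of the Normal draws carries over while the marginal changes.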
- s_distr: ScaleDistr
Scale family of the importance scales s. More dispersed scales make the dependence on the predictors sparser; Constant is uniform importance (s_j = 1). ScaleDistr.from_peff parametrizes the dispersion by an effective number of active predictors.
- q: Integer[Array, '']
Number of quadratic interactions per predictor (even, < p // k).
- lambda_: Float[Array, ''] | None
Coupling parameter in [0, 1]. 0 is independent components, 1 is identical components. None iff univariate (partition is None), in which case only the shared path contributes to mu.
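The role of the coupling parameter can be seen in the linear term \(\mu^{\mathrm L}_{ci} = \sqrt\lambda \sum_j \beta^{\mathrm{sh}}_j X_{ij} + \sqrt{1 - \lambda} \sum_j \beta^{\mathrm{sep}}_{cj} X_{ij}\). The sketch below uses plain lists and a hypothetical function `linear_mean`; the numbers are illustrative only.

```python
import math


def linear_mean(X, beta_shared, beta_separate, lam):
    """Return mu_L as a list of k rows, each with one value per observation."""
    shared = [sum(b * x for b, x in zip(beta_shared, row)) for row in X]
    result = []
    for beta_c in beta_separate:
        separate = [sum(b * x for b, x in zip(beta_c, row)) for row in X]
        result.append([
            math.sqrt(lam) * sh + math.sqrt(1.0 - lam) * se
            for sh, se in zip(shared, separate)
        ])
    return result


X = [[1.0, 2.0], [3.0, 4.0]]             # n = 2 observations, p = 2 predictors
beta_shared = [1.0, 0.0]
beta_separate = [[0.0, 1.0], [-1.0, 0.0]]  # component 0 uses predictor 1, component 1 uses predictor 0

fully_shared = linear_mean(X, beta_shared, beta_separate, lam=1.0)
fully_separate = linear_mean(X, beta_shared, beta_separate, lam=0.0)
```

At `lam=1.0` both components receive the same linear term, and at `lam=0.0` each component depends only on its own block of predictors.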
- sigma2_lin: Float[Array, '']
Prior and expected population variance of the linear term of mu.
- sigma2_quad: Float[Array, '']
Expected population variance of the quadratic term of mu.
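A quick numerical check of the quadratic budget: for i.i.d. standardized predictors with fourth moment \(\kappa_X\), the conditional variance of \(x^\top A x\) has a closed form, and averaging it over draws of \(A\) should recover `sigma2_quad`. The sketch below assumes constant scales (\(s_j = 1\), so \(\mu_4 = 1\)), standard Normal coefficient draws and the shared band only; all function names are hypothetical.

```python
import random


def conditional_quadratic_variance(A, kurt_x):
    """Var[x' A x | A] for i.i.d. x with mean 0, variance 1 and E[x**4] = kurt_x."""
    p = len(A)
    diagonal = sum(A[j][j] ** 2 for j in range(p))
    off_diagonal = sum(
        A[j][m] ** 2 + A[j][m] * A[m][j]
        for j in range(p) for m in range(p) if j != m
    )
    return (kurt_x - 1.0) * diagonal + off_diagonal


def draw_shared_A(p, q, sigma2_quad, kurt_x, rng):
    """Banded coefficient matrix with s_j = 1 and standard Normal entries."""
    sigma_A = (sigma2_quad / (p * ((kurt_x - 1.0) + q))) ** 0.5
    half = q // 2
    return [
        [
            sigma_A * rng.gauss(0.0, 1.0)
            if min(abs(j - m), p - abs(j - m)) <= half else 0.0
            for m in range(p)
        ]
        for j in range(p)
    ]


rng = random.Random(0)
p, q, sigma2_quad, kurt_x = 8, 2, 1.0, 3.0
draws = [
    conditional_quadratic_variance(draw_shared_A(p, q, sigma2_quad, kurt_x, rng), kurt_x)
    for _ in range(4000)
]
average = sum(draws) / len(draws)  # Monte Carlo estimate of the expected population variance
```

The average lands close to `sigma2_quad`, which is what the normalization \(\sigma^2_A = \sigma^2_{\mathrm{quad}} / (p\,((\kappa_X - 1)\mu_4 + q))\) is designed to achieve.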
- sigma2_eps: Float[Array, '']
Variance of the additive error.
- offset: Float[Array, ''] | Float[Array, 'k']
Constant added to the latent mean mu, shifting E[z] away from 0. Either a scalar (the same shift for every component) or a length-k vector (a per-component shift, multivariate only). Applied after the linear and quadratic terms, so for binary components it shifts the threshold and hence the success probability; defaults to 0.
- sigma2_mean: Float[Array, '']
Variance of the expected mean function.
- sigma2_pop: Float[Array, '']
Expected population variance of the latent z.
- sigma2_pri: Float[Array, '']
Prior variance of the latent z.
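The three derived variances follow directly from the closed forms in the class description. A minimal sketch, with a hypothetical function name and illustrative inputs:

```python
def derived_variances(sigma2_lin, sigma2_quad, sigma2_eps, q, kurt_x, mu4):
    """Return (sigma2_mean, sigma2_pop, sigma2_pri) from the documented formulas."""
    # Expected population variance: E[Var[Z | theta]].
    sigma2_pop = sigma2_lin + sigma2_quad + sigma2_eps
    # Variance of the expected mean: Var[E[Z | theta]].
    sigma2_mean = sigma2_quad * mu4 / ((kurt_x - 1.0) * mu4 + q)
    # Prior variance: law of total variance.
    sigma2_pri = sigma2_pop + sigma2_mean
    return sigma2_mean, sigma2_pop, sigma2_pri


# Normal predictors (kurtosis 3), constant scales (mu4 = 1), q = 2.
sigma2_mean, sigma2_pop, sigma2_pri = derived_variances(
    sigma2_lin=1.0, sigma2_quad=1.0, sigma2_eps=1.0, q=2, kurt_x=3.0, mu4=1.0
)
```

Only the quadratic term contributes to `sigma2_mean`, because the linear term and the error have zero mean at every fixed set of coefficients.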
- gamma_shared: Float[Array, 'p'] | None
Shared coefficients of shape (p,) of the latent projection eta, drawn like beta_shared with a unit budget that cancels in the normalization. None when homoskedastic (het_shape is None).
- gamma_separate: Float[Array, 'k p'] | None
Separate projection coefficients of shape (k, p), used only for vector heteroskedasticity (het_shape == 'vector'). None otherwise.
- het_strength: Float[Array, ''] | None
Heteroskedasticity knob rho in [0, 1], the fraction of the (expected) noise variance carried by the heteroskedastic term. 0 is homoskedastic (error_scale == 1), 1 is maximally heterogeneous. None when homoskedastic.
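The noise multiplier follows \(W^2 = (1 - \rho) + \rho\, \eta^2\). A minimal sketch, assuming the projection values `eta` are already computed; the function name `error_scale_from_eta` is hypothetical.

```python
import math


def error_scale_from_eta(eta, rho):
    """Return W = sqrt((1 - rho) + rho * eta**2) for each projection value."""
    if not 0.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [0, 1]")
    return [math.sqrt((1.0 - rho) + rho * e * e) for e in eta]


eta = [-2.0, 0.0, 1.0]
homoskedastic = error_scale_from_eta(eta, rho=0.0)  # every multiplier equals 1
fully_het = error_scale_from_eta(eta, rho=1.0)      # multiplier equals |eta|
partial = error_scale_from_eta(eta, rho=0.5)
```

Because \(E[\eta^2] = 1\) under the model, the mixture keeps \(E[W^2] = 1\) for every `rho`, so `sigma2_eps` remains the expected noise variance.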
- var_v: Float[Array, ''] | None
Fully marginal variance Var[error_scale ** 2] of the noise multiplier (see Params for the closed form): a fixed scalar set by the hyperparameters, identical for every component. None when homoskedastic.
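The closed form \(\operatorname{Var}[W^2] = \rho^2 (2 + e)\) from the class description can be transcribed as follows. The function name and argument names are assumptions for this sketch; only the formula comes from the text above.

```python
def var_error_scale_sq(rho, p, kurt_gamma, kurt_x, mu4,
                       het_shape="scalar", lam=None, k=None):
    """Var[W**2] = rho**2 * (2 + e), with e depending on the shape of W."""
    excess = kurt_gamma * kurt_x * mu4 - 3.0
    if het_shape == "scalar":
        e = excess / p
    elif het_shape == "vector":
        r = p % k  # number of components that receive one extra predictor
        e = (
            ((lam ** 2 + (1.0 - lam) ** 2 * k) * excess
             + 6.0 * lam * (1.0 - lam) * (kurt_x * mu4 - 1.0)) / p
            + 3.0 * (1.0 - lam) ** 2 * r * (k - r) / p ** 2
        )
    else:
        raise ValueError("het_shape must be 'scalar' or 'vector'")
    return rho ** 2 * (2.0 + e)


# Normal predictors and projection draws (kurtosis 3), constant scales (mu4 = 1).
scalar_case = var_error_scale_sq(0.5, 10, 3.0, 3.0, 1.0)
vector_at_full_coupling = var_error_scale_sq(
    0.5, 10, 3.0, 3.0, 1.0, het_shape="vector", lam=1.0, k=3
)
```

At `lam=1.0` the vector expression reduces to the scalar one, since only the shared projection remains.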
- error_chol: Float[Array, 'k k'] | None
Lower-triangular Cholesky factor L of the across-component error correlation matrix R = L @ L.T (the error_corr argument normalized to unit diagonal). None when the errors are independent (error_corr was None), including every univariate outcome.
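The relation between error_corr, \(R\) and \(L\) can be sketched without any array library. The helper `corr_cholesky` below is hypothetical: it normalizes a positive-definite matrix to unit diagonal and then applies a textbook Cholesky factorization, which may differ from how bartz computes the factor.

```python
import math


def corr_cholesky(error_corr):
    """Return (R, L): the unit-diagonal correlation matrix and its lower Cholesky factor."""
    k = len(error_corr)
    d = [math.sqrt(error_corr[i][i]) for i in range(k)]
    R = [[error_corr[i][j] / (d[i] * d[j]) for j in range(k)] for i in range(k)]
    L = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1):
            acc = R[i][j] - sum(L[i][m] * L[j][m] for m in range(j))
            L[i][j] = math.sqrt(acc) if i == j else acc / L[j][j]
    return R, L


R, L = corr_cholesky([[4.0, 1.0], [1.0, 1.0]])
# Rebuild R from its factor to confirm R = L @ L.T.
rebuilt = [
    [sum(L[i][m] * L[j][m] for m in range(2)) for j in range(2)]
    for i in range(2)
]
```

Multiplying independent standard Normal draws by `L` gives a vector with correlation `R`, which is the \(U_{\cdot i}\) entering the copula.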
- outcome_type: OutcomeType | tuple[OutcomeType, ...]
Per-component outcome type, either a single OutcomeType applied to every row, or a tuple of length k for mixed outcomes. For binary components the continuous latent mu + eps * sqrt(sigma2_eps) * error_scale is thresholded at 0, yielding 0.0/1.0 floats. Unlike the standard probit convention used by bartz.mcmcstep.init (which fixes the latent noise variance to 1), here the binary latents share the same sigma2_eps as the continuous ones, so for the default Normal error_distr the success probability is Phi(mu / (sqrt(sigma2_eps) * error_scale)) (with error_scale 1 when homoskedastic).
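A small sketch of the binary outcome rule and its success probability under the default Normal errors. The function names are hypothetical; only the thresholding at 0 and the formula `Phi(mu / (sqrt(sigma2_eps) * error_scale))` come from the description above.

```python
import math


def normal_cdf(x):
    """Standard Normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def success_probability(mu, sigma2_eps, error_scale=1.0):
    """P(z > 0) for z = mu + sqrt(sigma2_eps) * error_scale * eps, eps standard Normal."""
    return normal_cdf(mu / (math.sqrt(sigma2_eps) * error_scale))


def binary_outcome(z):
    """Threshold the latent at 0, returning a 0.0/1.0 float."""
    return 1.0 if z > 0 else 0.0


at_zero_mean = success_probability(mu=0.0, sigma2_eps=4.0)
one_sd_above = success_probability(mu=2.0, sigma2_eps=4.0)  # mu is one noise sd above 0
outcomes = [binary_outcome(z) for z in (-0.3, 0.0, 1.7)]
```

With `sigma2_eps=4.0` a latent mean of 2 corresponds to one noise standard deviation, whereas under a unit-variance probit convention the same mean would correspond to two.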