bartz.mcmcstep.AutoOneHotReduction

class bartz.mcmcstep.AutoOneHotReduction(min_matmul_bins=8)

OneHotReduction that picks method and n_inner automatically.

Resolves both knobs per site and platform, using information available at trace time, then delegates to a plain OneHotReduction. It uses matmul only for wide-bin multivariate reductions and multiply otherwise. Datapoints are laid out on the outer axis, except on the two small-bin sites where the opposite layout wins: precision on cpu and count on cuda. Those two sites support only cpu and cuda, and raise an error at lowering on any other platform.
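To make the distinction between the two methods concrete, here is a minimal JAX sketch of a per-bin sum computed both ways. It is not the bartz implementation. The function names `reduce_multiply` and `reduce_matmul` are hypothetical, and the sketch assumes that "multiply" means an elementwise product with a one-hot mask followed by a sum, while "matmul" means a contraction with a one-hot matrix. It does not model n_inner or the axis layout.

```python
import jax
import jax.numpy as jnp


def reduce_multiply(values, bin_index, num_bins):
    """Per-bin sum via elementwise multiply and sum (hypothetical helper)."""
    # mask[i, b] is 1 where datapoint i falls in bin b, else 0
    mask = (bin_index[:, None] == jnp.arange(num_bins)[None, :]).astype(values.dtype)
    # Broadcast values over bins, zero out non-matching bins, sum over datapoints
    return jnp.sum(mask * values[:, None], axis=0)


def reduce_matmul(values, bin_index, num_bins):
    """Per-bin sum via a matrix product with a one-hot matrix (hypothetical helper)."""
    one_hot = jax.nn.one_hot(bin_index, num_bins, dtype=values.dtype)
    # (n,) @ (n, num_bins) -> (num_bins,)
    return values @ one_hot


values = jnp.array([1.0, 2.0, 3.0, 4.0])
bin_index = jnp.array([0, 2, 0, 1])
by_multiply = reduce_multiply(values, bin_index, 3)
by_matmul = reduce_matmul(values, bin_index, 3)
```

Both calls produce the same per-bin sums. Which one runs faster depends on the number of bins, the shape of the reduced values, and the platform, which is the choice this class automates.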

The site is recovered from the value: a scalar is the count, a wide output is the residual, and a narrow non-scalar output is the precision.

Known limitation: for the wide-bin univariate residual on cpu past ~10^6 datapoints, the faster layout is the one this class picks against, so the automatic choice can be up to ~2x slower.

min_matmul_bins: int = 8

Minimum number of output bins for matmul to be used; below this, multiply is always used.
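The method rule described on this page can be sketched in plain Python as follows. The helper `choose_method` and its arguments are hypothetical and are not part of bartz. The sketch assumes that "wide-bin" means having at least min_matmul_bins output bins, and it ignores the platform and the n_inner choice.

```python
def choose_method(num_bins: int, multivariate: bool, min_matmul_bins: int = 8) -> str:
    """Return "matmul" or "multiply" (hypothetical sketch of the documented rule)."""
    # Assumption: matmul requires a multivariate reduction with enough output bins
    if multivariate and num_bins >= min_matmul_bins:
        return "matmul"
    # Everything else, including any reduction below the threshold, uses multiply
    return "multiply"


wide_multivariate = choose_method(num_bins=8, multivariate=True)
narrow_multivariate = choose_method(num_bins=7, multivariate=True)
wide_univariate = choose_method(num_bins=100, multivariate=False)
```

Under these assumptions, raising min_matmul_bins shrinks the set of reductions that use matmul, and a value larger than any bin count in use forces multiply everywhere.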