---
title: Glossary
review-state: drafting
last-human-review: "2026-06-11"
depends-on:
  - src/pytyche/contracts.py
  - src/pytyche/compare/variants.py
  - src/pytyche/experiment
  - src/pytyche/bcf
  - src/pytyche/calibrate
owner: tradcliffe
quadrant: concept
---

# Glossary

Definitions for the load-bearing pytyche concepts. Several terms in this space collide easily (treatment / arm; cell / segment / cluster; CATE / HTE). Each entry below gives a short definition, the closest neighbors a reader might confuse it with, and a pointer to the code where it lives.

## Sequential experiment

:::{glossary}
sequential experiment
  The full adaptive experiment: one campaign across N rounds sharing treatments, schedule, and cumulative posterior. Constructed via `pt.sequential_experiment(...)` and iterates round by round.

  ```python
  exp = pt.sequential_experiment(
      generator=my_dgp,
      schedule=pt.GeometricSchedule(initial=10_000, growth=2.0, n_rounds=3),
      treatments=['control', 'low_promo', 'free_ship'],
      calibration=pt.Calibration.from_sweep('clustered_realistic_v1'),
  )
  for r in exp:
      inspect(r)
  ```

  Each round of a sequential experiment is an {term}`experiment`. The temporal slot housing it is a {term}`round`. The same sequential machinery powers both sim mode (generator-driven) and real-data mode (operator-driven), distinguished only by what the `generator` callable returns.

  Defined as `pytyche.experiment.SequentialExperiment`.

experiment
  A single discrete experiment: observed data, analysis, cells shipped, and the recommendation for the next experiment. The shape of a traditional single-shot A/B/N test, composed of an `ObservedExperimentData` plus an `AnalysisResult`.

  In a {term}`sequential experiment`, each round produces one experiment. In single-shot use, `pt.analyze(observed)` returns an `AnalysisResult` directly with no sequential machinery.

  Defined as `pytyche.experiment.Experiment`.

round
  One iteration of a sequential experiment: the temporal slot housing one {term}`experiment`.
  "Round 1, 2, 3" are positional indices into `SequentialExperiment.history`.

  Defined as `pytyche.experiment.Round`.

schedule
  The protocol that decides each round's visitor count. Three shipped implementations:

  * `GeometricSchedule(initial, growth, n_rounds=None)`: geometrically growing batches, doubling at `growth=2.0` (matches Perchet 2016, Esfandiari 2021, Che & Namkoong 2023)
  * `FixedSchedule(per_round, n_rounds)`: flat batches
  * `ExplicitSchedule([n_round_1, n_round_2, ...])`: user-supplied per-round visitor counts

  A schedule's `n_rounds` is optional. When `None`, the schedule is open-ended and the operator decides when to stop.

  Defined as `pytyche.experiment.Schedule`.

generator
  The callable supplied at construction that provides observed data when the sequential experiment advances. One entry point through which both sim mode and real-data mode deliver data.

  ```python
  Generator = Callable[
      [int, NextRoundPlan],
      tuple[ObservedExperimentData, CalibrationTruth | None],
  ]
  ```

  Sim mode supplies a DGP that returns synthesized observations alongside truth. Real-data mode supplies a callable that fetches the round's data from the operator's platform, database, or other source. The library treats both identically.

  Defined as the `pytyche.experiment.Generator` type alias.
:::

## Treatments, arms, policies, cells

Four terms with sharp distinctions.

:::{glossary}
treatment
  A named candidate intervention delivered to a single visitor (for example `'free_ship'`, `'low_promo'`, `'control'`). The thing the BCF model estimates causal effects for.

  Declared via the `treatments` parameter on `pt.sequential_experiment()`.

  Related terms:

  * {term}`arm`: internal-math term for the same concept. Pytyche's public API uses "treatment" canonically; "arm" still appears in BCF kernel code and per-arm `(K, S)` array dimensions.
  * {term}`policy`: the rule that picks which treatment to deliver per visitor. A treatment is the delivered intervention itself.
arm
  The internal-math term for a {term}`treatment`: the integer index that encodes it within BCF kernels. The joint hurdle BCF estimator carries a per-arm axis on its sample arrays (`p_samples`/`sev_samples` of shape `(n, S_total, K)`; the `(K − 1)` contrast vector on `rpv_cate_samples`).

  ```python
  Z = np.array([0, 1, 2, 0])   # control, arm 1, arm 2, control
  basis = _compute_basis(Z)    # bcf/preprocess.py: (n, K-1) contrast coding
  ```

  Pytyche's public API always uses "treatment." "Arm" appears in BCF kernel code and per-arm array dimensions.

  Defined as integer indices in `Z` arrays. See `src/pytyche/bcf/preprocess.py` (`_compute_basis`) for the per-arm contrast coding.

policy
  The routing rule a cell uses to decide which treatment to deliver per visitor. Three shipped variants, plus operator-defined subclasses:

  * `BaselinePolicy()`: always delivers the control treatment
  * `UniformPolicy(over=[...])`: uniform-random over a subset of treatments. The default Explore-cell policy.
  * `TreePolicy(tree, allocation_map)`: sklearn `DecisionTreeClassifier` plus per-leaf {term}`Thompson allocation` over treatments
  * Operator-defined `Policy` subclasses for hypothesis injection

  A {term}`cell` houses one policy as its routing rule. A `TreePolicy` wraps a decision tree and the per-leaf Thompson allocation. The tree alone does not make a policy.

  Defined as the `pytyche.experiment.Policy` protocol.

cell
  An assignment cohort within a single round of a sequential experiment. Cells span the visitor population by weight. Each cell ships a {term}`policy` that decides what treatment to deliver per visitor within that cell.

  ```python
  cells = [
      Cell('control', BaselinePolicy(), weight=0.3),
      Cell('explore', UniformPolicy(over=treatments), weight=0.4),
      Cell('optimized_v1', TreePolicy(tree, allocation_map), weight=0.3),
  ]
  ```

  The default round-1 structure has a Control cell and an Explore cell at 50/50. The recommendation engine may add Optimized cells in subsequent rounds.
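As a rough illustration of "cells span the visitor population by weight", the sketch below draws one cell per visitor with probability equal to the cell's weight. It is not the library's router: `assign_cells` is a hypothetical helper, cells are reduced to a name-to-weight mapping, and the requirement that weights sum to 1 is an assumption drawn from the examples in this entry.

```python
import random


def assign_cells(n_visitors, weights, seed=0):
    """Draw one cell name per visitor, with probability equal to the cell weight."""
    # Assumption: a round's cell weights partition the population (sum to 1).
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("cell weights must sum to 1")
    rng = random.Random(seed)  # seeded so the assignment is reproducible
    names = list(weights)
    return rng.choices(names, weights=[weights[n] for n in names], k=n_visitors)


# Default round-1 structure described above: Control and Explore at 50/50.
round_1 = {"control": 0.5, "explore": 0.5}
assignments = assign_cells(1_000, round_1)
```

Each entry of `assignments` names the cell whose policy would then pick the visitor's treatment.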
  Shipping multiple Optimized cells in one round is a first-class capability for organizations running head-to-head policy variants. Operational reasons to do this include different stakeholder ownership, vendor relationships, and channel-specific creative.

  A cell is an assignment-time cohort spanning the population. A {term}`segment` is a region of feature space that a policy tree within an Optimized cell partitions. A {term}`cluster` is a DGP-generated mixture component (truth-side, sim-only) and is neither.

  Defined as `pytyche.experiment.Cell`.

Thompson allocation
  A Bayesian allocation rule where each treatment receives a share of traffic proportional to its posterior probability of being best. Per segment, per arm: `allocation[arm] = P(arm best | posterior)`. Each segment's allocation sums to 1.

  In pytyche, applied per segment with an {term}`ε-clip` floor so that no active treatment can be allocated below a minimum share. Magnitude-aware: a segment with `P(best) = 0.91` gets a markedly different allocation than `P(best) = 0.55`, without discrete regime thresholds.

  Defined as `pytyche.experiment.ThompsonPolicy` (the default allocation policy behind `TreePolicy` and `UniformPolicy`).

ε-clip
  An internal safety net inside {term}`Thompson allocation`. Within each segment (leaf of an Optimized {term}`cell`'s policy tree), every active treatment receives at least `ε/K` of that segment's Optimized-cell share, where K is the active treatment count. Prevents Thompson allocation from collapsing to a single treatment per segment when one treatment dominates the posterior.

  The operator-facing controls-retention story is the cell-level Control and Explore weights (see {term}`min_control_weight` and {term}`min_explore_weight`), not ε. The ε-clip becomes mostly redundant when both Control and Explore cells have non-zero weight, since Explore already samples every treatment uniformly across all segments at the cell level.

  Not exposed at the L1 API.
  Lives as a Thompson-allocation implementation detail with a hard-coded default (ε = 0.02).

min_control_weight
  The guaranteed minimum share of traffic the recommendation engine will allocate to the Control {term}`cell` when proposing the next round's structure. With `min_control_weight=0.05`, the Control cell never falls below 5% of the round's traffic regardless of how confident the model becomes.

  The baseline-measurement controls-retention floor. Even as the experiment matures and Optimized cells absorb more traffic, the Control cell continues to receive a guaranteed share so drift in the baseline outcome surface remains detectable.

  Set via the `min_control_weight` parameter on `pt.sequential_experiment()`. The operator may override the recommended weight in their own next-round plan; the floor applies only to engine-proposed allocations.

min_explore_weight
  The guaranteed minimum share of traffic the recommendation engine will allocate to the Explore {term}`cell` when proposing the next round's structure. With `min_explore_weight=0.05`, the Explore cell never falls below 5% of the round's traffic regardless of segment confidence.

  The every-treatment-observed controls-retention floor. The Explore cell samples uniformly across all active treatments, so a non-zero floor guarantees every treatment receives some traffic in every segment regardless of what the Optimized cells are doing.

  Set via the `min_explore_weight` parameter on `pt.sequential_experiment()`. The operator may override in their own next-round plan; the floor applies only to engine-proposed allocations.
:::

## Segments, clusters, HTE

:::{glossary}
segment
  A region of feature space, typically a leaf of a policy tree. "Mobile-returning visitors" or "desktop-new visitors" are segments discovered by the segmentation pipeline (or declared by the caller via the rule algebra).

  The unit of {term}`Thompson allocation`: each segment receives a per-treatment allocation derived from the {term}`joint posterior`.
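A minimal sketch of how one segment's allocation could be derived from posterior draws, offered as an illustration rather than the library's implementation. It assumes the posterior is available as a list of per-draw value tuples (one value per treatment), estimates `P(arm best)` by counting which arm wins each draw, and applies the ε floor by mixing with a uniform distribution. The helper names and the mixing form of the floor are assumptions; only the `ε/K` minimum and the default ε = 0.02 come from this glossary.

```python
from collections import Counter


def p_best(draws):
    """Estimate P(arm is best) as the share of posterior draws each arm wins."""
    k = len(draws[0])
    wins = Counter(max(range(k), key=lambda a: d[a]) for d in draws)
    return [wins.get(a, 0) / len(draws) for a in range(k)]


def eps_clip(alloc, eps=0.02):
    """Mix with uniform so every arm gets at least eps / K; result sums to 1."""
    # Assumption: the floor is applied as a uniform mixture, not clip-and-renormalize.
    k = len(alloc)
    return [(1.0 - eps) * a + eps / k for a in alloc]


# Hypothetical posterior draws for one segment: (control, low_promo, free_ship)
draws = [
    (1.0, 1.2, 1.1),
    (1.0, 1.3, 1.4),
    (1.1, 1.2, 1.0),
    (0.9, 1.5, 1.2),
]
raw = p_best(draws)            # low_promo wins 3 of 4 draws, free_ship wins 1
allocation = eps_clip(raw)     # control still receives the eps / K floor
```

Without the floor, `control` would receive no traffic in this segment; with it, every active treatment keeps a small guaranteed share.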
  A {term}`cell` is the routing cohort. A segment is a feature-space region, typically inside an Optimized cell's tree. The same segment may appear across multiple Optimized cells' trees with different policies attached.

  The collection of all discovered segments for one posterior is returned as part of a {term}`PolicyTreeResult` (one `DiscoveredSegment` per leaf, ordered by leaf id).

  Defined as `contracts.DiscoveredSegment` for the discovered surface and `contracts.SegmentRule` (a discriminated union of `EqRule`, `InRule`, `ComparisonRule`, `BetweenRule`) for the predicate that defines one.

cluster
  A DGP-generated mixture component. Latent and truth-side. Clusters exist only in sim mode, populated by the generator. The `clustered_realistic` template has 4 clusters representing customer archetypes.

  Used for evaluation ("did our discovered segments correlate with the DGP's clusters?"), not for assignment. A {term}`segment` is an observed-side feature-space region. A cluster is truth-side mixture-component identity. They may correlate. They are not the same.

  Available in sim-mode `RoundData` as `cluster_ids: np.ndarray`.

stability score
  Bootstrap-replicability score for a discovered segment: the fraction of bootstrap policy trees (B = 50 by default) in which some leaf has Jaccard overlap ≥ 0.5 with the original segment's member set. The bootstrap resamples per-visitor CATEs (not the BCF posterior itself), refits the same-depth tree on each resample, and reports the overlap fraction. Range `[0, 1]`; segments with `stability_score >= 0.80` are considered credible enough to act on.

  Answers the boundary-replicability question: "would this tree boundary have appeared on a slightly different sample?" Credible interval width does NOT answer this question: a tight CI says the *effect estimate* is stable given the tree, not that the *tree boundaries* themselves would survive resampling.

  Controlled via `posterior.fit_policy_tree(n_bootstrap=50, bootstrap_seed=...)`.
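The overlap-fraction computation can be sketched in a few lines. This is an illustration under stated assumptions, not the library's code: segments are represented as sets of visitor ids, each bootstrap tree as a list of leaf member sets already mapped back to the original visitor ids, and `jaccard` and `stability_score` are hypothetical helper names.

```python
def jaccard(a, b):
    """Jaccard overlap |a & b| / |a | b| of two member sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def stability_score(segment, bootstrap_trees, min_overlap=0.5):
    """Fraction of bootstrap trees in which some leaf overlaps the segment."""
    if not bootstrap_trees:
        return float("nan")  # no bootstrap trees were fit
    hits = sum(
        any(jaccard(segment, leaf) >= min_overlap for leaf in leaves)
        for leaves in bootstrap_trees
    )
    return hits / len(bootstrap_trees)


segment = {1, 2, 3, 4}
bootstrap_trees = [
    [{1, 2, 3}, {4, 5, 6}],        # leaf {1, 2, 3}: overlap 3/4, a hit
    [{1, 2, 3, 4, 5}, {6}],        # overlap 4/5, a hit
    [{1, 5, 6}, {2, 7}, {3, 8}],   # best overlap 1/5, a miss
    [{3, 4}, {1, 2}],              # overlap 2/4 = 0.5, a hit (meets the threshold)
]
score = stability_score(segment, bootstrap_trees)
```

Three of the four toy trees reproduce the segment, so this segment would fall just below the 0.80 credibility cutoff.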
  Passing `n_bootstrap=0` suppresses computation and sets scores to `float("nan")`.

  Carried on both `DiscoveredSegment.stability_score` and {term}`PolicyTreeResult`'s `stability_scores` dict keyed by leaf id. Threshold-checked by {term}`capability methods` (`has_credible_segments(threshold=0.80)`).

HTE
  Heterogeneous Treatment Effect: the phenomenon that the causal effect of a treatment varies across customer segments. The joint multivariate hurdle BCF estimates a per-visitor {term}`CATE` surface (via the {term}`BART forest`); the {term}`policy tree` partitions that surface into segments where CATE is approximately constant.

  {term}`CATE` is the technical statistical term for the per-visitor effect. HTE names the broader phenomenon. {term}`segment`s are how the discovery is partitioned for interpretability.

  Per-visitor CATEs live on `AnalysisResult.cate_per_visitor`. Segment-level summaries live on `AnalysisResult.segments`.

CATE
  Conditional Average Treatment Effect: the expected difference in outcome between treatment and control for a specific covariate combination. Formally `E[Y(1) - Y(0) | X = x]`. Read per-visitor: how much would this specific visitor's outcome change under treatment vs control, given their features.

  The quantity {term}`BCF` estimates. The collection of CATEs across feature space is the {term}`HTE` surface.

  Pytyche exposes per-visitor CATEs on `AnalysisResult.cate_per_visitor` and segment-level summary CATEs on `AnalysisResult.segments`.
:::

## Models and inference

:::{glossary}
BART forest
  The tree ensemble inside the BCF: a sum of many weak trees that together approximate the underlying function. "BART" = Bayesian Additive Regression Trees. The BCF carries two such forests: a prognostic μ-forest for the baseline outcome surface and a treatment τ-forest for the conditional treatment effect.
  The MCMC samples a *posterior over forests* (each posterior sample is a different forest configuration), and what pytyche surfaces is the per-visitor posterior on `τ(x)` marginalized over that posterior, never the individual trees.

  Forest sizes are set by `GPUBCFConfig.num_trees_mu` and `GPUBCFConfig.num_trees_tau`; see `src/pytyche/bcf/config.py` for current defaults. `compute_num_trees_tau` is the formula-driven helper for picking the τ-forest size at a target CI coverage.

  This is NOT the {term}`policy tree`. They share the word "tree" but do different jobs: the BART forest *estimates* the CATE surface inside the BCF MCMC; the policy tree *segments* that CATE surface for downstream allocation and operator interpretability. Users don't inspect BART trees directly; they inspect the policy tree.

  Lives inside `pytyche.bcf.hurdle.*`. Implemented on top of `bartz` (the GPU BART primitive library).

policy tree
  The single `sklearn.tree.DecisionTreeClassifier` fit on the per-visitor CATEs from the BCF posterior, used to discover {term}`segment`s of feature space where the treatment effect is approximately constant. One tree (not an ensemble), deterministic given the posterior + hyperparameters, user-inspectable as `PolicyTreeResult.tree`. Its leaves are the segments downstream allocation, recommendation, and graduation decisions operate on.

  This is NOT the {term}`BART forest`. The BART forest produces the CATEs; the policy tree partitions them. The policy tree is what shows up in cell recommendations ({term}`Thompson allocation`'s `allocation_map[leaf_id]`) and what operators see as the segmentation of "where the lift comes from."

  Depth controlled by `max_segment_depth` at the L1 surface (`pt.sequential_experiment(max_segment_depth=3)`) or `max_depth` at the L2 method (`posterior.fit_policy_tree(max_depth=3)`). Minimum segment size controlled by `min_segment_share` (default 0.10).
  The result is a {term}`PolicyTreeResult`: a frozen dataclass carrying the tree, segments, allocation map, and bootstrap stability scores.

BCF
  Bayesian Causal Forests. The class of model pytyche builds on for HTE estimation, introduced by Hahn, Murray, and Carvalho (2020). Combines a "prognostic" forest (estimating the baseline outcome surface) with a "treatment" forest (estimating the conditional treatment effect) to give per-visitor CATE estimates that do not confound effect modification with prognostic signal. Both forests are {term}`BART forest`s.

  For zero-inflated outcomes like e-commerce revenue per visitor, pytyche uses a {term}`joint hurdle BCF` that shares tree topology between the conversion (probit) and severity (log-normal) channels.

  Defined as `pytyche.bcf.fit_continuous_bcf`, `pytyche.bcf.fit_binary_bcf`, and friends. The high-level sequential surface (`pt.sequential_experiment`) calls these under the covers.

joint hurdle BCF
  The model pytyche actually fits for e-commerce revenue and similar zero-inflated outcomes. Two channels, a conversion probit channel and a severity log-normal channel, share tree topology, so the per-visitor CATE on revenue decomposes cleanly into "did the treatment change conversion" and "did it change basket size given conversion."

  For multi-arm experiments, the joint hurdle BCF estimates per-treatment effects jointly via shared prognostic structure rather than fitting K-1 independent contrasts (which leaks power on max-of-K selection).

  Defined as `pytyche.bcf.fit_hurdle_bcf` (called by `pt.sequential_experiment` and `pt.analyze`). The {term}`pooling` kwarg selects between the canonical shared-tree fit and the independent two-stage literature baseline.

hurdle outcomes
  Outcomes with two distinct components: a binary "did anything happen" gate and a positive-valued severity conditional on the gate firing.
  E-commerce revenue per visitor is the canonical example: most visitors do not convert and contribute $0, while the converting tail has continuously distributed positive revenue.

  Standard regression on a hurdle outcome confounds "treatment changes conversion probability" with "treatment changes basket size." Hurdle BCF models the two channels separately, then combines them for the per-visitor revenue effect.

pooling
  The mode in which `fit_hurdle_bcf` (and {term}`pt.fit` when it dispatches to the hurdle path) couples the two channels of the {term}`joint hurdle BCF`. Two values:

  * `"joint"` (default): canonical shared-tree fit. Conversion (probit) and severity (log-normal) share tree topology, so the per-visitor CATE decomposes cleanly and the model borrows strength across the two channels. This is the v0.2+ recommended path for typical e-commerce revenue data.
  * `"independent"`: independent two-stage fit. Runs `fit_binary_bcf` for conversion, then `fit_continuous_bcf` for log-severity on converters, and composes the posteriors. Opt in when the two channels are driven by different feature subsets, when one channel has dominant HTE and shared topology distorts the other, or when a researcher wants per-channel HTE structure without the regularization-induced coupling.

  Exposed as `fit_hurdle_bcf(..., pooling="joint")` at the fit boundary and carried on `HurdleBCFResult.pooling`. Passed through verbatim when calling `pt.fit(observed, pooling="independent")`. Stored on `Calibration` regime metadata: a calibration artifact fitted on joint-pooling data is not applied to an independent-pooling posterior.

  Defined as a `Literal["joint", "independent"]` kwarg on `pytyche.bcf.fit_hurdle_bcf`. Private dispatch helpers `_fit_joint_hurdle_bcf` (in `pytyche.bcf.hurdle.model`) and `_fit_independent_hurdle_bcf` (in `pytyche.bcf.hurdle.compose`) implement the two paths.

joint posterior
  The Bayesian posterior distribution over all model parameters considered jointly.
  In pytyche's joint hurdle BCF, the joint posterior covers per-treatment conversion probabilities, per-treatment severity means, and the per-visitor treatment effects, all conditioned on the observed data.

  "Joint" because the parameters are estimated together with their full correlation structure preserved, rather than fitted marginally and assumed independent. This is what lets Thompson allocation respect cross-treatment dependence.

  Direct access on `AnalysisResult.posterior` for follow-up analysis (custom decompositions, alternative ship rules, sensitivity checks).
:::

## Results, recommendations, graduation

:::{glossary}
PolicyTreeResult
  The frozen dataclass returned by `posterior.fit_policy_tree(...)`. Bundles the policy tree and all downstream-usable derived data:

  * `tree`: the fitted `sklearn.tree.DecisionTreeClassifier` partitioning feature space into {term}`segment`s
  * `segments`: one `DiscoveredSegment` per leaf, ordered by leaf id; carries `rule`, `gate_estimate`, `gate_ci`, `stability_score`, `population_share`, `id`, and `arm_best_probabilities`
  * `allocation_map`: `dict[leaf_id, dict[treatment_name, weight]]`; each leaf's weight dict sums to 1.0; produced by {term}`Thompson allocation` under the shared best-arm rule
  * `stability_scores`: `dict[leaf_id, float]` in `[0, 1]`; bootstrap-replicability scores computed by resampling visitor CATEs and Jaccard-overlap matching (see {term}`stability score`)
  * `observed`: reference to the `ObservedExperimentData` the underlying posterior was fit on; shared by identity from the posterior (no re-clone; see {term}`observed data stashing`)

  The dataclass is frozen; assignment to any field raises `dataclasses.FrozenInstanceError`. `tree` is a `sklearn.tree.DecisionTreeClassifier` for v0.2; a future change may introduce a pytyche wrapper with the same predict / decision path methods.

  `PolicyTreeResult` is NOT the {term}`BART forest`.
  The BART forest estimates the CATE surface inside the MCMC; the policy tree in `PolicyTreeResult` partitions that surface for downstream allocation and operator interpretability.

  Defined as `pytyche.contracts.PolicyTreeResult`.

decision
  The recommended ship-or-continue-or-stop call for a treatment versus baseline. A 3-value enum: `SHIP`, `CONTINUE`, `STOP`.

  Defined as `contracts.Decision`.

recommendation summary
  A structured decision (`SHIP`, `CONTINUE`, or `STOP`) with supporting evidence: expected losses, probability of positive lift, probability of meaningful improvement, probability of harm. The decision applies five thresholds across three gates, with `CONTINUE` as the fallback:

  * **SHIP gate**: `expected_loss < tolerance` AND `p_positive > 0.95` AND `p_better > 0.80`
  * **STOP (harm)**: `p_harmful > 0.90`
  * **STOP (futility)**: `p_better < 0.05`
  * **CONTINUE**: default when no gate fires

  Produced by `recommendation_summary()` in `compare.variants`. A {term}`graduation candidate` is a (treatment, segment) pair whose recommendation summary has fired SHIP for N consecutive rounds.

  Defined as `contracts.RecommendationSummary`.

graduation candidate
  A (treatment, segment) pair where the recommendation has fired SHIP for ≥ `sustained_rounds` consecutive rounds. The default rule fires when `expected_loss < tolerance` AND `p_positive > 0.95` AND `p_better > 0.80`, sustained over at least 2 rounds.

  Segments with `stability_score < 0.80` are excluded from graduation-candidate consideration by default (see {term}`stability score`). The {term}`capability methods` getter `has_credible_segments(threshold=0.80)` provides a quick check before running the full analysis.

  Pytyche surfaces graduation candidates as structured data. The operator (or an agentic caller) decides whether to promote one to broader rollout. The library does not auto-graduate.

  Defined as `pytyche.experiment.GraduationCandidate`.
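The gate logic and the sustained-rounds rule can be sketched as plain functions. This is an illustration, not the code in `compare.variants`: the function names, the `tolerance` default of 0.01, the order in which gates are checked (harm, then futility, then SHIP), and the reading of "consecutive" as "the most recent rounds" are all assumptions. The numeric thresholds are the ones listed in this glossary.

```python
def decide(expected_loss, p_positive, p_better, p_harmful, tolerance=0.01):
    """Return 'SHIP', 'STOP', or 'CONTINUE' from the five thresholds."""
    if p_harmful > 0.90:       # STOP (harm)
        return "STOP"
    if p_better < 0.05:        # STOP (futility)
        return "STOP"
    if expected_loss < tolerance and p_positive > 0.95 and p_better > 0.80:
        return "SHIP"          # SHIP gate: all three conditions must hold
    return "CONTINUE"          # default when no gate fires


def is_graduation_candidate(decisions, sustained_rounds=2):
    """True if the most recent `sustained_rounds` decisions are all SHIP."""
    if len(decisions) < sustained_rounds:
        return False
    return all(d == "SHIP" for d in decisions[-sustained_rounds:])


# Hypothetical per-round evidence for one (treatment, segment) pair.
history = [
    decide(0.050, 0.80, 0.60, 0.02),   # evidence too weak: CONTINUE
    decide(0.004, 0.97, 0.85, 0.01),   # all SHIP conditions hold
    decide(0.002, 0.99, 0.92, 0.00),   # SHIP again
]
candidate = is_graduation_candidate(history)
```

In this toy history the pair becomes a graduation candidate only after the third round, once SHIP has held for two rounds in a row.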
next round plan
  The recommended cell structure, treatments list, and prose summary for the next round of a sequential experiment. The handoff between the recommendation engine and the operator's next-round decision.

  Carries:

  * Recommended {term}`cell`s (typically Control + Explore + an Optimized cell with the recommended tree)
  * Active treatments
  * Dropped treatments, if any
  * {term}`graduation candidate`s
  * Prose rationale

  The operator may accept, partially override (for example, add a hypothesis cell), or fully replace before shipping.

  Defined as `pytyche.experiment.NextRoundPlan`.
:::

## L2 analysis surface

:::{glossary}
pt.fit
  The auto-selecting fit entry point at the top-level `pytyche` namespace. Inspects the extracted outcome array `Y` and treatment cardinality `K = len(observed.variants)`, then dispatches deterministically to one of three underlying fit functions:

  * `Y` all `{0, 1}` → `fit_binary_bcf` → returns `BinaryBCFResult`
  * `Y` float dtype, < 30% zero entries → `fit_continuous_bcf` → returns `ContinuousBCFResult`
  * `Y` float dtype, ≥ 30% zero entries with a positive non-zero tail → `fit_hurdle_bcf(..., pooling="joint")` → returns `HurdleBCFResult` (at any K, including K ≥ 3)

  ```python
  posterior = pt.fit(observed)                          # auto-selects fit
  posterior = pt.fit(observed, pooling="independent")   # kwarg forwarded verbatim
  ```

  The same `observed` always selects the same fit function (deterministic). `**kwargs` forward verbatim to the dispatched fit. Users who need explicit control call `pt.fit_binary_bcf`, `pt.fit_continuous_bcf`, or `pt.fit_hurdle_bcf` directly.

  Edge cases: all-zero `Y` raises `ValueError` naming both binary and hurdle interpretations; multi-arm `Z` with binary or continuous `Y` raises `NotImplementedError` (multi-arm binary / continuous BCF is not yet shipped).
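The dispatch rule can be sketched as a pure function over the outcome values. This is a simplified illustration, not `pt.fit` itself: it works on a plain list rather than an `ObservedExperimentData`, skips dtype checks, does not validate that the non-zero tail is positive, and omits the multi-arm `NotImplementedError` branch. `select_fit` is a hypothetical name.

```python
def select_fit(y, zero_threshold=0.30):
    """Return the name of the fit function the rules above would pick."""
    if not y:
        raise ValueError("empty outcome array")
    if all(v == 0 for v in y):
        # Ambiguous: an all-failure binary outcome or a hurdle outcome
        # with no converters.
        raise ValueError("all-zero Y: ambiguous between binary and hurdle")
    if all(v in (0, 1) for v in y):
        return "fit_binary_bcf"
    zero_share = sum(1 for v in y if v == 0) / len(y)
    if zero_share >= zero_threshold:
        return "fit_hurdle_bcf"
    return "fit_continuous_bcf"


# Hypothetical outcome arrays.
revenue = [0, 0, 0, 12.5, 0, 40.0, 0, 0, 7.25, 0]   # 70% zeros
converted = [0, 1, 1, 0]                            # binary conversions
basket = [3.2, 0.0, 5.1, 4.4, 2.0]                  # 20% zeros

choices = [select_fit(revenue), select_fit(converted), select_fit(basket)]
```

Because the function depends only on the values of `y`, the same input always selects the same fit, which mirrors the determinism described above.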
  The 30% zero-density threshold is a semi-empirical starting point for e-commerce revenue distributions: generous enough to catch typical revenue data, conservative enough to keep non-hurdle continuous data on the continuous path.

  Defined at `pytyche.fit`. Internal dispatch helper at `pytyche.bcf.dispatch._dispatch_fit` (or similar).

observed data stashing
  The contract that every posterior result type (`HurdleBCFResult`, `ContinuousBCFResult`, `BinaryBCFResult`) carries the `ObservedExperimentData` it was fit on. Analysis methods reach their inputs through `posterior.observed`, not through a separately-passed handle, so fit-time and analysis-time data encodings cannot drift.

  ```python
  posterior = pt.fit(observed)

  # downstream methods reach the data through the posterior
  tree = posterior.fit_policy_tree()

  # derived results share the same observed by identity
  assert tree.observed is posterior.observed
  ```

  Derived results (`PolicyTreeResult`, calibrated posteriors) hold the same reference by identity, so the cost of stashing is paid once at fit time, not per-derivation. The {term}`observed_copy parameter` controls what kind of stash is created.

  Follows the sklearn idiom (`.X_train_` and similar): downstream operations on a fitted result reach into the input data through the result.

observed_copy parameter
  The kwarg on every fit entry point (`pt.fit`, `pt.fit_hurdle_bcf`, etc.) that controls how the input `ObservedExperimentData` is stashed on the resulting posterior. Three modes:

  * `"view"` (default): shallow clone of the dataclass with each visitors DataFrame rebuilt over read-only numpy views of the original columns. Zero data-buffer copy; in-place mutation through the stash raises `ValueError` ("assignment destination is read-only"). Buffers are still shared with the caller's original handle, so mutation through the original is reflected in the stash.
  * `"deep"`: `copy.deepcopy(observed)` at fit time.
    Doubles memory for input data; provides a bit-stable stash independent of any subsequent mutation to the original handle.
  * `"ref"`: `posterior.observed is observed` directly. No view wrappers; no protection. For the power-user case that wants the cheapest possible path and accepts mutation risk on both sides.

  Any other value raises `ValueError` naming the three valid modes.

  See {term}`observed data stashing` for the propagation contract through derived results.

capability methods
  A pair of pure getters on every posterior result type that enable conditional downstream logic without triggering heavyweight computation:

  * `has_credible_segments(threshold=0.80) -> bool`: `True` iff at least one segment in `posterior.analyze().segments` has `stability_score >= threshold`. The default threshold matches the `ExpectedLossRule` SHIP-gate stability floor.
  * `has_decomposition() -> bool`: `True` for `HurdleBCFResult` (the two-channel hurdle decomposition into conversion + severity); `False` for `ContinuousBCFResult` and `BinaryBCFResult`.

  Both are pure: no state mutation, no side effects, deterministic given the posterior. The canonical branch pattern:

  ```python
  if posterior.has_credible_segments():
      tree = posterior.fit_policy_tree()
      # ship tree-based policy
  elif posterior.has_decomposition():
      # hurdle posterior with no credible segments yet: inspect
      # channel decomposition to diagnose
      ...
  ```

  Defined on each result type in `pytyche.bcf`. The threshold default (0.80) is shared with {term}`stability score`'s credibility cutoff.
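The threshold check behind `has_credible_segments` can be pictured as an `any` over stability scores. The sketch below is a stand-in that takes a plain list of scores rather than a posterior, so its signature differs from the real method. It also shows one consequence of ordinary float comparison: `NaN` scores, such as those produced when the bootstrap is suppressed, never meet the threshold. Whether the library treats `NaN` this way is an assumption, not a documented guarantee.

```python
def has_credible_segments(stability_scores, threshold=0.80):
    """True iff at least one score meets the threshold (NaN never does)."""
    return any(score >= threshold for score in stability_scores)


# Hypothetical per-segment scores.
scores_with_bootstrap = [0.62, 0.84, 0.40]
scores_without_bootstrap = [float("nan")] * 3   # e.g. bootstrap suppressed

credible = has_credible_segments(scores_with_bootstrap)
credible_without = has_credible_segments(scores_without_bootstrap)
```

Under this reading, a caller that disables the bootstrap would always fall through to the `has_decomposition()` branch of the pattern above.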
pt.viz namespace
  The `pytyche.viz` submodule exposing five matplotlib-backed visualization primitives:

  * `pt.viz.plot_cells(cells, ax=None)`: horizontal bar chart of cell weights for one round
  * `pt.viz.plot_policy_tree(tree_policy, ax=None)`: tree diagram from a {term}`PolicyTreeResult`
  * `pt.viz.plot_segment_intervals(segments, ax=None)`: forest plot of per-segment gate estimates and 80% credible intervals
  * `pt.viz.plot_calibration(calibrated_posterior, reference=None, ax=None)`: R(p) calibration curve, optionally overlaid with a reference (uncalibrated) curve
  * `pt.viz.experiment_evolution_gif(history, output_path, fps=1)`: animated GIF rendering round-by-round cell structure and policy tree evolution

  Each static primitive accepts an `ax` parameter for matplotlib subplot composition. When `ax=None`, a new figure and axes are created and returned. The GIF helper renders to disk and returns the path.

  `matplotlib` is imported lazily: `import pytyche` does NOT trigger the matplotlib import. The cost is paid only when a `pt.viz.*` function is first called. `matplotlib >= 3.7` and either `imageio >= 2.31` or `Pillow >= 10.0` are base dependencies in `pyproject.toml`.

  Importable as `import pytyche.viz as ptviz` or `from pytyche import viz`. The five names are also importable directly: `from pytyche.viz import plot_cells, ...`.

  Defined in `pytyche.viz`.
:::

## Calibration, truth, sim mode

:::{glossary}
calibration
  On-path recalibration that corrects BCF posterior coverage at scale. Three construction paths:

  ```python
  pt.Calibration.from_sweep('clustered_realistic_v1')   # shipped artifact
  pt.Calibration.from_sweep('/path/to/sweep.json')      # user-fitted
  pt.Calibration.skip()                                 # uncorrected
  ```

  When calibration is specified, the library applies the correction automatically. When `skip()` is used, uncalibrated posteriors are explicitly labeled in result objects and the library emits a warning on first fit.
  A `Calibration` instance is a frozen dataclass carrying:

  * `correction`: the `LayeredCalibrationCorrection` (layered R(p) + scale-family correction payload)
  * Regime metadata: `metric` (the outcome metric the sweep was fitted on, e.g. `"revenue_per_visitor"`), `n_treatments` (the K of the fitted sweep), `pooling` (the {term}`pooling` mode of the sweep, `"joint"` or `"independent"`)
  * `applies_to(observed: ObservedExperimentData) -> bool`: `True` iff `observed.metric`, `len(observed.variants)`, and the {term}`pooling` mode all match the artifact's regime metadata.

  `posterior.apply_calibration(calibration)` raises `ValueError` naming the mismatched dimension(s) when `applies_to` returns `False`. This prevents silently applying a K=2 revenue correction to a K=3 conversion-rate posterior.

  `from_sweep` / `skip` constructors and the shipped artifact registry land with `sequential-experiment-api`. The type's minimal contract (frozen dataclass + regime metadata + `applies_to`) is owned by this L2 surface.

  Canonical home: `pytyche.calibrate.Calibration`, re-exported as `pt.Calibration`. Calibration machinery lives in `src/pytyche/calibrate/`.

calibration truth
  Ground truth for a single calibration or simulation run: per-visitor CATE, hurdle decomposition (p0, p1, m0, m1), and effect components. Lives only in the sim and calibration paths; analysis code cannot peek at it because the type is segregated via `CalibrationBundle`.

  Defined as `contracts.CalibrationTruth`.

truth comparison
  Per-round truth-vs-estimate metrics, populated only in sim mode. `None` when the experiment runs in real-data mode.
  Six fields:

  * `cate_rmse`: root mean square error of estimated CATE against truth
  * `policy_accuracy`: fraction of visitors for whom the recommended treatment matches the truth-optimal treatment
  * `oracle_gap_rpv`: RPV regret of the recommended policy vs the oracle policy
  * `rpv_policy`, `rpv_uniform`, `rpv_oracle`: RPV under the recommended policy, uniform random allocation, and the oracle policy respectively

  Defined as `pytyche.experiment.TruthComparison`.

SBC (in pytyche)
  In this codebase "SBC" is used loosely for **simulation-based coverage evaluation and correction**: generate data from known ground truth, measure how the posterior's credible intervals (and decisions) actually perform, and fit a correction. It is **not** the classical rank-statistic Simulation-Based Calibration of Talts et al. (2018), which checks posterior-rank uniformity.

  Two modules carry the "SBC" label and neither implements that rank procedure:

  * `pytyche.calibrate.sbc`: oracle-decision and regret evaluation against planted truth (does the recommended decision match the oracle; what regret does it incur).
  * `scripts/fit_sbc_correction.py`: fits the isotonic R(p) coverage correction (nominal → empirical coverage mapping).

  Read "SBC" here as the umbrella for that simulate-then-correct workflow, not as the Talts diagnostic.
:::

## Setup and environment

:::{glossary}
setup report
  The structured output of `pt.check_setup()`. Carries the pytyche version, JAX device list, CUDA availability, bartz version, calibration registry state, and a recommended install command when GPU is absent.

  ```python
  report = pt.check_setup()
  if not report.cuda_available:
      print(report.recommended_install)
  ```

  Defined as `pytyche.SetupReport`.
:::

## Contract types: quick reference

The contract types this glossary references and where they live:

| Term | Type | Module |
| --- | --- | --- |
| Decision (enum) | `Decision` | `contracts` |
| Observed experiment data | `ObservedExperimentData` | `contracts` |
| Variant data | `VariantData` | `contracts` |
| Visitor schema | `VISITOR_SCHEMA` | `contracts` |
| Segment rule | `SegmentRule` (union: `EqRule`, `InRule`, `ComparisonRule`, `BetweenRule`) | `contracts` |
| Discovered segment | `DiscoveredSegment` | `contracts` |
| Aligned visitor array | `AlignedVisitorArray` | `contracts` |
| Decomposition samples | `DecompositionSamples` | `contracts` |
| Comparison result | `ComparisonResult` | `contracts` |
| Recommendation summary (type) | `RecommendationSummary` | `contracts` |
| Recommendation summary (function) | `recommendation_summary()` | `compare.variants` |
| Decision thresholds | `DecisionThresholds` | `compare.variants` |
| Analysis result | `AnalysisResult` | `contracts` |
| Policy tree result | `PolicyTreeResult` | `contracts` |
| Calibration | `Calibration` | `calibrate` |
| Layered calibration correction | `LayeredCalibrationCorrection` | `calibrate.layered` |
| Calibration truth | `CalibrationTruth` | `contracts` |
| Calibration bundle | `CalibrationBundle` | `contracts` |
| Calibration record | `CalibrationRecord` | `contracts` |
| Compare variants | `compare_variants()` | `compare.variants` |
| Claim level | `ClaimLevel` (enum) | `contracts` |
| Metric family | `MetricFamily` (enum) | `contracts` |