---
title: "Decision-theoretic inputs"
review-state: drafting
last-human-review: "2026-06-11"
depends-on:
  - src/pytyche/contracts.py
  - src/pytyche/experiment/recommendation.py
  - src/pytyche/analysis/_recommendation.py
  - src/pytyche/analysis/_thompson.py
owner: unowned
quadrant: concept
---

# Decision-theoretic inputs

> :::{note}
> All four inputs below are available today. Worked examples of custom
> `GraduationRule` implementations and guidance on calibrating loss
> thresholds are planned additions to this page (see the list at the
> end).
> :::

Pytyche's stance: **the library surfaces decision-theoretic inputs; the
operator (or the operator's policy) makes the decision.** Most
experimentation platforms ship a verdict (ship / don't ship) without
exposing how the model arrived at it. Pytyche exposes the inputs and lets
the operator wire them into whatever rule fits their business.

## The inputs

The public surface carries four decision-theoretic quantities across two
types:

**On `RecommendationSummary`:**

- `expected_loss_baseline: float` — per-visitor regret of choosing the
  baseline arm when the comparison would have been better. Integral over
  the posterior on `τ = E[Y | comparison] − E[Y | baseline]` of
  `max(τ, 0)`. Outcome units (dollars per visitor for revenue outcomes).
- `expected_loss_comparison: float` — per-visitor regret of choosing the
  comparison arm when the baseline would have been better.
  `E[max(-τ, 0)]` over the posterior. Same units as the baseline variant.
- `expected_value_of_one_more_round: float` — the expected reduction in
  decision loss from being able to re-decide after one more round of data
  at the same per-round n. Same units as the expected losses
  (loss-reduction per visitor).

  **The computation** is the closed-form preposterior expected value of
  sample information (EVSI; Raiffa–Schlaifer two-action form) on a normal
  approximation of the lift posterior. From the scope's per-draw mean
  contrast samples: `μ = mean`, `σ = sd`.
  One more round at the same per-round n doubles the data behind the
  posterior, so the *preposterior* standard deviation of the future
  posterior mean is `s = σ/√2`, and

  ```
  EVOR = s · (φ(z) − z · Φ(−z)),    z = |μ| / s
  ```

  where `φ`/`Φ` are the standard normal pdf/cdf (the unit normal loss
  integral). Degenerate `σ = 0` gives exactly `0.0`.

  **When to trust it:** the normal approximation is CLT-justified because
  the lift is a mean over visitors; it is least reliable at very small n
  or for segment scopes with few members. Properties to lean on: it is
  strictly below the chosen side's expected loss mid-experiment (the next
  round reduces risk, never erases it), and it falls to ~0 once the
  decision is unambiguous — "more data will not change the call" is
  readable directly from this number. Monte-Carlo simulation over
  posterior-predictive next rounds was considered and rejected: it
  requires refits per simulated round, is noisy at practical budgets, and
  the closed form already has the correct limits.

**On `DiscoveredSegment`:**

- `arm_best_probabilities: dict[str, float]` — per-arm posterior
  probability that this arm is best in this segment, under the shared
  best-arm rule (control wins a draw exactly when every contrast is
  non-positive). Keyed by ALL variant names *including the control*;
  values sum to 1.0. Computed as per-draw win frequencies of the
  segment-mean contrast vector — the same computation
  `thompson_allocation` floors and returns as allocation weights.
  Per-segment companion to the global expected-loss contrast.

## The default graduation rule

Pytyche ships `ExpectedLossRule` as the default graduation rule.
The rule fires for a (treatment, segment) pair when ALL of:

- `expected_loss_comparison < expected_loss_max` (operator-set tolerance,
  outcome units)
- `probability_positive > p_positive_threshold` (operator-set per-round
  threshold)
- `probability_better > p_better_threshold` (operator-set per-round
  threshold)

AND the per-round condition has held across `sustained_rounds` consecutive
rounds in the experiment's history.

`expected_value_of_one_more_round` and `arm_best_probabilities` are NOT
inputs to the default rule. They are surfaced as fields for
operator-defined rules that want to consume them: pass a custom
`GraduationRule` implementation to
`pt.sequential_experiment(graduation_rule=...)` to act on the extended
inputs.

## Planned additions to this page

- Worked examples of custom `GraduationRule` implementations that consume
  the extended inputs (cost-aware stopping, per-segment ship-with-caveat,
  drop-treatment policies)
- Calibration of the loss tolerance `ε_loss` (the `expected_loss_max`
  parameter of the default rule) against domain context
- Cross-references to `pytyche.diagnostics` for sample-count and
  convergence sanity checks before trusting any of these inputs

## Cross-references

- [Sequential targeting](sequential-targeting.md) — why Thompson
  allocation plus per-round refits is the substrate.
- [Statistical honesty](statistical-honesty.md) — why surfacing
  decision-theoretic inputs is the honest alternative to shipping a
  verdict.
- API reference: `pytyche.contracts.RecommendationSummary`,
  `pytyche.contracts.DiscoveredSegment`, `pytyche.ExpectedLossRule`.
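
## Illustrative sketch: the inputs from posterior draws

The definitions on this page can be checked by hand against a set of posterior draws. The following is a minimal standard-library sketch, not Pytyche's implementation: the function names `expected_losses`, `value_of_one_more_round`, and `best_arm_frequencies` are hypothetical, the population standard deviation is an assumption (the library may use a different estimator), and breaking ties between equally good treatments by list order is also an assumption.

```python
from math import sqrt
from statistics import NormalDist, fmean, pstdev

_STD_NORMAL = NormalDist()  # standard normal: .pdf is φ, .cdf is Φ


def expected_losses(tau_draws):
    """Monte Carlo estimates of (E[max(τ, 0)], E[max(-τ, 0)]).

    The first value is the regret of keeping the baseline, the second
    the regret of choosing the comparison.
    """
    loss_baseline = fmean([max(t, 0.0) for t in tau_draws])
    loss_comparison = fmean([max(-t, 0.0) for t in tau_draws])
    return loss_baseline, loss_comparison


def value_of_one_more_round(tau_draws):
    """EVOR = s * (φ(z) - z * Φ(-z)) with s = σ/√2 and z = |μ|/s."""
    mu = fmean(tau_draws)
    sigma = pstdev(tau_draws)  # assumption: population sd of the draws
    if sigma == 0.0:
        return 0.0  # degenerate posterior: more data cannot change the call
    s = sigma / sqrt(2.0)
    z = abs(mu) / s
    return s * (_STD_NORMAL.pdf(z) - z * _STD_NORMAL.cdf(-z))


def best_arm_frequencies(contrast_draws, control, treatments):
    """Per-draw win frequencies, keyed by every arm including the control.

    Each row holds one draw of the treatment-minus-control contrasts, in
    the order of `treatments`. The control wins a draw exactly when every
    contrast in it is non-positive.
    """
    wins = {name: 0 for name in (control, *treatments)}
    for row in contrast_draws:
        best = max(range(len(row)), key=row.__getitem__)  # first max on ties
        winner = control if row[best] <= 0.0 else treatments[best]
        wins[winner] += 1
    return {name: count / len(contrast_draws) for name, count in wins.items()}


# Toy draws: four posterior draws of two treatment-vs-control contrasts.
toy_contrasts = [[-0.1, -0.2], [0.3, 0.1], [0.2, 0.4], [-0.5, 0.1]]
toy_probs = best_arm_frequencies(toy_contrasts, "control", ["a", "b"])
print(toy_probs)  # → {'control': 0.25, 'a': 0.25, 'b': 0.5}
```

Two identities make useful sanity checks on any set of draws: the two expected losses differ by exactly the posterior mean lift, because `max(τ, 0) − max(−τ, 0) = τ`, and the best-arm frequencies sum to 1.0 because each draw has exactly one winner.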