pytyche.summarize
Empirical (non-Bayesian) summarization of observed experiment data.
Pure functions computing counts, rates, and lifts from
ObservedExperimentData. No posterior inference — just arithmetic
on the observed data. Serves as the first consumer of v2 contracts.
Format parity invariant: summarize_v2 accepts both generator output
and production-loaded data identically, since both produce
ObservedExperimentData conforming to VISITOR_SCHEMA.
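Functions
apply_rule(df, rule) | Boolean mask: True for visitors matching ALL clauses (AND-combined). |
summarize_hurdle_components(observed) | Empirical hurdle decomposition in RPV units. |
summarize_v2(observed[, segments, strict]) | Compute empirical summary from observed experiment data. |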
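Classes
EmpiricalSummary(experiment_id, metric, variants, lift, segments) | Complete empirical summary of an experiment. |
LiftSummary(baseline, comparison, metric, ...) | Lift between two variants for a single metric. |
SegmentSummary(rule, n_visitors, pct_of_total, variants, lift) | Per-segment breakdown with variant stats and lift. |
VariantSummary(name, n_visitors, n_conversions, ...) | Per-variant empirical summary. |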
- pytyche.summarize.summarize_hurdle_components(observed)
Empirical hurdle decomposition in RPV units.
Raises ValueError if the metric is not revenue_per_visitor or if there are not exactly 2 arms.
Uses the same additive decomposition as CalibrationTruth:
conv_effect = (p1_hat - p0_hat) * m0_hat
aov_effect = p1_hat * (m1_hat - m0_hat)
total = conv_effect + aov_effect
where p_hat is the conversion rate and m_hat is the mean AOV among converters. Guard: m_hat = 0.0 for an arm with no converters.
- Parameters:
observed (ObservedExperimentData)
- Return type:
dict[str, float]
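The decomposition above can be checked by hand on a small example. The following is a minimal sketch, not the library's implementation: it assumes each arm is a non-empty list with one revenue value per visitor, that a visitor counts as a converter when revenue is positive, and that the result uses the keys `conv_effect`, `aov_effect`, and `total` (the key names of the real return value are not documented here). The function name `hurdle_components` is hypothetical.

```python
from statistics import fmean


def hurdle_components(control_revenue, treatment_revenue):
    """Additive decomposition of the RPV difference between two arms.

    Hypothetical helper: each argument is a non-empty list holding one
    revenue value per visitor; revenue > 0 marks a converter (assumption).
    """

    def rate_and_aov(revenues):
        converters = [r for r in revenues if r > 0]
        p_hat = len(converters) / len(revenues)
        # Documented guard: m_hat = 0.0 when the arm has no converters.
        m_hat = fmean(converters) if converters else 0.0
        return p_hat, m_hat

    p0_hat, m0_hat = rate_and_aov(control_revenue)
    p1_hat, m1_hat = rate_and_aov(treatment_revenue)

    conv_effect = (p1_hat - p0_hat) * m0_hat
    aov_effect = p1_hat * (m1_hat - m0_hat)
    return {
        "conv_effect": conv_effect,
        "aov_effect": aov_effect,
        "total": conv_effect + aov_effect,
    }


control = [0.0, 0.0, 10.0, 30.0]     # p0_hat = 0.5,  m0_hat = 20.0, RPV = 10.0
treatment = [0.0, 20.0, 30.0, 40.0]  # p1_hat = 0.75, m1_hat = 30.0, RPV = 22.5
components = hurdle_components(control, treatment)
```

Because p1_hat * m1_hat - p0_hat * m0_hat expands to exactly the two terms, `total` equals the observed RPV difference between the arms (12.5 in this example).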
- class pytyche.summarize.VariantSummary(name, n_visitors, n_conversions, conversion_rate, total_revenue, revenue_per_visitor)
Bases: object
Per-variant empirical summary.
- Parameters:
name (str)
n_visitors (int)
n_conversions (int)
conversion_rate (float)
total_revenue (float)
revenue_per_visitor (float)
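To illustrate how these fields relate, here is a rough sketch that fills a stand-in record from raw per-visitor revenue. `VariantSummarySketch` and `summarize_variant` are hypothetical names, not part of pytyche. The definitions used (conversion when revenue is positive, rate = conversions / visitors, RPV = total revenue / visitors, zero for an empty arm) are assumptions based on the field names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VariantSummarySketch:
    """Hypothetical stand-in mirroring the documented VariantSummary fields."""

    name: str
    n_visitors: int
    n_conversions: int
    conversion_rate: float
    total_revenue: float
    revenue_per_visitor: float


def summarize_variant(name, revenues):
    """Build a summary from a list with one revenue value per visitor."""
    n_visitors = len(revenues)
    # Assumption: a visitor converted if their revenue is positive.
    n_conversions = sum(1 for r in revenues if r > 0)
    total_revenue = float(sum(revenues))
    return VariantSummarySketch(
        name=name,
        n_visitors=n_visitors,
        n_conversions=n_conversions,
        # Assumption: report 0.0 rather than dividing by zero for an empty arm.
        conversion_rate=n_conversions / n_visitors if n_visitors else 0.0,
        total_revenue=total_revenue,
        revenue_per_visitor=total_revenue / n_visitors if n_visitors else 0.0,
    )


control_summary = summarize_variant("control", [0.0, 0.0, 10.0, 30.0])
```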
- class pytyche.summarize.LiftSummary(baseline, comparison, metric, baseline_value, comparison_value, absolute_lift, relative_lift)
Bases: object
Lift between two variants for a single metric.
- Parameters:
baseline (str)
comparison (str)
metric (str)
baseline_value (float)
comparison_value (float)
absolute_lift (float)
relative_lift (float | None)
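The `float | None` type on `relative_lift` suggests the relative lift can be undefined. One plausible reading, sketched below, is that the absolute lift is the difference of the two metric values and the relative lift is that difference divided by the baseline, with `None` returned when the baseline is zero. These formulas and the helper name `compute_lift` are assumptions, not confirmed behavior of pytyche.

```python
from typing import Optional, Tuple


def compute_lift(
    baseline_value: float, comparison_value: float
) -> Tuple[float, Optional[float]]:
    """Return (absolute_lift, relative_lift) for one metric.

    Assumed definitions: absolute = comparison - baseline,
    relative = absolute / baseline, or None when the baseline is zero.
    """
    absolute_lift = comparison_value - baseline_value
    if baseline_value == 0:
        # A ratio against a zero baseline is undefined, so signal it with None.
        return absolute_lift, None
    return absolute_lift, absolute_lift / baseline_value


abs_lift, rel_lift = compute_lift(10.0, 12.5)
zero_abs, zero_rel = compute_lift(0.0, 3.0)
```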
- class pytyche.summarize.SegmentSummary(rule, n_visitors, pct_of_total, variants, lift)
Bases: object
Per-segment breakdown with variant stats and lift.
- Parameters:
rule (SegmentRule)
n_visitors (int)
pct_of_total (float)
variants (list[VariantSummary])
lift (LiftSummary)
- class pytyche.summarize.EmpiricalSummary(experiment_id, metric, variants, lift, segments)
Bases: object
Complete empirical summary of an experiment.
- Parameters:
experiment_id (str)
metric (str)
variants (list[VariantSummary])
lift (LiftSummary)
segments (list[SegmentSummary])
- pytyche.summarize.apply_rule(df, rule)
Boolean mask: True for visitors matching ALL clauses (AND-combined).
NaN values in feature columns produce False: a visitor with missing data does not match any rule.
- Parameters:
df (DataFrame)
rule (SegmentRule)
- Return type:
Series
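The AND-combination and the NaN behavior can be sketched with plain pandas. The structure of `SegmentRule` is not described here, so this example assumes a hypothetical clause format of `(column, operator, value)` tuples. `apply_rule_sketch` is an illustrative name, not the library function.

```python
import operator

import pandas as pd

# Hypothetical clause format: (column, operator symbol, value).
OPS = {
    "==": operator.eq,
    "!=": operator.ne,
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
}


def apply_rule_sketch(df, clauses):
    """Return a boolean Series that is True where every clause matches."""
    mask = pd.Series(True, index=df.index)
    for column, op, value in clauses:
        matches = OPS[op](df[column], value)
        # "!=" evaluates to True against missing values, so require notna()
        # explicitly to keep the "missing data never matches" behavior.
        mask &= matches & df[column].notna()
    return mask


visitors = pd.DataFrame(
    {
        "device": ["mobile", "desktop", None, "mobile"],
        "visits": [3.0, 5.0, 4.0, float("nan")],
    }
)
mobile_repeat = apply_rule_sketch(
    visitors, [("device", "==", "mobile"), ("visits", ">=", 2)]
)
not_desktop = apply_rule_sketch(visitors, [("device", "!=", "desktop")])
```

Only the first visitor satisfies both clauses in `mobile_repeat`. The third visitor has no `device` value and the fourth has no `visits` value, so neither matches. In `not_desktop`, the visitor with a missing `device` is excluded as well, even though a bare `!=` comparison would have matched.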
- pytyche.summarize.summarize_v2(observed, segments=None, *, strict=True)
Compute empirical summary from observed experiment data.
Validates observed at entry (fail-closed).
- Parameters:
observed (ObservedExperimentData) – The experiment data to summarize.
segments (list[SegmentRule] | None) – Optional list of segment rules for breakdown. Each rule produces a SegmentSummary.
strict (bool) – Passed through to validate_observed_data. Set False to allow asymmetric feature columns across variants.
- Returns:
Summary with per-variant stats, lift, and optional segment breakdown.
- Return type:
EmpiricalSummary
- Raises:
ValueError – If the experiment does not have exactly 2 variants.
SchemaViolation – If observed data fails validation.