pytyche.summarize

Empirical (non-Bayesian) summarization of observed experiment data.

Pure functions computing counts, rates, and lifts from ObservedExperimentData. There is no posterior inference, only arithmetic on the observed data. This module serves as the first consumer of the v2 contracts.

Format parity invariant: summarize_v2 accepts both generator output and production-loaded data identically, since both produce ObservedExperimentData conforming to VISITOR_SCHEMA.

Functions

`apply_rule`(df, rule)	Boolean mask: True for visitors matching ALL clauses (AND-combined).
`summarize_hurdle_components`(observed)	Empirical hurdle decomposition in RPV units.
`summarize_v2`(observed[, segments, strict])	Compute empirical summary from observed experiment data.

Classes

`EmpiricalSummary`(experiment_id, metric, ...)	Complete empirical summary of an experiment.
`LiftSummary`(baseline, comparison, metric, ...)	Lift between two variants for a single metric.
`SegmentSummary`(rule, n_visitors, ...)	Per-segment breakdown with variant stats and lift.
`VariantSummary`(name, n_visitors, ...)	Per-variant empirical summary.

pytyche.summarize.summarize_hurdle_components(observed)

Empirical hurdle decomposition in RPV units.

Raises ValueError if the metric is not revenue_per_visitor or if the experiment does not have exactly 2 arms.

Uses the same additive decomposition as CalibrationTruth:

conv_effect = (p1_hat - p0_hat) * m0_hat
aov_effect  = p1_hat * (m1_hat - m0_hat)
total       = conv_effect + aov_effect

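Here p_hat is the conversion rate and m_hat is the mean AOV among converters, with subscripts 0 and 1 indexing the two arms. Because RPV = p_hat * m_hat, total equals the RPV difference p1_hat * m1_hat - p0_hat * m0_hat. Guard: m_hat is set to 0.0 for an arm with no converters.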

Parameters:

observed (ObservedExperimentData)

Return type:

dict[str, float]
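
The decomposition above can be checked by hand with a minimal sketch that works on raw per-arm totals instead of ObservedExperimentData. The function name, its arguments, and the dictionary keys are hypothetical and chosen only for illustration. The keys actually returned by summarize_hurdle_components are not documented here. The sketch assumes both arms have at least one visitor.

```python
def hurdle_decomposition(n0, conv0, rev0, n1, conv1, rev1):
    """Split the RPV difference between two arms into conversion and AOV effects.

    Arguments are visitors, converters, and total revenue for arm 0 and arm 1.
    Hypothetical helper for illustration, not part of pytyche.
    """
    # Conversion rate per arm (assumes n0 > 0 and n1 > 0).
    p0, p1 = conv0 / n0, conv1 / n1
    # Guard: mean AOV is 0.0 when an arm has no converters.
    m0 = rev0 / conv0 if conv0 else 0.0
    m1 = rev1 / conv1 if conv1 else 0.0
    conv_effect = (p1 - p0) * m0
    aov_effect = p1 * (m1 - m0)
    return {
        "conv_effect": conv_effect,
        "aov_effect": aov_effect,
        "total": conv_effect + aov_effect,
    }


parts = hurdle_decomposition(
    n0=1000, conv0=50, rev0=2500.0,
    n1=1000, conv1=60, rev1=3300.0,
)
rpv_difference = 3300.0 / 1000 - 2500.0 / 1000
```

With these inputs the two effects sum to the plain RPV difference between the arms, up to floating-point rounding, which is the property that makes the decomposition additive.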

class pytyche.summarize.VariantSummary(name, n_visitors, n_conversions, conversion_rate, total_revenue, revenue_per_visitor)

Bases: object

Per-variant empirical summary.

Parameters:

name (str)
n_visitors (int)
n_conversions (int)
conversion_rate (float)
total_revenue (float)
revenue_per_visitor (float)
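
As a rough illustration of how these fields relate, the sketch below derives them from one revenue value per visitor. VariantStats and variant_stats are hypothetical stand-ins, not the pytyche classes. Two assumptions are made: a visitor counts as a converter when their revenue is positive, and both rates fall back to 0.0 when there are no visitors.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VariantStats:
    """Hypothetical stand-in mirroring the VariantSummary fields."""
    name: str
    n_visitors: int
    n_conversions: int
    conversion_rate: float
    total_revenue: float
    revenue_per_visitor: float


def variant_stats(name, revenues):
    """Build stats from one revenue value per visitor (0.0 for non-converters)."""
    n = len(revenues)
    # Assumption: positive revenue marks a conversion.
    n_conv = sum(1 for r in revenues if r > 0)
    total = float(sum(revenues))
    # Assumption: rates are 0.0 for an empty variant rather than an error.
    rate = n_conv / n if n else 0.0
    rpv = total / n if n else 0.0
    return VariantStats(name, n, n_conv, rate, total, rpv)


control = variant_stats("control", [0.0, 0.0, 20.0, 0.0, 30.0])
```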

class pytyche.summarize.LiftSummary(baseline, comparison, metric, baseline_value, comparison_value, absolute_lift, relative_lift)

Bases: object

Lift between two variants for a single metric.

Parameters:

baseline (str)
comparison (str)
metric (str)
baseline_value (float)
comparison_value (float)
absolute_lift (float)
relative_lift (float | None)
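
The optional type on relative_lift suggests that the relative lift can be undefined. The sketch below shows one plausible reading, with two assumptions: lift is measured as comparison minus baseline, and the relative lift is None when the baseline value is zero. compute_lift is a hypothetical helper, not a pytyche function.

```python
from typing import Optional, Tuple


def compute_lift(
    baseline_value: float, comparison_value: float
) -> Tuple[float, Optional[float]]:
    """Return (absolute_lift, relative_lift) for a single metric."""
    absolute = comparison_value - baseline_value
    # Assumption: relative lift is undefined (None) for a zero baseline.
    relative = absolute / baseline_value if baseline_value != 0 else None
    return absolute, relative


abs_lift, rel_lift = compute_lift(0.05, 0.06)
zero_abs, zero_rel = compute_lift(0.0, 0.02)
```

Returning None rather than infinity or NaN keeps the undefined case explicit for callers that serialize or display the summary.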

class pytyche.summarize.SegmentSummary(rule, n_visitors, pct_of_total, variants, lift)

Bases: object

Per-segment breakdown with variant stats and lift.

Parameters:

rule (SegmentRule)
n_visitors (int)
pct_of_total (float)
variants (list[VariantSummary])
lift (LiftSummary)

class pytyche.summarize.EmpiricalSummary(experiment_id, metric, variants, lift, segments)

Bases: object

Complete empirical summary of an experiment.

Parameters:

experiment_id (str)
metric (str)
variants (list[VariantSummary])
lift (LiftSummary)
segments (list[SegmentSummary])

pytyche.summarize.apply_rule(df, rule)

Boolean mask: True for visitors matching ALL clauses (AND-combined).

NaN values in feature columns produce False — a visitor with missing data does not match any rule.

Parameters:

df (DataFrame)
rule (SegmentRule)

Return type:

Series
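
The AND-combination and the NaN behavior can be sketched with plain pandas. The structure of SegmentRule is not shown on this page, so the sketch represents a rule as a list of hypothetical (column, operator, value) tuples. match_all and OPS are illustrative names, not part of pytyche.

```python
import operator

import numpy as np
import pandas as pd

# Hypothetical operator table: the clause format is an assumption.
OPS = {
    "==": operator.eq,
    "!=": operator.ne,
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
}


def match_all(df, clauses):
    """Return a boolean Series: True where a row satisfies every clause."""
    mask = pd.Series(True, index=df.index)
    for column, op, value in clauses:
        # notna() forces missing values to False, including for "!=",
        # where a bare comparison against NaN would evaluate to True.
        mask &= df[column].notna() & OPS[op](df[column], value)
    return mask


visitors = pd.DataFrame({
    "device": ["mobile", "desktop", None, "mobile"],
    "sessions": [3.0, 5.0, 4.0, np.nan],
})
mask = match_all(visitors, [("device", "==", "mobile"), ("sessions", ">=", 2)])
not_mobile = match_all(visitors, [("device", "!=", "mobile")])
```

Only the first visitor matches both clauses. The last one is a mobile visitor but has a missing sessions value, so the combined mask is False for that row.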

pytyche.summarize.summarize_v2(observed, segments=None, *, strict=True)

Compute empirical summary from observed experiment data.

Validates observed on entry and fails closed: data that does not pass validation raises SchemaViolation instead of being summarized.

Parameters:

observed (ObservedExperimentData) – The experiment data to summarize.
segments (list[SegmentRule] | None) – Optional list of segment rules for breakdown. Each rule produces a SegmentSummary.
strict (bool) – Passed through to validate_observed_data. Set to False to allow asymmetric feature columns across variants.

Returns:

Summary with per-variant stats, lift, and optional segment breakdown.

Return type:

EmpiricalSummary

Raises:

ValueError – If the experiment does not have exactly 2 variants.
SchemaViolation – If observed data fails validation.