pytyche.summarize

Empirical (non-Bayesian) summarization of observed experiment data.

Pure functions computing counts, rates, and lifts from ObservedExperimentData. No posterior inference — just arithmetic on the observed data. Serves as the first consumer of v2 contracts.

Format parity invariant: summarize_v2 accepts both generator output and production-loaded data identically, since both produce ObservedExperimentData conforming to VISITOR_SCHEMA.

Functions

apply_rule(df, rule)

Boolean mask: True for visitors matching ALL clauses (AND-combined).

summarize_hurdle_components(observed)

Empirical hurdle decomposition in RPV units.

summarize_v2(observed[, segments, strict])

Compute empirical summary from observed experiment data.

Classes

EmpiricalSummary(experiment_id, metric, ...)

Complete empirical summary of an experiment.

LiftSummary(baseline, comparison, metric, ...)

Lift between two variants for a single metric.

SegmentSummary(rule, n_visitors, ...)

Per-segment breakdown with variant stats and lift.

VariantSummary(name, n_visitors, ...)

Per-variant empirical summary.

pytyche.summarize.summarize_hurdle_components(observed)[source]

Empirical hurdle decomposition in RPV units.

Raises ValueError if the metric is not revenue_per_visitor or if the experiment does not have exactly 2 arms.

Uses the same additive decomposition as CalibrationTruth:

conv_effect = (p1_hat - p0_hat) * m0_hat
aov_effect  = p1_hat * (m1_hat - m0_hat)
total       = conv_effect + aov_effect

Here p_hat is the conversion rate and m_hat is the mean AOV among converters; the subscripts 0 and 1 index the two arms. Guard: m_hat is set to 0.0 for an arm with no converters.

Parameters:

observed (ObservedExperimentData)

Return type:

dict[str, float]
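The decomposition above can be checked by hand with a small sketch. This is not the library's implementation: the function name `hurdle_components`, its per-arm total arguments, and the dictionary keys are hypothetical and chosen only for illustration. It shows that the two effects sum to the difference in revenue per visitor between the arms.

```python
def hurdle_components(n0, conv0, rev0, n1, conv1, rev1):
    """Additive RPV decomposition from per-arm totals (illustrative sketch).

    n*: visitors, conv*: converters, rev*: total revenue; arm 0 then arm 1.
    """
    # Conversion rate per arm.
    p0, p1 = conv0 / n0, conv1 / n1
    # Guard from the text: mean AOV is 0.0 when an arm has no converters.
    m0 = rev0 / conv0 if conv0 else 0.0
    m1 = rev1 / conv1 if conv1 else 0.0
    conv_effect = (p1 - p0) * m0
    aov_effect = p1 * (m1 - m0)
    # Key names are assumptions, not the documented return keys.
    return {
        "conv_effect": conv_effect,
        "aov_effect": aov_effect,
        "total": conv_effect + aov_effect,
    }


# Arm 0: 1000 visitors, 100 converters, 5000.0 revenue (p0 = 0.10, m0 = 50).
# Arm 1: 1000 visitors, 120 converters, 6600.0 revenue (p1 = 0.12, m1 = 55).
result = hurdle_components(1000, 100, 5000.0, 1000, 120, 6600.0)
rpv_difference = 6600.0 / 1000 - 5000.0 / 1000
```

Algebraically, (p1 - p0) * m0 + p1 * (m1 - m0) = p1 * m1 - p0 * m0, so `total` equals the RPV difference whenever both arms have converters.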

class pytyche.summarize.VariantSummary(name, n_visitors, n_conversions, conversion_rate, total_revenue, revenue_per_visitor)[source]

Bases: object

Per-variant empirical summary.

Parameters:
  • name (str)

  • n_visitors (int)

  • n_conversions (int)

  • conversion_rate (float)

  • total_revenue (float)

  • revenue_per_visitor (float)
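
As an illustration of how these fields relate to one another, the sketch below derives them from a list of per-visitor revenues. `VariantStats` and `variant_stats` are hypothetical stand-ins, not part of pytyche, and treating a visitor as converted when their revenue is positive is an assumption made only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VariantStats:
    """Hypothetical stand-in mirroring the VariantSummary fields."""

    name: str
    n_visitors: int
    n_conversions: int
    conversion_rate: float
    total_revenue: float
    revenue_per_visitor: float


def variant_stats(name, revenues):
    """Summarize one variant from per-visitor revenues.

    Assumption: one entry per visitor, and revenue > 0 marks a conversion.
    """
    n = len(revenues)
    n_conv = sum(1 for r in revenues if r > 0)
    total = float(sum(revenues))
    # Avoid division by zero for an empty variant.
    rate = n_conv / n if n else 0.0
    rpv = total / n if n else 0.0
    return VariantStats(name, n, n_conv, rate, total, rpv)


control = variant_stats("control", [0.0, 0.0, 20.0, 30.0])
```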

class pytyche.summarize.LiftSummary(baseline, comparison, metric, baseline_value, comparison_value, absolute_lift, relative_lift)[source]

Bases: object

Lift between two variants for a single metric.

Parameters:
  • baseline (str)

  • comparison (str)

  • metric (str)

  • baseline_value (float)

  • comparison_value (float)

  • absolute_lift (float)

  • relative_lift (float | None)
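
The relationship between absolute_lift and relative_lift can be sketched as follows. `compute_lift` is a hypothetical helper, and returning None for the relative lift when the baseline value is zero is an assumption suggested by the `float | None` type, not confirmed behavior.

```python
from __future__ import annotations


def compute_lift(
    baseline_value: float, comparison_value: float
) -> tuple[float, float | None]:
    """Return (absolute_lift, relative_lift) for one metric."""
    absolute = comparison_value - baseline_value
    # Assumption: relative lift is undefined (None) for a zero baseline.
    relative = absolute / baseline_value if baseline_value != 0 else None
    return absolute, relative


abs_lift, rel_lift = compute_lift(0.10, 0.12)   # roughly 0.02 and 0.20
zero_baseline = compute_lift(0.0, 0.5)          # relative lift is None
```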

class pytyche.summarize.SegmentSummary(rule, n_visitors, pct_of_total, variants, lift)[source]

Bases: object

Per-segment breakdown with variant stats and lift.

Parameters:
  • rule

  • n_visitors

  • pct_of_total

  • variants

  • lift

class pytyche.summarize.EmpiricalSummary(experiment_id, metric, variants, lift, segments)[source]

Bases: object

Complete empirical summary of an experiment.

Parameters:
  • experiment_id

  • metric

  • variants

  • lift

  • segments

pytyche.summarize.apply_rule(df, rule)[source]

Boolean mask: True for visitors matching ALL clauses (AND-combined).

NaN values in feature columns produce False — a visitor with missing data does not match any rule.

Parameters:
  • df

  • rule

Return type:

Series
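
The AND-combination and NaN handling described above can be sketched with pandas. The clause shape used here, a `(column, operator, value)` tuple, and the helper name `and_mask` are hypothetical; the actual structure of a rule is not shown in this reference.

```python
from __future__ import annotations

import operator

import pandas as pd

# Hypothetical operator names for the illustrative clause tuples.
_OPS = {
    "==": operator.eq,
    "!=": operator.ne,
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
}


def and_mask(df: pd.DataFrame, clauses: list[tuple[str, str, object]]) -> pd.Series:
    """True only for rows that satisfy every clause."""
    mask = pd.Series(True, index=df.index)
    for column, op, value in clauses:
        col = df[column]
        # Most comparisons against NaN are already False, but "!=" is True,
        # so missing values are excluded explicitly.
        mask &= _OPS[op](col, value) & col.notna()
    return mask


df = pd.DataFrame(
    {
        "country": ["US", "US", None, "CA"],
        "age": [25, float("nan"), 40, 31],
    }
)
adult_us = and_mask(df, [("country", "==", "US"), ("age", ">=", 18)])
not_us = and_mask(df, [("country", "!=", "US")])
```

Only the first row matches `adult_us`: the second row has a missing age, and the third has a missing country. The explicit `notna()` check matters for `not_us`, where a bare `!=` comparison would otherwise count the row with a missing country as a match.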

pytyche.summarize.summarize_v2(observed, segments=None, *, strict=True)[source]

Compute empirical summary from observed experiment data.

Validates observed at entry (fail-closed).

Parameters:
  • observed (ObservedExperimentData) – The experiment data to summarize.

  • segments (list[SegmentRule] | None) – Optional list of segment rules for breakdown. Each rule produces a SegmentSummary.

  • strict (bool) – Passed through to validate_observed_data. Set False to allow asymmetric feature columns across variants.

Returns:

Summary with per-variant stats, lift, and optional segment breakdown.

Return type:

EmpiricalSummary

Raises:
  • ValueError – If the experiment does not have exactly 2 variants.

  • SchemaViolation – If observed data fails validation.