Quickstart

The shortest path to a fitted joint hurdle BCF on a synthetic adaptive-enrichment dataset. Runs in ~20 seconds on JAX-CPU.

import numpy as np
import pandas as pd
from pytyche import generate, GPUBCFConfig, fit_hurdle_bcf

SEGMENTS = {
    "responders":     {"pct": 0.4, "base_conv": 0.08, "treatment_effect": 0.10,
                       "aov_mu": 3.5, "aov_sigma": 0.5, "treatment_aov_mu_shift": 0.15},
    "non_responders": {"pct": 0.6, "base_conv": 0.06, "treatment_effect": 0.0,
                       "aov_mu": 3.3, "aov_sigma": 0.5, "treatment_aov_mu_shift": 0.0},
}
bundle = generate(n_visitors=800, segments=SEGMENTS,
                  metric="revenue_per_visitor", seed=0)

control_df = bundle.observed.variants[0].visitors
treatment_df = bundle.observed.variants[1].visitors
visitors = pd.concat([control_df, treatment_df], ignore_index=True)

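# Encode the segment label as a single integer-coded covariate column.
seg_to_idx = {name: i for i, name in enumerate(SEGMENTS)}
X = visitors["segment"].map(seg_to_idx).to_numpy().reshape(-1, 1).astype(np.float32)
# Treatment indicator: 1.0 for treatment rows, 0.0 for control rows.
Z = (visitors["variant"] == "treatment").to_numpy().astype(np.float32)
# Outcome: revenue per visitor.
Y_rev = visitors["revenue"].to_numpy().astype(np.float32)
# Constant propensity of 0.5 for every visitor.
propensity = np.full(len(visitors), 0.5, dtype=np.float32)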

result = fit_hurdle_bcf(X, Z, Y_rev, propensity, GPUBCFConfig(
    num_burnin=40, num_mcmc=80, num_trees_mu=30, num_trees_tau=15,
    max_depth=4, num_gfr_sweeps=2, random_seed=0,
))
# result.rpv_cate_samples → (n_visitors, 80) per-visitor CATE posterior draws
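
To make the shape of those draws concrete, here is a minimal sketch of rolling per-visitor CATE draws up to segment-level summaries with plain NumPy. It does not call pytyche: `cate_samples` is a simulated stand-in for `result.rpv_cate_samples`, and `segment_cate_summary` is a hypothetical helper written for this illustration, not part of the library.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated stand-in for result.rpv_cate_samples, shape (n_visitors, n_draws).
# The effect sizes below are made up for illustration only.
n_visitors, n_draws = 800, 80
segment_idx = rng.integers(0, 2, size=n_visitors)  # 0 = responders, 1 = non_responders
true_effect = np.where(segment_idx == 0, 0.5, 0.0)
cate_samples = true_effect[:, None] + rng.normal(0.0, 0.2, size=(n_visitors, n_draws))


def segment_cate_summary(samples, segments, level=0.90):
    """Posterior mean and equal-tailed interval of the average CATE per segment."""
    alpha = (1.0 - level) / 2.0
    summary = {}
    for seg in np.unique(segments):
        # Average over the segment's visitors within each draw, which yields
        # one draw of the segment-average effect per posterior sample.
        seg_draws = samples[segments == seg].mean(axis=0)
        lo, hi = np.quantile(seg_draws, [alpha, 1.0 - alpha])
        summary[int(seg)] = {
            "mean": float(seg_draws.mean()),
            "lo": float(lo),
            "hi": float(hi),
        }
    return summary


summary = segment_cate_summary(cate_samples, segment_idx)
for seg, stats in summary.items():
    print(seg, stats)
```

Averaging within each draw before taking quantiles keeps the interval a statement about the segment-average effect; taking quantiles per visitor and then averaging would answer a different question.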

The full walk-through — including data unpacking, posterior interpretation at the segment level, and ground-truth comparison — is in the first hurdle BCF fit tutorial.

For the why — what the joint hurdle BCF is doing, why segments are discovered rather than declared, what calibration buys you — start with the overview.

What’s next

  • Scale up: the example above is sized for JAX-CPU. For realistic experimentation scale (50k–2M visitors per round), install the [gpu] extra and increase num_burnin / num_mcmc / num_trees_mu / num_trees_tau. The library is designed around GPU execution at scale; the per-round fit reaches a 4.5–8.6× speedup at n=750k versus the CPU StochTree backend.

  • Adaptive-enrichment loop: the single-fit example above is a precursor to the multi-round sequential targeting loop. See sequential targeting for the full round-over-round design.

  • Calibration: BCF posteriors at production scale are systematically overconfident. The BCF calibration at scale concept doc explains why and what to do about it.
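
As a rough illustration of what "overconfident" means in the calibration bullet above, the sketch below measures how often nominal 90% credible intervals contain a known true effect, which is only possible on synthetic data where ground truth is available. Everything here is simulated, and `interval_coverage` is a hypothetical helper; it is a generic coverage check, not the library's calibration procedure.

```python
import numpy as np


def interval_coverage(samples, truth, level=0.90):
    """Fraction of units whose equal-tailed credible interval contains the truth."""
    alpha = (1.0 - level) / 2.0
    lo = np.quantile(samples, alpha, axis=1)
    hi = np.quantile(samples, 1.0 - alpha, axis=1)
    return float(np.mean((truth >= lo) & (truth <= hi)))


rng = np.random.default_rng(0)
n_units, n_draws = 2000, 200

# Simulated truth and a point estimate whose error has standard deviation 1.0.
truth = rng.normal(0.0, 1.0, size=n_units)
estimate = truth + rng.normal(0.0, 1.0, size=n_units)

# Posterior spread that matches the actual error (calibrated) versus one
# that is half as wide as it should be (overconfident).
calibrated = estimate[:, None] + rng.normal(0.0, 1.0, size=(n_units, n_draws))
overconfident = estimate[:, None] + rng.normal(0.0, 0.5, size=(n_units, n_draws))

cov_calibrated = interval_coverage(calibrated, truth)
cov_overconfident = interval_coverage(overconfident, truth)
print(f"calibrated:    {cov_calibrated:.2f}")
print(f"overconfident: {cov_overconfident:.2f}")
```

Coverage well below the nominal level is the signature of an overconfident posterior: the intervals are narrower than the actual estimation error warrants.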