pytyche.bcf.config

Configuration dataclass, result types, and small utilities for the GPU BCF.

This module holds the user-facing configuration object (GPUBCFConfig), the three result dataclasses returned by the fit_* entry points, the formula-driven compute_num_trees_tau helper, and the leaf-index dtype selector used to size heap-layout tree arrays. It contains only types and utilities: no JIT-compiled code, no GPU device handles, and no module-level state. Importing this module is cheap and triggers no GPU work.

Import graph

gpu_bcf_config depends on JAX (for the leaf-index dtype helper), numpy (for result-array typing), and scipy.stats (for the inverse-normal quantile in compute_num_trees_tau). It does not import from any sibling gpu_bcf_* module. The orchestrator and downstream modules import from this module, never the other way around.

Contents

_leaf_index_dtype: smallest unsigned int dtype for heap node indices at a given tree depth.
GPUBCFConfig: frozen dataclass of MCMC and prior hyperparameters for the GPU BCF.
compute_num_trees_tau: formula for the minimum tau-forest tree count at a target CI coverage.
ContinuousBCFResult: result container for fit_continuous_bcf.
BinaryBCFResult: result container for fit_binary_bcf.
HurdleBCFResult: result container for fit_hurdle_bcf.

Functions

compute_num_trees_tau(n[, d_tau, sigma_tau, ...])

Formula-driven tau tree count for target CI coverage.

Classes

BinaryBCFResult(mu_samples, tau_samples, ...)

Result from binary (probit) BCF.

ContinuousBCFResult(mu_samples, tau_samples, ...)

Result from continuous BCF.

GPUBCFConfig([num_burnin, num_mcmc, ...])

Sampling configuration for GPU BCF via bartz.

HurdleBCFResult(rpv_cate_samples, p0_mean, ...)

Result from joint shared-tree hurdle BCF.

class pytyche.bcf.config.GPUBCFConfig(num_burnin=200, num_mcmc=200, num_trees_mu=200, num_trees_tau=50, max_depth=6, alpha_mu=0.95, beta_mu=2.0, alpha_tau=0.75, beta_tau=3.0, num_cuts=100, random_seed=42, num_chains=1, diagnostic_interval=50, thin_factor=1, num_gfr_sweeps=5, min_samples_leaf=5, gfr_backend='gpu', trace_path=None, var_tau_sev=0.5, kappa_sev=1.0, tau0_a_prior=1.0, tau0_b_prior=1.0, freeze_gamma=False, retain_channel_samples=True, focal_severity=False, per_leaf_gamma=False, retain_topology_history=False)

Bases: object

Sampling configuration for GPU BCF via bartz.

num_burnin: Number of MCMC burn-in iterations (discarded).

num_mcmc: Number of MCMC samples to retain for posterior inference.

num_trees_mu: Number of trees in the prognostic (mu) forest.

num_trees_tau: Number of trees in the treatment effect (tau) forest.

max_depth: Maximum tree depth (controls p_nonterminal array length).

alpha_mu / beta_mu: Tree prior hyperparameters for mu forest.

alpha_tau / beta_tau: Tree prior hyperparameters for tau forest (tighter = more regularized).

num_cuts: Number of quantile-based split cutpoints per covariate.

random_seed: Seed for JAX PRNG.

num_chains: Number of parallel MCMC chains (vmapped). 1 = single-chain (legacy).

diagnostic_interval: Iterations per chunk for between-chunk diagnostics. Must divide both num_burnin and num_mcmc evenly.

thin_factor: Keep every thin_factor-th sample during MCMC (1 = keep all).

retain_topology_history: If True, retain per-iteration, per-tree topology hashes and move metadata on HurdleBCFResult.topology_history for mobility diagnostics. Default False (no retention; byte-identical to the behaviour before this feature was added).

Parameters:

num_burnin (int)
num_mcmc (int)
num_trees_mu (int)
num_trees_tau (int)
max_depth (int)
alpha_mu (float)
beta_mu (float)
alpha_tau (float)
beta_tau (float)
num_cuts (int)
random_seed (int)
num_chains (int)
diagnostic_interval (int)
thin_factor (int)
num_gfr_sweeps (int)
min_samples_leaf (int)
gfr_backend (str)
trace_path (str | None)
var_tau_sev (float)
kappa_sev (float)
tau0_a_prior (float)
tau0_b_prior (float)
freeze_gamma (bool)
retain_channel_samples (bool)
focal_severity (bool)
per_leaf_gamma (bool)
retain_topology_history (bool)
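
The chunking and thinning rules documented for diagnostic_interval and thin_factor can be checked before launching a fit. The following is a minimal sketch, not the library's own validation: MiniConfig, check_chunking, and retained_per_chain are hypothetical names that mirror only four of the documented fields, and the retained-draw count assumes thin_factor divides num_mcmc.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MiniConfig:
    """Hypothetical stand-in mirroring four documented GPUBCFConfig fields."""

    num_burnin: int = 200
    num_mcmc: int = 200
    diagnostic_interval: int = 50
    thin_factor: int = 1


def check_chunking(cfg: MiniConfig) -> None:
    """Raise if diagnostic_interval does not divide both phase lengths."""
    for name in ("num_burnin", "num_mcmc"):
        value = getattr(cfg, name)
        if value % cfg.diagnostic_interval != 0:
            raise ValueError(
                f"diagnostic_interval={cfg.diagnostic_interval} "
                f"does not divide {name}={value}"
            )


def retained_per_chain(cfg: MiniConfig) -> int:
    """Draws kept per chain when every thin_factor-th sample is retained."""
    return cfg.num_mcmc // cfg.thin_factor


# The documented defaults (200 / 200 / 50) chunk evenly, so this passes.
check_chunking(MiniConfig())
```

Whether the real configuration object raises on a bad diagnostic_interval, or fails later inside the sampler, is not stated in this reference.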

pytyche.bcf.config.compute_num_trees_tau(n, d_tau=3.0, sigma_tau=0.5, coverage=0.9, floor=50, ceiling=400)

Formula-driven tau tree count for target CI coverage.

T_min = ceil((d_tau * sigma_tau * sqrt(n) / (2 * z))^(2/3))

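The tau forest's piecewise-constant approximation carries a bias that does not shrink with n (O(1)), whereas the posterior concentrates at rate O(1/sqrt(n)), so at large n the bias dominates the posterior width. This formula gives the minimum tree count T that keeps the bias below the CI half-width at the target coverage level; z is the inverse-normal quantile for that coverage.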

Parameters:

n (int)
d_tau (float)
sigma_tau (float)
coverage (float)
floor (int)
ceiling (int)

Return type:

int
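
The T_min formula can be evaluated by hand with the standard library. This is a sketch under stated assumptions rather than the function's actual body: it assumes z is the two-sided standard-normal quantile for the coverage level, assumes the result is clamped to [floor, ceiling], and uses statistics.NormalDist in place of scipy.stats. The name num_trees_tau_sketch is hypothetical.

```python
import math
from statistics import NormalDist


def num_trees_tau_sketch(n, d_tau=3.0, sigma_tau=0.5, coverage=0.9,
                         floor=50, ceiling=400):
    """Evaluate T_min from the documented formula, then clamp it.

    Assumptions (not confirmed by the reference): z is the two-sided
    standard-normal quantile for `coverage`, and the result is clamped
    to the closed range [floor, ceiling].
    """
    z = NormalDist().inv_cdf(0.5 + coverage / 2.0)
    ratio = d_tau * sigma_tau * math.sqrt(n) / (2.0 * z)
    t_min = math.ceil(ratio ** (2.0 / 3.0))
    return max(floor, min(ceiling, t_min))
```

Under these assumptions and the default arguments, the floor of 50 applies at n = 10,000 and the ceiling of 400 applies at n = 10^9; values in between grow with the cube root of n.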

class pytyche.bcf.config.ContinuousBCFResult(mu_samples, tau_samples, sigma2_samples, y_bar, y_std, wall_clock_seconds, *, observed=None, is_calibrated=False, calibration=None)

Bases: object

Result from continuous BCF.

mu_samples: (n, num_mcmc) prognostic predictions (standardized).

tau_samples: (n, num_mcmc) treatment effects (standardized).

sigma2_samples: (num_mcmc,) error variance (standardized).

y_bar: Mean of the outcome used for standardization.

y_std: Standard deviation of the outcome used for standardization.

wall_clock_seconds: Wall-clock time for the fit in seconds.

observed: The ObservedExperimentData the fit consumed, attached to the result so the analysis methods can reach the visitor rows and variant metadata. None when constructed by private raw-array helpers; populated by the public fit wrappers.

is_calibrated: True only after apply_calibration has been called on this result. Defaults to False.

calibration: The Calibration artifact attached by apply_calibration; None on fresh fits. The v0.2 artifact scope is interval corrections only: it is consumed where interval summaries are built, never to transform sample arrays.

Parameters:

mu_samples (ndarray)
tau_samples (ndarray)
sigma2_samples (ndarray)
y_bar (float)
y_std (float)
wall_clock_seconds (float)
observed (ObservedExperimentData | None)
is_calibrated (bool)
calibration (Calibration | None)

thompson_allocation(segments, epsilon=0.02)

Per-segment traffic split: each arm’s weight is the posterior probability that it is the segment’s best arm.

Thompson sampling at segment granularity: per segment, each posterior draw votes for its best arm (the largest member-mean contrast, or control when none is positive); an arm’s weight is its win frequency over draws.

Parameters:

segments (Sequence[DiscoveredSegment]) – Segments to allocate over (only id and rule are consumed); membership is resolved against self.observed.
epsilon (float) – Safety-net exploration floor — arms below epsilon / K are raised to the floor and the rest rescaled, so no arm’s traffic is starved to zero; inert when every arm is already above it. NOT the dial for how much traffic stays on control — that is min_control_weight / min_explore_weight on pt.sequential_experiment; rarely worth overriding.

Return type:

dict[int, dict[str, float]]

Returns:

{segment.id: {variant_name: weight}} — inner dicts in variant order (control first), each summing to 1.

Raises:

ValueError – When self.observed is None.
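
The voting rule and the exploration floor described for thompson_allocation can be sketched in plain Python. This is an illustration under assumptions, not the library's code: win_frequencies and apply_floor are hypothetical names, the input is a list of per-draw member-mean contrasts (one value per treatment arm, each versus control), and the floor is applied in a single pass.

```python
def win_frequencies(contrast_draws):
    """Win frequency per arm, control first.

    Each draw votes for the treatment arm with the largest contrast,
    or for control when no contrast in that draw is positive.
    """
    k = len(contrast_draws[0]) + 1  # treatment arms plus control
    wins = [0] * k
    for draw in contrast_draws:
        best = max(range(len(draw)), key=lambda j: draw[j])
        if draw[best] > 0:
            wins[best + 1] += 1
        else:
            wins[0] += 1
    return [w / len(contrast_draws) for w in wins]


def apply_floor(weights, epsilon=0.02):
    """Raise arms below epsilon / K to that floor and rescale the rest.

    Single-pass sketch: assumes rescaling does not push another arm
    below the floor, which holds for small epsilon.
    """
    floor = epsilon / len(weights)
    low = [w < floor for w in weights]
    if not any(low):
        return list(weights)  # inert when every arm is above the floor
    rest = sum(w for w, is_low in zip(weights, low) if not is_low)
    scale = (1.0 - floor * sum(low)) / rest
    return [floor if is_low else w * scale
            for w, is_low in zip(weights, low)]


draws = [[0.1, 0.3], [-0.2, -0.1], [0.4, 0.2], [0.05, 0.5]]
weights = apply_floor(win_frequencies(draws))
```

In this toy input the second draw has no positive contrast, so it votes for control; the other three draws split between the two treatment arms.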

fit_policy_tree(*, max_depth=3, min_segment_share=0.1, n_bootstrap=50, bootstrap_seed=0)

Discover interpretable segments from the posterior’s per-visitor treatment effects, by fitting a shallow decision tree.

Each visitor is labeled with the arm the posterior expects to be best for them (largest posterior-mean lift, or control when no lift is positive); a multiclass decision tree is fit on the visitors’ features, and each leaf becomes a DiscoveredSegment carrying an exact membership rule, gate estimate/CI, per-arm best probabilities, Thompson allocation, and bootstrap-replicability stability.

Parameters:

max_depth (int) – Maximum tree depth.
min_segment_share (float) – Minimum fraction of visitors per leaf (sklearn min_weight_fraction_leaf).
n_bootstrap (int) – Bootstrap tree refits behind stability_score; 0 skips stability (NaN sentinel plus UserWarning).
bootstrap_seed (int) – Seed for the bootstrap resampling RNG.

Return type:

PolicyTreeResult

Returns:

PolicyTreeResult with one segment per leaf, ordered by sklearn leaf id; result.observed is self.observed by identity.

Raises:

ValueError – When self.observed is None.

apply_calibration(calibration)

Return a new posterior with calibration attached.

Attach, don’t transform: the artifact is stashed on the returned copy (is_calibrated=True); every sample array is shared with this posterior by identity. The correction currently applies to intervals only — probabilities and expected losses stay raw; corrected CIs appear where interval summaries are built. K = 2 experiments only (per-contrast recalibration for K >= 3 is not yet implemented).

Parameters:

calibration (Calibration) – SBC-fitted Calibration whose regime (metric, n_treatments) must match self.observed.

Return type:

ContinuousBCFResult

Returns:

New ContinuousBCFResult carrying the artifact; the original is untouched.

Raises:

ValueError – When self.observed is None, or on a regime mismatch (message names the mismatched dimensions).
NotImplementedError – At K >= 3.
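
The "attach, don't transform" contract can be illustrated with dataclasses.replace, which builds a new instance while passing existing field values through by reference. This is a sketch only: MiniResult and attach_calibration are hypothetical stand-ins, the placeholder dict is not a real Calibration, and the regime and K checks the method performs are omitted.

```python
from dataclasses import dataclass, replace
from typing import Any, Optional


@dataclass(frozen=True)
class MiniResult:
    """Hypothetical stand-in for a BCF result; not the pytyche class."""

    tau_samples: list
    is_calibrated: bool = False
    calibration: Optional[Any] = None


def attach_calibration(result: MiniResult, calibration: Any) -> MiniResult:
    """Return a copy carrying the artifact; sample arrays are not copied."""
    return replace(result, calibration=calibration, is_calibrated=True)


raw = MiniResult(tau_samples=[0.1, 0.2, 0.3])
calibrated = attach_calibration(raw, calibration={"kind": "placeholder"})

# The copy shares the sample container with the original by identity,
# and the original keeps its uncalibrated state.
shares_samples = calibrated.tau_samples is raw.tau_samples
```

Because nothing is transformed, code holding the original result keeps seeing raw draws, and the calibrated copy costs no extra sample memory.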

recommendation_summary(treatment, segment=None, *, thresholds=None, min_practical_effect=0.02)

Act-now SHIP / CONTINUE / STOP recommendation for one treatment.

The treatment’s metric-native contrast draws are scoped (segment=None is the global all-visitors snapshot; a segment restricts to its rule’s members), reduced to per-draw mean lift, and summarized under the legacy compare.variants decision rule. v0.2 raw scope: probabilities and expected losses come from the raw draws even on a calibrated posterior — interval corrections land where intervals are built.

Parameters:

treatment (str) – Treatment variant name (vs control).
segment (DiscoveredSegment | None) – None for the global snapshot; a DiscoveredSegment restricts the computation to its members.
thresholds (DecisionThresholds | None) – Decision thresholds; DecisionThresholds() defaults when None.
min_practical_effect (float) – Minimum meaningful lift for probability_better / probability_harmful.

Return type:

RecommendationSummary

Returns:

RecommendationSummary with the decision, its evidence, and expected_value_of_one_more_round always populated (closed-form preposterior EVSI; formula in docs/concepts/decision-theoretic-inputs.md).

Raises:

ValueError – When self.observed is None, when treatment is not a treatment name, or when the segment’s rule matches zero visitors.

analyze(*, max_depth=3, min_segment_share=0.1, n_bootstrap=50, bootstrap_seed=0)

The canonical one-call analysis summary for this posterior.

Composes per-treatment Comparison summaries, the embedded policy-tree segmentation (keyword arguments forward to it), the global RecommendationSummary for the best challenger, and the posterior-mean per-visitor CATEs. Anything needing posterior samples goes through analysis.posterior.

Parameters:

max_depth (int) – Embedded policy tree depth.
min_segment_share (float) – Minimum per-leaf population share.
n_bootstrap (int) – Stability bootstrap count (0 skips stability with a UserWarning).
bootstrap_seed (int) – Stability bootstrap seed.

Return type:

AnalysisResult

Returns:

AnalysisResult; analysis.is_calibrated reads through to this posterior’s flag.

Raises:

ValueError – When self.observed is None.

evaluate_against_truth(tree, truth)

Sim-mode evaluation of tree’s policy against ground truth.

Parameters:

tree (PolicyTreeResult) – The fitted policy whose assignments are evaluated.
truth (CalibrationTruth | None) – Ground truth from the simulation path; None in real-data mode (raises — nothing to evaluate against).

Return type:

TruthComparison

Returns:

TruthComparison (cate_rmse, policy_accuracy, and the realized-RPV trio with the oracle gap).

Raises:

RuntimeError – When truth is None (real-data mode).
ValueError – When self.observed is None or the truth lacks the K-appropriate contrast / potential-outcome fields.

has_credible_segments(threshold=0.8)

Whether some discovered segment clears threshold stability.

Runs fit_policy_tree at its defaults (deterministic given the default bootstrap_seed) and checks for a segment with stability_score >= threshold. The 0.80 default matches the default graduation rule’s SHIP-gate stability threshold.

Parameters:

threshold (float) – Minimum bootstrap-replicability stability score.

Return type:

bool

Returns:

True iff at least one discovered segment clears it.
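
The threshold check itself reduces to a single any() over the segments. A minimal sketch, assuming a hypothetical MiniSegment stand-in that exposes only the stability_score attribute named above; the policy-tree fit that produces the segments is not reproduced here.

```python
from dataclasses import dataclass


@dataclass
class MiniSegment:
    """Hypothetical stand-in exposing only the stability score."""

    id: int
    stability_score: float


def any_credible(segments, threshold=0.8):
    """True iff at least one segment has stability_score >= threshold."""
    return any(seg.stability_score >= threshold for seg in segments)


segments = [MiniSegment(0, 0.55), MiniSegment(1, 0.80)]
credible = any_credible(segments)
```

A NaN stability score, the documented sentinel when n_bootstrap=0, compares False against any threshold in Python, so in this sketch such segments never count as credible.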

has_decomposition()

Whether this posterior carries the conversion/severity split.

Return type:

bool

Returns:

False; only hurdle posteriors carry the conversion/severity decomposition.

class pytyche.bcf.config.BinaryBCFResult(mu_samples, tau_samples, wall_clock_seconds, *, observed=None, is_calibrated=False, calibration=None)

Bases: object

Result from binary (probit) BCF.

mu_samples: (n, num_mcmc) prognostic predictions (probit scale).

tau_samples: (n, num_mcmc) treatment effects (probit scale).

wall_clock_seconds: Wall-clock time for the fit in seconds.

observed: The ObservedExperimentData the fit consumed, attached to the result so the analysis methods can reach the visitor rows and variant metadata. None when constructed by private raw-array helpers; populated by the public fit wrappers.

is_calibrated: True only after apply_calibration has been called on this result. Defaults to False.

calibration: The Calibration artifact attached by apply_calibration; None on fresh fits. The v0.2 artifact scope is interval corrections only: it is consumed where interval summaries are built, never to transform sample arrays.

Parameters:

mu_samples (ndarray)
tau_samples (ndarray)
wall_clock_seconds (float)
observed (ObservedExperimentData | None)
is_calibrated (bool)
calibration (Calibration | None)

thompson_allocation(segments, epsilon=0.02)

Per-segment traffic split: each arm’s weight is the posterior probability that it is the segment’s best arm.

Parameters:

segments (Sequence[DiscoveredSegment]) – Segments to allocate over (only id and rule are consumed); membership is resolved against self.observed.
epsilon (float) – Safety-net exploration floor — arms below epsilon / K are raised to the floor and the rest rescaled, so no arm’s traffic is starved to zero; inert when every arm is already above it. NOT the dial for how much traffic stays on control — that is min_control_weight / min_explore_weight on pt.sequential_experiment; rarely worth overriding.

Return type:

dict[int, dict[str, float]]

Returns:

{segment.id: {variant_name: weight}} — inner dicts in variant order (control first), each summing to 1.

Raises:

ValueError – When self.observed is None.

fit_policy_tree(*, max_depth=3, min_segment_share=0.1, n_bootstrap=50, bootstrap_seed=0)

Discover interpretable segments from the posterior’s per-visitor treatment effects, by fitting a shallow decision tree.

Parameters:

max_depth (int) – Maximum tree depth.
min_segment_share (float) – Minimum fraction of visitors per leaf (sklearn min_weight_fraction_leaf).
n_bootstrap (int) – Bootstrap tree refits behind stability_score; 0 skips stability (NaN sentinel plus UserWarning).
bootstrap_seed (int) – Seed for the bootstrap resampling RNG.

Return type:

PolicyTreeResult

Returns:

PolicyTreeResult with one segment per leaf, ordered by sklearn leaf id; result.observed is self.observed by identity.

Raises:

ValueError – When self.observed is None.

apply_calibration(calibration)

Return a new posterior with calibration attached.

Parameters:

calibration (Calibration) – SBC-fitted Calibration whose regime (metric, n_treatments) must match self.observed.

Return type:

BinaryBCFResult

Returns:

New BinaryBCFResult carrying the artifact; the original is untouched.

Raises:

ValueError – When self.observed is None, or on a regime mismatch (message names the mismatched dimensions).
NotImplementedError – At K >= 3.

recommendation_summary(treatment, segment=None, *, thresholds=None, min_practical_effect=0.02)

Act-now SHIP / CONTINUE / STOP recommendation for one treatment.

Parameters:

treatment (str) – Treatment variant name (vs control).
segment (DiscoveredSegment | None) – None for the global snapshot; a DiscoveredSegment restricts the computation to its members.
thresholds (DecisionThresholds | None) – Decision thresholds; DecisionThresholds() defaults when None.
min_practical_effect (float) – Minimum meaningful lift for probability_better / probability_harmful.

Return type:

RecommendationSummary

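Returns:

RecommendationSummary with the decision, its evidence, and expected_value_of_one_more_round always populated (closed-form preposterior EVSI; formula in docs/concepts/decision-theoretic-inputs.md).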

Raises:

ValueError – When self.observed is None, when treatment is not a treatment name, or when the segment’s rule matches zero visitors.

analyze(*, max_depth=3, min_segment_share=0.1, n_bootstrap=50, bootstrap_seed=0)

The canonical one-call analysis summary for this posterior.

Parameters:

max_depth (int) – Embedded policy tree depth.
min_segment_share (float) – Minimum per-leaf population share.
n_bootstrap (int) – Stability bootstrap count (0 skips stability with a UserWarning).
bootstrap_seed (int) – Stability bootstrap seed.

Return type:

AnalysisResult

Returns:

AnalysisResult; analysis.is_calibrated reads through to this posterior’s flag.

Raises:

ValueError – When self.observed is None.

evaluate_against_truth(tree, truth)

Sim-mode evaluation of tree’s policy against ground truth.

Parameters:

tree (PolicyTreeResult) – The fitted policy whose assignments are evaluated.
truth (CalibrationTruth | None) – Ground truth from the simulation path; None in real-data mode (raises — nothing to evaluate against).

Return type:

TruthComparison

Returns:

TruthComparison (cate_rmse, policy_accuracy, and the realized-RPV trio with the oracle gap).

Raises:

RuntimeError – When truth is None (real-data mode).
ValueError – When self.observed is None or the truth lacks the K-appropriate contrast / potential-outcome fields.

has_credible_segments(threshold=0.8)

Whether some discovered segment clears threshold stability.

Parameters:

threshold (float) – Minimum bootstrap-replicability stability score.

Return type:

bool

Returns:

True iff at least one discovered segment clears it.

has_decomposition()

Whether this posterior carries the conversion/severity split.

Return type:

bool

Returns:

False; only hurdle posteriors carry the conversion/severity decomposition.

class pytyche.bcf.config.HurdleBCFResult(rpv_cate_samples, p0_mean, p1_mean, sev0_mean, sev1_mean, tau0_samples, tau_hat_quantiles, wall_clock_seconds, num_chains=1, num_gfr_sweeps=0, diagnostics=None, phase_timing=None, p0_samples=None, p1_samples=None, sev0_samples=None, sev1_samples=None, p_samples=None, sev_samples=None, topology_history=None, *, observed=None, is_calibrated=False, calibration=None, pooling)

Bases: object

Result from joint shared-tree hurdle BCF.

Each tree simultaneously estimates conversion (probit) and severity (log-revenue) parameters via shared tree structure. This couples the two channels so splits are jointly informative.

RPV CATEs are composed on-GPU (float32) and transferred to CPU for policy tree fitting. Channel-level per-draw arrays (p0, p1, sev0, sev1) are retained by default (retain_channel_samples=True) — the conversion/severity decomposition is the headline output of the hurdle approach and needs the per-draw channel arrays for its credible intervals. Set retain_channel_samples=False to skip the GPU→CPU transfer when memory matters more than the decomposition (e.g. large-n sweep contexts that only consume the composed RPV contrasts).

When num_chains > 1, samples are concatenated across chains: S_total = (num_mcmc / thin_factor) * num_chains.

Arm-count dispatch (K = int(Z.max()) + 1).

At K = 2 (binary arm) the legacy paired fields are populated: p0_mean/p1_mean/sev0_mean/sev1_mean (n,) and, when retain_channel_samples=True, p0_samples/p1_samples/sev0_samples/sev1_samples (n, S_total). rpv_cate_samples is (n, S_total), and the per-arm fields p_samples/sev_samples are None.

At K >= 3 (multi-arm) the per-arm fields are populated instead: p_samples/sev_samples (n, S_total, K) (when retained) and rpv_cate_samples (n, S_total, K - 1) (the jointly sampled contrast posterior). The legacy paired fields are None.

The two field families are never populated together. tau0_samples (S_total,) and the sigma2_samples = 1 / tau0_samples property are scalar at every K (each visitor sees one outcome, so the severity residual is scalar per visitor; there is no per-arm severity precision).

The topology_history field is populated only when the producing fit set GPUBCFConfig.retain_topology_history=True. When the flag is off (the default), the field is None and the fit's wall-clock cost and PRNG state are identical to the behaviour before this feature was added.
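
The sample-count formula and the arm-count dispatch can be combined into a small shape calculator. A sketch for orientation only: hurdle_shapes is a hypothetical helper, it assumes thin_factor divides num_mcmc and that channel samples are retained, and it lists one representative field from each family rather than all of them.

```python
def hurdle_shapes(n, num_mcmc, thin_factor, num_chains, k):
    """Expected array shapes under the documented arm-count dispatch.

    Assumes thin_factor divides num_mcmc and retain_channel_samples=True.
    """
    s_total = (num_mcmc // thin_factor) * num_chains
    if k == 2:
        # Binary arm: legacy paired fields populated, per-arm fields None.
        return {
            "rpv_cate_samples": (n, s_total),
            "p0_samples": (n, s_total),
            "p_samples": None,
            "tau0_samples": (s_total,),
        }
    # Multi-arm: per-arm fields populated, legacy paired fields None.
    return {
        "rpv_cate_samples": (n, s_total, k - 1),
        "p0_samples": None,
        "p_samples": (n, s_total, k),
        "tau0_samples": (s_total,),
    }


two_arm = hurdle_shapes(n=1000, num_mcmc=200, thin_factor=2, num_chains=4, k=2)
three_arm = hurdle_shapes(n=1000, num_mcmc=200, thin_factor=2, num_chains=4, k=3)
```

With 200 draws thinned by 2 across 4 chains, S_total is 400 in both cases; only the trailing arm axis differs between the two dispatch branches.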

rpv_cate_samples: float32, composed on GPU and transferred to CPU; (n, S_total) at K = 2, (n, S_total, K - 1) at K >= 3.

p0_mean: (n,) float32, E[Φ(μ_b + b₀·τ_b)]; None at K >= 3.

p1_mean: (n,) float32, E[Φ(μ_b + b₁·τ_b)]; None at K >= 3.

sev0_mean: (n,) float32, E[exp(μ_c + b₀·τ_c + σ²/2)]; None at K >= 3.

sev1_mean: (n,) float32, E[exp(μ_c + b₁·τ_c + σ²/2)]; None at K >= 3.

tau0_samples: (S_total,) float32, global precision.

tau_hat_quantiles: (S_total, 5) [q05, q25, q50, q75, q95], or None.

wall_clock_seconds: Wall-clock time for the fit in seconds.

num_chains: Number of parallel MCMC chains used.

num_gfr_sweeps: Number of GFR warm-start sweeps performed.

diagnostics: Dict of diagnostic values (rhat_tau0, per_chain_ess, etc.), or None.

phase_timing: Dict of per-phase wall-clock breakdown, or None.

p0_samples: jax.Array (n, S_total), P(convert|control) per draw; None if not retained or at K >= 3.

p1_samples: jax.Array (n, S_total), P(convert|treated) per draw; None if not retained or at K >= 3.

sev0_samples: jax.Array (n, S_total), E[sev|control,convert] per draw; None if not retained or at K >= 3.

sev1_samples: jax.Array (n, S_total), E[sev|treated,convert] per draw; None if not retained or at K >= 3.

p_samples: jax.Array (n, S_total, K), per-arm P(convert) per draw; None at K = 2 or if not retained.

sev_samples: jax.Array (n, S_total, K), per-arm E[sev|convert] per draw; None at K = 2 or if not retained.

topology_history: Topology retention trace; populated only when the producing fit set GPUBCFConfig.retain_topology_history=True. None otherwise.

observed: The ObservedExperimentData the fit consumed, attached to the result so the analysis methods can reach the visitor rows and variant metadata. None when constructed by private raw-array helpers; populated by the public fit wrappers.

is_calibrated: True only after apply_calibration has been called on this result. Defaults to False.

calibration: The Calibration artifact attached by apply_calibration; None on fresh fits. The v0.2 artifact scope is interval corrections only: it is consumed where interval summaries are built, never to transform sample arrays.

pooling: Provenance of the fit: "joint" = shared-tree canonical fit; "independent" = two-stage baseline (binary + continuous fitted separately). Required; the caller must always populate it.

Parameters:

rpv_cate_samples (ndarray)
p0_mean (ndarray | None)
p1_mean (ndarray | None)
sev0_mean (ndarray | None)
sev1_mean (ndarray | None)
tau0_samples (ndarray)
tau_hat_quantiles (ndarray | None)
wall_clock_seconds (float)
num_chains (int)
num_gfr_sweeps (int)
diagnostics (dict | None)
phase_timing (dict | None)
p0_samples (Any | None)
p1_samples (Any | None)
sev0_samples (Any | None)
sev1_samples (Any | None)
p_samples (Any | None)
sev_samples (Any | None)
topology_history (TopologyHistory | None)
observed (ObservedExperimentData | None)
is_calibrated (bool)
calibration (Calibration | None)
pooling (Literal['joint', 'independent'])

property sigma2_samples: ndarray

Return 1 / tau0_samples as a sigma² view.

Backward-compat shim for downstream code that consumes the variance parameterisation rather than the precision one.
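
The precision-to-variance conversion, together with the per-arm conversion and severity expressions listed among the attributes, can be evaluated on scalars for a single draw. This is a sketch with hypothetical function names; the arm coding b stands in for b₀ or b₁, whose actual values are not given in this reference, and the library averages such per-draw quantities over the posterior rather than evaluating one draw.

```python
import math
from statistics import NormalDist


def sigma2_from_precision(tau0_draws):
    """sigma^2 = 1 / tau0, elementwise over the precision draws."""
    return [1.0 / t for t in tau0_draws]


def conversion_prob(mu_b, tau_b, b):
    """Phi(mu_b + b * tau_b): probit conversion probability for coding b."""
    return NormalDist().cdf(mu_b + b * tau_b)


def severity_mean(mu_c, tau_c, b, sigma2):
    """exp(mu_c + b * tau_c + sigma2 / 2): mean of a log-normal severity."""
    return math.exp(mu_c + b * tau_c + sigma2 / 2.0)


sigma2 = sigma2_from_precision([4.0, 2.0])  # precisions 4 and 2
p = conversion_prob(mu_b=0.0, tau_b=0.0, b=1.0)
sev = severity_mean(mu_c=0.0, tau_c=0.0, b=1.0, sigma2=sigma2[0])
```

The sigma2 / 2 term is the usual log-normal mean correction: exponentiating the log-scale location alone would give the median severity, not its mean.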

thompson_allocation(segments, epsilon=0.02)

Per-segment traffic split: each arm’s weight is the posterior probability that it is the segment’s best arm.

Parameters:

segments (Sequence[DiscoveredSegment]) – Segments to allocate over (only id and rule are consumed); membership is resolved against self.observed.
epsilon (float) – Safety-net exploration floor — arms below epsilon / K are raised to the floor and the rest rescaled, so no arm’s traffic is starved to zero; inert when every arm is already above it. NOT the dial for how much traffic stays on control — that is min_control_weight / min_explore_weight on pt.sequential_experiment; rarely worth overriding.

Return type:

dict[int, dict[str, float]]

Returns:

{segment.id: {variant_name: weight}} — inner dicts in variant order (control first), each summing to 1.

Raises:

ValueError – When self.observed is None.

fit_policy_tree(*, max_depth=3, min_segment_share=0.1, n_bootstrap=50, bootstrap_seed=0)

Discover interpretable segments from the posterior’s per-visitor treatment effects, by fitting a shallow decision tree.

Parameters:

max_depth (int) – Maximum tree depth.
min_segment_share (float) – Minimum fraction of visitors per leaf (sklearn min_weight_fraction_leaf).
n_bootstrap (int) – Bootstrap tree refits behind stability_score; 0 skips stability (NaN sentinel plus UserWarning).
bootstrap_seed (int) – Seed for the bootstrap resampling RNG.

Return type:

PolicyTreeResult

Returns:

PolicyTreeResult with one segment per leaf, ordered by sklearn leaf id; result.observed is self.observed by identity.

Raises:

ValueError – When self.observed is None.

apply_calibration(calibration)

Return a new posterior with calibration attached.

Parameters:

calibration (Calibration) – SBC-fitted Calibration whose regime (metric, n_treatments) must match self.observed.

Return type:

HurdleBCFResult

Returns:

New HurdleBCFResult carrying the artifact; the original is untouched.

Raises:

ValueError – When self.observed is None, or on a regime mismatch (message names the mismatched dimensions).
NotImplementedError – At K >= 3.

recommendation_summary(treatment, segment=None, *, thresholds=None, min_practical_effect=0.02)

Act-now SHIP / CONTINUE / STOP recommendation for one treatment.

Parameters:

treatment (str) – Treatment variant name (vs control).
segment (DiscoveredSegment | None) – None for the global snapshot; a DiscoveredSegment restricts the computation to its members.
thresholds (DecisionThresholds | None) – Decision thresholds; DecisionThresholds() defaults when None.
min_practical_effect (float) – Minimum meaningful lift for probability_better / probability_harmful.

Return type:

RecommendationSummary

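Returns:

RecommendationSummary with the decision, its evidence, and expected_value_of_one_more_round always populated (closed-form preposterior EVSI; formula in docs/concepts/decision-theoretic-inputs.md).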

Raises:

ValueError – When self.observed is None, when treatment is not a treatment name, or when the segment’s rule matches zero visitors.

analyze(*, max_depth=3, min_segment_share=0.1, n_bootstrap=50, bootstrap_seed=0)

The canonical one-call analysis summary for this posterior.

Parameters:

max_depth (int) – Embedded policy tree depth.
min_segment_share (float) – Minimum per-leaf population share.
n_bootstrap (int) – Stability bootstrap count (0 skips stability with a UserWarning).
bootstrap_seed (int) – Stability bootstrap seed.

Return type:

AnalysisResult

Returns:

AnalysisResult; analysis.is_calibrated reads through to this posterior’s flag.

Raises:

ValueError – When self.observed is None.

evaluate_against_truth(tree, truth)

Sim-mode evaluation of tree’s policy against ground truth.

Parameters:

tree (PolicyTreeResult) – The fitted policy whose assignments are evaluated.
truth (CalibrationTruth | None) – Ground truth from the simulation path; None in real-data mode (raises — nothing to evaluate against).

Return type:

TruthComparison

Returns:

TruthComparison (cate_rmse, policy_accuracy, and the realized-RPV trio with the oracle gap).

Raises:

RuntimeError – When truth is None (real-data mode).
ValueError – When self.observed is None or the truth lacks the K-appropriate contrast / potential-outcome fields.

has_credible_segments(threshold=0.8)

Whether some discovered segment clears threshold stability.

Parameters:

threshold (float) – Minimum bootstrap-replicability stability score.

Return type:

bool

Returns:

True iff at least one discovered segment clears it.

has_decomposition()

Whether this posterior carries the conversion/severity split.

Return type:

bool

Returns:

True; the hurdle posterior decomposes into the conversion and severity channels.