pytyche.bcf.config
Configuration dataclass, result types, and small utilities for the GPU BCF.
This module holds the user-facing configuration object (GPUBCFConfig), the
three result dataclasses returned by the fit_* entry points, the
formula-driven compute_num_trees_tau helper, and the leaf-index dtype
selector used to size heap-layout tree arrays. Pure types and utilities — no
JIT-compiled code, no GPU device handles, no module-level state. Importing
this module is cheap and triggers no GPU work.
Import graph
gpu_bcf_config depends on JAX (for the leaf-index dtype helper),
numpy (for result-array typing), and scipy.stats (for the inverse-normal
quantile in compute_num_trees_tau). It does not import from any sibling
gpu_bcf_* module. The orchestrator and downstream modules import FROM
here, never the other way around.
Contents
_leaf_index_dtype — smallest unsigned int dtype for heap node indices at a given tree depth.
GPUBCFConfig — frozen dataclass of MCMC and prior hyperparameters for the GPU BCF.
compute_num_trees_tau — formula for the minimum tau-forest tree count at a target CI coverage.
ContinuousBCFResult — result container for fit_continuous_bcf.
BinaryBCFResult — result container for fit_binary_bcf.
HurdleBCFResult — result container for fit_hurdle_bcf.
Functions
compute_num_trees_tau | Formula-driven tau tree count for target CI coverage. |
Classes
BinaryBCFResult | Result from binary (probit) BCF. |
ContinuousBCFResult | Result from continuous BCF. |
GPUBCFConfig | Sampling configuration for GPU BCF via bartz. |
HurdleBCFResult | Result from joint shared-tree hurdle BCF. |
- class pytyche.bcf.config.GPUBCFConfig(num_burnin=200, num_mcmc=200, num_trees_mu=200, num_trees_tau=50, max_depth=6, alpha_mu=0.95, beta_mu=2.0, alpha_tau=0.75, beta_tau=3.0, num_cuts=100, random_seed=42, num_chains=1, diagnostic_interval=50, thin_factor=1, num_gfr_sweeps=5, min_samples_leaf=5, gfr_backend='gpu', trace_path=None, var_tau_sev=0.5, kappa_sev=1.0, tau0_a_prior=1.0, tau0_b_prior=1.0, freeze_gamma=False, retain_channel_samples=True, focal_severity=False, per_leaf_gamma=False, retain_topology_history=False)
Bases: object
Sampling configuration for GPU BCF via bartz.
- num_burnin
Number of MCMC burn-in iterations (discarded).
- num_mcmc
Number of MCMC samples to retain for posterior inference.
- num_trees_mu
Number of trees in the prognostic (mu) forest.
- num_trees_tau
Number of trees in the treatment effect (tau) forest.
- max_depth
Maximum tree depth (controls p_nonterminal array length).
- alpha_mu / beta_mu
Tree prior hyperparameters for mu forest.
- alpha_tau / beta_tau
Tree prior hyperparameters for tau forest (tighter = more regularized).
- num_cuts
Number of quantile-based split cutpoints per covariate.
- random_seed
Seed for JAX PRNG.
- num_chains
Number of parallel MCMC chains (vmapped). 1 = single-chain (legacy).
- diagnostic_interval
Iterations per chunk for between-chunk diagnostics. Must divide both num_burnin and num_mcmc evenly.
- thin_factor
Keep every thin_factor-th sample during MCMC (1 = keep all).
- retain_topology_history
If True, retain per-iteration, per-tree topology hashes and move metadata on HurdleBCFResult.topology_history for mobility diagnostics. Default False (no retention, byte-identical to pre-feature behaviour).
- Parameters:
num_burnin (int)
num_mcmc (int)
num_trees_mu (int)
num_trees_tau (int)
max_depth (int)
alpha_mu (float)
beta_mu (float)
alpha_tau (float)
beta_tau (float)
num_cuts (int)
random_seed (int)
num_chains (int)
diagnostic_interval (int)
thin_factor (int)
num_gfr_sweeps (int)
min_samples_leaf (int)
gfr_backend (str)
trace_path (str | None)
var_tau_sev (float)
kappa_sev (float)
tau0_a_prior (float)
tau0_b_prior (float)
freeze_gamma (bool)
retain_channel_samples (bool)
focal_severity (bool)
per_leaf_gamma (bool)
retain_topology_history (bool)
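The divisibility constraint on diagnostic_interval and the effect of thin_factor and num_chains on the number of retained draws can be sketched as follows. SketchConfig, check_chunking, and retained_draws are hypothetical names used only for illustration; whether and where the real GPUBCFConfig enforces the constraint is not stated here, so treat the check as an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SketchConfig:
    """Hypothetical stand-in for a few GPUBCFConfig fields (not the real class)."""
    num_burnin: int = 200
    num_mcmc: int = 200
    num_chains: int = 1
    diagnostic_interval: int = 50
    thin_factor: int = 1


def check_chunking(cfg: SketchConfig) -> None:
    """Raise ValueError unless diagnostic_interval divides both phases evenly."""
    for name in ("num_burnin", "num_mcmc"):
        if getattr(cfg, name) % cfg.diagnostic_interval != 0:
            raise ValueError(f"diagnostic_interval must divide {name}")


def retained_draws(cfg: SketchConfig) -> int:
    """Draws kept across chains: (num_mcmc / thin_factor) * num_chains.

    Assumes thin_factor divides num_mcmc, so integer division is exact.
    """
    return (cfg.num_mcmc // cfg.thin_factor) * cfg.num_chains
```

With the defaults shown in the signature (200 burn-in, 200 MCMC, interval 50) the check passes; changing num_mcmc to 220 without adjusting the interval would fail it.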
- pytyche.bcf.config.compute_num_trees_tau(n, d_tau=3.0, sigma_tau=0.5, coverage=0.9, floor=50, ceiling=400)
Formula-driven tau tree count for target CI coverage.
T_min = ceil((d_tau * sigma_tau * sqrt(n) / (2 * z))^(2/3))
Here z is the inverse-normal quantile for the target coverage level.
The tau forest’s piecewise-constant approximation has O(1) bias, which at large n dominates the posterior’s O(1/sqrt(n)) concentration. This formula computes the minimum T that keeps the bias below the CI half-width at the target coverage level.
- Parameters:
n (int)
d_tau (float)
sigma_tau (float)
coverage (float)
floor (int)
ceiling (int)
- Return type:
int
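A minimal sketch of the formula above, for readers who want to see the arithmetic. It is not the library's implementation: sketch_num_trees_tau is a hypothetical name, the two-sided quantile convention for z is an assumption, and so is the reading of floor and ceiling as clamps on the result. The standard library's NormalDist stands in for scipy.stats here.

```python
import math
from statistics import NormalDist


def sketch_num_trees_tau(n, d_tau=3.0, sigma_tau=0.5, coverage=0.9,
                         floor=50, ceiling=400):
    """Evaluate T_min = ceil((d_tau * sigma_tau * sqrt(n) / (2 z))^(2/3))."""
    # Assumption: a two-sided interval, so z is the (1 + coverage) / 2 quantile.
    z = NormalDist().inv_cdf((1.0 + coverage) / 2.0)
    t_min = math.ceil((d_tau * sigma_tau * math.sqrt(n) / (2.0 * z)) ** (2.0 / 3.0))
    # Assumption: floor and ceiling clamp the formula's output.
    return max(floor, min(ceiling, t_min))
```

Under these assumptions the count grows like n^(1/3), so the floor binds for moderate n (for example n = 100,000 with the defaults) and the ceiling binds only for very large n.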
- class pytyche.bcf.config.ContinuousBCFResult(mu_samples, tau_samples, sigma2_samples, y_bar, y_std, wall_clock_seconds, *, observed=None, is_calibrated=False, calibration=None)
Bases: object
Result from continuous BCF.
- mu_samples
(n, num_mcmc) prognostic predictions (standardized).
- tau_samples
(n, num_mcmc) treatment effects (standardized).
- sigma2_samples
(num_mcmc,) error variance (standardized).
- y_bar
Mean of the outcome used for standardization.
- y_std
Standard deviation of the outcome used for standardization.
- wall_clock_seconds
Wall-clock time for the fit in seconds.
- observed
The ObservedExperimentData the fit consumed, attached to the result so the analysis methods can reach the visitor rows and variant metadata. None when constructed by private raw-array helpers; populated by the public fit wrappers.
- is_calibrated
True only on a result returned by apply_calibration. Defaults to False.
- calibration
The Calibration artifact attached by apply_calibration; None on fresh fits. The v0.2 artifact scope is interval corrections only: it is consumed where interval summaries are built, never to transform sample arrays.
- Parameters:
mu_samples (ndarray)
tau_samples (ndarray)
sigma2_samples (ndarray)
y_bar (float)
y_std (float)
wall_clock_seconds (float)
observed (ObservedExperimentData | None)
is_calibrated (bool)
calibration (Calibration | None)
- thompson_allocation(segments, epsilon=0.02)
Per-segment traffic split: each arm’s weight is the posterior probability that it is the segment’s best arm.
Thompson sampling at segment granularity: per segment, each posterior draw votes for its best arm (the largest member-mean contrast, or control when none is positive); an arm’s weight is its win frequency over draws.
- Parameters:
segments (Sequence[DiscoveredSegment]) – Segments to allocate over (only id and rule are consumed); membership is resolved against self.observed.
epsilon (float) – Safety-net exploration floor: arms below epsilon / K are raised to the floor and the rest rescaled, so no arm’s traffic is starved to zero; inert when every arm is already above it. This is not the dial for how much traffic stays on control (that is min_control_weight / min_explore_weight on pt.sequential_experiment), and it is rarely worth overriding.
- Return type:
dict[int, dict[str, float]]
- Returns:
{segment.id: {variant_name: weight}}, with inner dicts in variant order (control first), each summing to 1.
- Raises:
ValueError – When self.observed is None.
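The voting rule and the epsilon floor described for thompson_allocation can be illustrated with a small pure-Python sketch. The function names and the input layout (one list of member-mean contrasts per posterior draw, for a single segment) are assumptions made for illustration, and the single-pass rescaling is one plausible reading of "the rest rescaled", not necessarily the library's exact procedure.

```python
def win_frequencies(contrast_draws):
    """Per-arm win frequency over posterior draws for one segment.

    contrast_draws[s][k] is the member-mean contrast of treatment k versus
    control in draw s. Returns weights ordered [control, treatment_1, ...].
    """
    num_arms = len(contrast_draws[0]) + 1
    wins = [0] * num_arms
    for draw in contrast_draws:
        best = max(range(len(draw)), key=lambda k: draw[k])
        # Control wins the draw when no contrast is positive.
        wins[best + 1 if draw[best] > 0 else 0] += 1
    return [w / len(contrast_draws) for w in wins]


def apply_floor(weights, epsilon=0.02):
    """Raise arms below epsilon / K to that floor and rescale the others."""
    floor = epsilon / len(weights)
    low = [w < floor for w in weights]
    if not any(low):
        return list(weights)  # inert when every arm is already above the floor
    free_mass = 1.0 - floor * sum(low)
    high_mass = sum(w for w, is_low in zip(weights, low) if not is_low)
    return [floor if is_low else w * free_mass / high_mass
            for w, is_low in zip(weights, low)]
```

In this sketch an arm that never wins a draw still ends up with weight epsilon / K, which is the "no arm is starved to zero" behaviour the parameter description refers to.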
- fit_policy_tree(*, max_depth=3, min_segment_share=0.1, n_bootstrap=50, bootstrap_seed=0)
Discover interpretable segments from the posterior’s per-visitor treatment effects by fitting a shallow decision tree.
Each visitor is labeled with the arm the posterior expects to be best for them (largest posterior-mean lift, or control when no lift is positive); a multiclass decision tree is fit on the visitors’ features, and each leaf becomes a DiscoveredSegment carrying an exact membership rule, gate estimate/CI, per-arm best probabilities, Thompson allocation, and bootstrap-replicability stability.
- Parameters:
max_depth (int) – Maximum tree depth.
min_segment_share (float) – Minimum fraction of visitors per leaf (sklearn min_weight_fraction_leaf).
n_bootstrap (int) – Bootstrap tree refits behind stability_score; 0 skips stability (NaN sentinel plus UserWarning).
bootstrap_seed (int) – Seed for the bootstrap resampling RNG.
- Return type:
PolicyTreeResult
- Returns:
PolicyTreeResult with one segment per leaf, ordered by sklearn leaf id; result.observed is self.observed by identity.
- Raises:
ValueError – When self.observed is None.
- apply_calibration(calibration)
Return a new posterior with calibration attached.
Attach, don’t transform: the artifact is stashed on the returned copy (is_calibrated=True); every sample array is shared with this posterior by identity. The correction currently applies to intervals only: probabilities and expected losses stay raw, and corrected CIs appear where interval summaries are built. K = 2 experiments only (per-contrast recalibration for K >= 3 is not yet implemented).
- Parameters:
calibration (Calibration) – SBC-fitted Calibration whose regime (metric, n_treatments) must match self.observed.
- Return type:
ContinuousBCFResult
- Returns:
New ContinuousBCFResult carrying the artifact; the original is untouched.
- Raises:
ValueError – When self.observed is None, or on a regime mismatch (message names the mismatched dimensions).
NotImplementedError – At K >= 3.
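The "attach, don’t transform" contract can be illustrated with the standard library's dataclasses.replace, which builds a new instance while reusing the existing field objects. SketchResult and attach_calibration are hypothetical stand-ins, and the real method may be implemented differently; the sketch also omits the regime check and the K >= 3 guard.

```python
from dataclasses import dataclass, replace
from typing import Any, Optional


@dataclass(frozen=True)
class SketchResult:
    """Hypothetical stand-in for a BCF result (not the real class)."""
    tau_samples: list
    is_calibrated: bool = False
    calibration: Optional[Any] = None


def attach_calibration(result: SketchResult, calibration: Any) -> SketchResult:
    """Return a copy carrying the artifact; sample arrays are not copied."""
    # replace() passes the existing field objects to the new instance, so
    # tau_samples on the copy is the same object as on the original.
    return replace(result, is_calibrated=True, calibration=calibration)
```

Because the copy shares its sample arrays with the original, attaching a calibration costs no extra memory for the draws, and the original result keeps is_calibrated=False.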
- recommendation_summary(treatment, segment=None, *, thresholds=None, min_practical_effect=0.02)
Act-now SHIP / CONTINUE / STOP recommendation for one treatment.
The treatment’s metric-native contrast draws are scoped (segment=None is the global all-visitors snapshot; a segment restricts to its rule’s members), reduced to per-draw mean lift, and summarized under the legacy compare.variants decision rule. v0.2 raw scope: probabilities and expected losses come from the raw draws even on a calibrated posterior; interval corrections land where intervals are built.
- Parameters:
treatment (str) – Treatment variant name (vs control).
segment (DiscoveredSegment | None) – None for the global snapshot; a DiscoveredSegment restricts the computation to its members.
thresholds (DecisionThresholds | None) – Decision thresholds; DecisionThresholds() defaults when None.
min_practical_effect (float) – Minimum meaningful lift for probability_better / probability_harmful.
- Return type:
RecommendationSummary
- Returns:
RecommendationSummary with the decision, its evidence, and expected_value_of_one_more_round always populated (closed-form preposterior EVSI; formula in docs/concepts/decision-theoretic-inputs.md).
- Raises:
ValueError – When self.observed is None, when treatment is not a treatment name, or when the segment’s rule matches zero visitors.
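The scoping and reduction steps in recommendation_summary (restrict to members, then average per draw) can be sketched in a few lines. The function names and the list-of-lists layout are illustrative, and the tail-probability definitions are assumptions about what probability_better and probability_harmful measure; the actual decision rule lives in compare.variants and is not reproduced here.

```python
def per_draw_mean_lift(contrast_draws, member_mask):
    """Average the contrast over segment members, separately for each draw.

    contrast_draws[i][s] is the contrast for visitor i in posterior draw s;
    member_mask[i] is True when visitor i satisfies the segment's rule.
    """
    members = [row for row, keep in zip(contrast_draws, member_mask) if keep]
    if not members:
        raise ValueError("segment rule matches zero visitors")
    num_draws = len(members[0])
    return [sum(row[s] for row in members) / len(members)
            for s in range(num_draws)]


def tail_probabilities(lifts, min_practical_effect=0.02):
    """Assumed definitions: share of draws beyond +/- min_practical_effect."""
    better = sum(lift > min_practical_effect for lift in lifts) / len(lifts)
    harmful = sum(lift < -min_practical_effect for lift in lifts) / len(lifts)
    return better, harmful
```

Passing a mask of all True values corresponds to the segment=None global snapshot in this sketch.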
- analyze(*, max_depth=3, min_segment_share=0.1, n_bootstrap=50, bootstrap_seed=0)
The canonical one-call analysis summary for this posterior.
Composes per-treatment Comparison summaries, the embedded policy-tree segmentation (keyword arguments forward to it), the global RecommendationSummary for the best challenger, and the posterior-mean per-visitor CATEs. Anything needing posterior samples goes through analysis.posterior.
- Parameters:
max_depth (int) – Embedded policy tree depth.
min_segment_share (float) – Minimum per-leaf population share.
n_bootstrap (int) – Stability bootstrap count (0 skips stability with a UserWarning).
bootstrap_seed (int) – Stability bootstrap seed.
- Return type:
AnalysisResult
- Returns:
AnalysisResult; analysis.is_calibrated reads through to this posterior’s flag.
- Raises:
ValueError – When self.observed is None.
- evaluate_against_truth(tree, truth)
Sim-mode evaluation of tree’s policy against ground truth.
- Parameters:
tree (PolicyTreeResult) – The fitted policy whose assignments are evaluated.
truth (CalibrationTruth | None) – Ground truth from the simulation path; None in real-data mode, which raises because there is nothing to evaluate against.
- Return type:
TruthComparison
- Returns:
TruthComparison (cate_rmse, policy_accuracy, and the realized-RPV trio with the oracle gap).
- Raises:
RuntimeError – When truth is None (real-data mode).
ValueError – When self.observed is None or the truth lacks the K-appropriate contrast / potential-outcome fields.
- has_credible_segments(threshold=0.8)
Whether any discovered segment reaches the stability threshold.
Runs fit_policy_tree at its defaults (deterministic given the default bootstrap_seed) and checks for a segment with stability_score >= threshold. The 0.80 default matches the default graduation rule’s SHIP-gate stability threshold.
- Parameters:
threshold (float) – Minimum bootstrap-replicability stability score.
- Return type:
bool
- Returns:
True iff at least one discovered segment clears it.
- class pytyche.bcf.config.BinaryBCFResult(mu_samples, tau_samples, wall_clock_seconds, *, observed=None, is_calibrated=False, calibration=None)
Bases: object
Result from binary (probit) BCF.
- mu_samples
(n, num_mcmc) prognostic predictions (probit scale).
- tau_samples
(n, num_mcmc) treatment effects (probit scale).
- wall_clock_seconds
Wall-clock time for the fit in seconds.
- observed
The ObservedExperimentData the fit consumed, attached to the result so the analysis methods can reach the visitor rows and variant metadata. None when constructed by private raw-array helpers; populated by the public fit wrappers.
- is_calibrated
True only on a result returned by apply_calibration. Defaults to False.
- calibration
The Calibration artifact attached by apply_calibration; None on fresh fits. The v0.2 artifact scope is interval corrections only: it is consumed where interval summaries are built, never to transform sample arrays.
- Parameters:
mu_samples (ndarray)
tau_samples (ndarray)
wall_clock_seconds (float)
observed (ObservedExperimentData | None)
is_calibrated (bool)
calibration (Calibration | None)
- thompson_allocation(segments, epsilon=0.02)
Per-segment traffic split: each arm’s weight is the posterior probability that it is the segment’s best arm.
Thompson sampling at segment granularity: per segment, each posterior draw votes for its best arm (the largest member-mean contrast, or control when none is positive); an arm’s weight is its win frequency over draws.
- Parameters:
segments (Sequence[DiscoveredSegment]) – Segments to allocate over (only id and rule are consumed); membership is resolved against self.observed.
epsilon (float) – Safety-net exploration floor: arms below epsilon / K are raised to the floor and the rest rescaled, so no arm’s traffic is starved to zero; inert when every arm is already above it. This is not the dial for how much traffic stays on control (that is min_control_weight / min_explore_weight on pt.sequential_experiment), and it is rarely worth overriding.
- Return type:
dict[int, dict[str, float]]
- Returns:
{segment.id: {variant_name: weight}}, with inner dicts in variant order (control first), each summing to 1.
- Raises:
ValueError – When self.observed is None.
- fit_policy_tree(*, max_depth=3, min_segment_share=0.1, n_bootstrap=50, bootstrap_seed=0)
Discover interpretable segments from the posterior’s per-visitor treatment effects by fitting a shallow decision tree.
Each visitor is labeled with the arm the posterior expects to be best for them (largest posterior-mean lift, or control when no lift is positive); a multiclass decision tree is fit on the visitors’ features, and each leaf becomes a DiscoveredSegment carrying an exact membership rule, gate estimate/CI, per-arm best probabilities, Thompson allocation, and bootstrap-replicability stability.
- Parameters:
max_depth (int) – Maximum tree depth.
min_segment_share (float) – Minimum fraction of visitors per leaf (sklearn min_weight_fraction_leaf).
n_bootstrap (int) – Bootstrap tree refits behind stability_score; 0 skips stability (NaN sentinel plus UserWarning).
bootstrap_seed (int) – Seed for the bootstrap resampling RNG.
- Return type:
PolicyTreeResult
- Returns:
PolicyTreeResult with one segment per leaf, ordered by sklearn leaf id; result.observed is self.observed by identity.
- Raises:
ValueError – When self.observed is None.
- apply_calibration(calibration)
Return a new posterior with calibration attached.
Attach, don’t transform: the artifact is stashed on the returned copy (is_calibrated=True); every sample array is shared with this posterior by identity. The correction currently applies to intervals only: probabilities and expected losses stay raw, and corrected CIs appear where interval summaries are built. K = 2 experiments only (per-contrast recalibration for K >= 3 is not yet implemented).
- Parameters:
calibration (Calibration) – SBC-fitted Calibration whose regime (metric, n_treatments) must match self.observed.
- Return type:
BinaryBCFResult
- Returns:
New BinaryBCFResult carrying the artifact; the original is untouched.
- Raises:
ValueError – When self.observed is None, or on a regime mismatch (message names the mismatched dimensions).
NotImplementedError – At K >= 3.
- recommendation_summary(treatment, segment=None, *, thresholds=None, min_practical_effect=0.02)
Act-now SHIP / CONTINUE / STOP recommendation for one treatment.
The treatment’s metric-native contrast draws are scoped (segment=None is the global all-visitors snapshot; a segment restricts to its rule’s members), reduced to per-draw mean lift, and summarized under the legacy compare.variants decision rule. v0.2 raw scope: probabilities and expected losses come from the raw draws even on a calibrated posterior; interval corrections land where intervals are built.
- Parameters:
treatment (str) – Treatment variant name (vs control).
segment (DiscoveredSegment | None) – None for the global snapshot; a DiscoveredSegment restricts the computation to its members.
thresholds (DecisionThresholds | None) – Decision thresholds; DecisionThresholds() defaults when None.
min_practical_effect (float) – Minimum meaningful lift for probability_better / probability_harmful.
- Return type:
RecommendationSummary
- Returns:
RecommendationSummary with the decision, its evidence, and expected_value_of_one_more_round always populated (closed-form preposterior EVSI; formula in docs/concepts/decision-theoretic-inputs.md).
- Raises:
ValueError – When self.observed is None, when treatment is not a treatment name, or when the segment’s rule matches zero visitors.
- analyze(*, max_depth=3, min_segment_share=0.1, n_bootstrap=50, bootstrap_seed=0)
The canonical one-call analysis summary for this posterior.
Composes per-treatment Comparison summaries, the embedded policy-tree segmentation (keyword arguments forward to it), the global RecommendationSummary for the best challenger, and the posterior-mean per-visitor CATEs. Anything needing posterior samples goes through analysis.posterior.
- Parameters:
max_depth (int) – Embedded policy tree depth.
min_segment_share (float) – Minimum per-leaf population share.
n_bootstrap (int) – Stability bootstrap count (0 skips stability with a UserWarning).
bootstrap_seed (int) – Stability bootstrap seed.
- Return type:
AnalysisResult
- Returns:
AnalysisResult; analysis.is_calibrated reads through to this posterior’s flag.
- Raises:
ValueError – When self.observed is None.
- evaluate_against_truth(tree, truth)
Sim-mode evaluation of tree’s policy against ground truth.
- Parameters:
tree (PolicyTreeResult) – The fitted policy whose assignments are evaluated.
truth (CalibrationTruth | None) – Ground truth from the simulation path; None in real-data mode, which raises because there is nothing to evaluate against.
- Return type:
TruthComparison
- Returns:
TruthComparison (cate_rmse, policy_accuracy, and the realized-RPV trio with the oracle gap).
- Raises:
RuntimeError – When truth is None (real-data mode).
ValueError – When self.observed is None or the truth lacks the K-appropriate contrast / potential-outcome fields.
- has_credible_segments(threshold=0.8)
Whether any discovered segment reaches the stability threshold.
Runs fit_policy_tree at its defaults (deterministic given the default bootstrap_seed) and checks for a segment with stability_score >= threshold. The 0.80 default matches the default graduation rule’s SHIP-gate stability threshold.
- Parameters:
threshold (float) – Minimum bootstrap-replicability stability score.
- Return type:
bool
- Returns:
True iff at least one discovered segment clears it.
- class pytyche.bcf.config.HurdleBCFResult(rpv_cate_samples, p0_mean, p1_mean, sev0_mean, sev1_mean, tau0_samples, tau_hat_quantiles, wall_clock_seconds, num_chains=1, num_gfr_sweeps=0, diagnostics=None, phase_timing=None, p0_samples=None, p1_samples=None, sev0_samples=None, sev1_samples=None, p_samples=None, sev_samples=None, topology_history=None, *, observed=None, is_calibrated=False, calibration=None, pooling)
Bases: object
Result from joint shared-tree hurdle BCF.
Each tree simultaneously estimates conversion (probit) and severity (log-revenue) parameters via shared tree structure. This couples the two channels so splits are jointly informative.
RPV CATEs are composed on-GPU (float32) and transferred to CPU for policy tree fitting. Channel-level per-draw arrays (p0, p1, sev0, sev1) are retained by default (retain_channel_samples=True): the conversion/severity decomposition is the headline output of the hurdle approach and needs the per-draw channel arrays for its credible intervals. Set retain_channel_samples=False to skip the GPU→CPU transfer when memory matters more than the decomposition (e.g. large-n sweep contexts that only consume the composed RPV contrasts).
When num_chains > 1, samples are concatenated across chains: S_total = (num_mcmc / thin_factor) * num_chains.
Arm-count dispatch (K = int(Z.max()) + 1).
At K = 2 (binary arm) the legacy paired fields are populated: p0_mean / p1_mean / sev0_mean / sev1_mean are (n,) and, when retain_channel_samples=True, p0_samples / p1_samples / sev0_samples / sev1_samples are (n, S_total). rpv_cate_samples is (n, S_total), and the per-arm fields p_samples / sev_samples are None.
At K >= 3 (multi-arm) the per-arm fields are populated instead: p_samples / sev_samples are (n, S_total, K) (when retained) and rpv_cate_samples is (n, S_total, K - 1) (the jointly sampled contrast posterior). The legacy paired fields are None.
The two field families are never populated together. tau0_samples (S_total,) and the sigma2_samples = 1 / tau0_samples property are scalar at every K (each visitor sees one outcome, so the severity residual is scalar per visitor; there is no per-arm severity precision).
The topology_history field is populated only when the producing fit set GPUBCFConfig.retain_topology_history=True. When the flag is off (default), the field is None, and the fit’s wall-clock cost and PRNG state are bitwise-identical to the behaviour before this feature was added.
- rpv_cate_samples
(n, S_total) float32 at K = 2, (n, S_total, K - 1) at K >= 3; composed on GPU, transferred to CPU.
- p0_mean
(n,) float32: E[Φ(μ_b + b₀·τ_b)]; None at K >= 3.
- p1_mean
(n,) float32: E[Φ(μ_b + b₁·τ_b)]; None at K >= 3.
- sev0_mean
(n,) float32: E[exp(μ_c + b₀·τ_c + σ²/2)]; None at K >= 3.
- sev1_mean
(n,) float32: E[exp(μ_c + b₁·τ_c + σ²/2)]; None at K >= 3.
- tau0_samples
(S_total,) float32: global precision.
- tau_hat_quantiles
(S_total, 5) [q05, q25, q50, q75, q95], or None.
- wall_clock_seconds
Wall-clock time for the fit in seconds.
- num_chains
Number of parallel MCMC chains used.
- num_gfr_sweeps
Number of GFR warm-start sweeps performed.
- diagnostics
Dict of diagnostic values (rhat_tau0, per_chain_ess, etc.), or None.
- phase_timing
Dict of per-phase wall-clock breakdown, or None.
- p0_samples
jax.Array (n, S_total): P(convert | control) per draw; None if not retained or at K >= 3.
- p1_samples
jax.Array (n, S_total): P(convert | treated) per draw; None if not retained or at K >= 3.
- sev0_samples
jax.Array (n, S_total): E[sev | control, convert] per draw; None if not retained or at K >= 3.
- sev1_samples
jax.Array (n, S_total): E[sev | treated, convert] per draw; None if not retained or at K >= 3.
- p_samples
jax.Array (n, S_total, K): per-arm P(convert) per draw; None at K = 2 or if not retained.
- sev_samples
jax.Array (n, S_total, K): per-arm E[sev | convert] per draw; None at K = 2 or if not retained.
- topology_history
Topology retention trace; populated only when the producing fit set GPUBCFConfig.retain_topology_history=True. None otherwise.
- observed
The ObservedExperimentData the fit consumed, attached to the result so the analysis methods can reach the visitor rows and variant metadata. None when constructed by private raw-array helpers; populated by the public fit wrappers.
- is_calibrated
True only on a result returned by apply_calibration. Defaults to False.
- calibration
The Calibration artifact attached by apply_calibration; None on fresh fits. The v0.2 artifact scope is interval corrections only: it is consumed where interval summaries are built, never to transform sample arrays.
- pooling
Provenance of the fit: "joint" = shared-tree canonical fit; "independent" = two-stage baseline (binary + continuous fitted separately). Required: the caller must always populate it.
- Parameters:
rpv_cate_samples (ndarray)
p0_mean (ndarray | None)
p1_mean (ndarray | None)
sev0_mean (ndarray | None)
sev1_mean (ndarray | None)
tau0_samples (ndarray)
tau_hat_quantiles (ndarray | None)
wall_clock_seconds (float)
num_chains (int)
num_gfr_sweeps (int)
diagnostics (dict | None)
phase_timing (dict | None)
p0_samples (Any | None)
p1_samples (Any | None)
sev0_samples (Any | None)
sev1_samples (Any | None)
p_samples (Any | None)
sev_samples (Any | None)
topology_history (TopologyHistory | None)
observed (ObservedExperimentData | None)
is_calibrated (bool)
calibration (Calibration | None)
pooling (Literal['joint', 'independent'])
- property sigma2_samples: ndarray
Return 1 / tau0_samples as a sigma² view.
Backward-compat shim for downstream code that consumes the variance parameterisation rather than the precision one.
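The shape bookkeeping documented for HurdleBCFResult (S_total, the K-dependent shape of rpv_cate_samples, and the precision-to-variance view) can be summarised in a short sketch. The helper names are hypothetical and plain Python lists stand in for arrays; the integer division assumes thin_factor divides num_mcmc.

```python
def total_draws(num_mcmc, thin_factor, num_chains):
    """S_total = (num_mcmc / thin_factor) * num_chains, draws pooled over chains."""
    return (num_mcmc // thin_factor) * num_chains


def num_arms_from_assignments(z):
    """K = int(Z.max()) + 1 for arm labels 0 .. K - 1."""
    return int(max(z)) + 1


def rpv_cate_shape(n, s_total, num_arms):
    """(n, S_total) at K = 2; (n, S_total, K - 1) at K >= 3."""
    if num_arms < 2:
        raise ValueError("need a control arm and at least one treatment arm")
    return (n, s_total) if num_arms == 2 else (n, s_total, num_arms - 1)


def sigma2_view(tau0_samples):
    """Variance view of the precision draws: sigma² = 1 / tau0, per draw."""
    return [1.0 / tau0 for tau0 in tau0_samples]
```

For instance, 200 retained iterations thinned by 2 across 4 chains give 400 pooled draws, and a three-arm experiment adds a trailing contrast axis of length 2 to rpv_cate_samples.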
- thompson_allocation(segments, epsilon=0.02)
Per-segment traffic split: each arm’s weight is the posterior probability that it is the segment’s best arm.
Thompson sampling at segment granularity: per segment, each posterior draw votes for its best arm (the largest member-mean contrast, or control when none is positive); an arm’s weight is its win frequency over draws.
- Parameters:
segments (Sequence[DiscoveredSegment]) – Segments to allocate over (only id and rule are consumed); membership is resolved against self.observed.
epsilon (float) – Safety-net exploration floor: arms below epsilon / K are raised to the floor and the rest rescaled, so no arm’s traffic is starved to zero; inert when every arm is already above it. This is not the dial for how much traffic stays on control (that is min_control_weight / min_explore_weight on pt.sequential_experiment), and it is rarely worth overriding.
- Return type:
dict[int, dict[str, float]]
- Returns:
{segment.id: {variant_name: weight}}, with inner dicts in variant order (control first), each summing to 1.
- Raises:
ValueError – When self.observed is None.
- fit_policy_tree(*, max_depth=3, min_segment_share=0.1, n_bootstrap=50, bootstrap_seed=0)
Discover interpretable segments from the posterior’s per-visitor treatment effects by fitting a shallow decision tree.
Each visitor is labeled with the arm the posterior expects to be best for them (largest posterior-mean lift, or control when no lift is positive); a multiclass decision tree is fit on the visitors’ features, and each leaf becomes a DiscoveredSegment carrying an exact membership rule, gate estimate/CI, per-arm best probabilities, Thompson allocation, and bootstrap-replicability stability.
- Parameters:
max_depth (int) – Maximum tree depth.
min_segment_share (float) – Minimum fraction of visitors per leaf (sklearn min_weight_fraction_leaf).
n_bootstrap (int) – Bootstrap tree refits behind stability_score; 0 skips stability (NaN sentinel plus UserWarning).
bootstrap_seed (int) – Seed for the bootstrap resampling RNG.
- Return type:
PolicyTreeResult
- Returns:
PolicyTreeResult with one segment per leaf, ordered by sklearn leaf id; result.observed is self.observed by identity.
- Raises:
ValueError – When self.observed is None.
- apply_calibration(calibration)
Return a new posterior with calibration attached.
Attach, don’t transform: the artifact is stashed on the returned copy (is_calibrated=True); every sample array is shared with this posterior by identity. The correction currently applies to intervals only: probabilities and expected losses stay raw, and corrected CIs appear where interval summaries are built. K = 2 experiments only (per-contrast recalibration for K >= 3 is not yet implemented).
- Parameters:
calibration (Calibration) – SBC-fitted Calibration whose regime (metric, n_treatments) must match self.observed.
- Return type:
HurdleBCFResult
- Returns:
New HurdleBCFResult carrying the artifact; the original is untouched.
- Raises:
ValueError – When self.observed is None, or on a regime mismatch (message names the mismatched dimensions).
NotImplementedError – At K >= 3.
- recommendation_summary(treatment, segment=None, *, thresholds=None, min_practical_effect=0.02)
Act-now SHIP / CONTINUE / STOP recommendation for one treatment.
The treatment’s metric-native contrast draws are scoped (segment=None is the global all-visitors snapshot; a segment restricts to its rule’s members), reduced to per-draw mean lift, and summarized under the legacy compare.variants decision rule. v0.2 raw scope: probabilities and expected losses come from the raw draws even on a calibrated posterior; interval corrections land where intervals are built.
- Parameters:
treatment (str) – Treatment variant name (vs control).
segment (DiscoveredSegment | None) – None for the global snapshot; a DiscoveredSegment restricts the computation to its members.
thresholds (DecisionThresholds | None) – Decision thresholds; DecisionThresholds() defaults when None.
min_practical_effect (float) – Minimum meaningful lift for probability_better / probability_harmful.
- Return type:
RecommendationSummary
- Returns:
RecommendationSummary with the decision, its evidence, and expected_value_of_one_more_round always populated (closed-form preposterior EVSI; formula in docs/concepts/decision-theoretic-inputs.md).
- Raises:
ValueError – When self.observed is None, when treatment is not a treatment name, or when the segment’s rule matches zero visitors.
- analyze(*, max_depth=3, min_segment_share=0.1, n_bootstrap=50, bootstrap_seed=0)
The canonical one-call analysis summary for this posterior.
Composes per-treatment Comparison summaries, the embedded policy-tree segmentation (keyword arguments forward to it), the global RecommendationSummary for the best challenger, and the posterior-mean per-visitor CATEs. Anything needing posterior samples goes through analysis.posterior.
- Parameters:
max_depth (int) – Embedded policy tree depth.
min_segment_share (float) – Minimum per-leaf population share.
n_bootstrap (int) – Stability bootstrap count (0 skips stability with a UserWarning).
bootstrap_seed (int) – Stability bootstrap seed.
- Return type:
AnalysisResult
- Returns:
AnalysisResult; analysis.is_calibrated reads through to this posterior’s flag.
- Raises:
ValueError – When self.observed is None.
- evaluate_against_truth(tree, truth)
Sim-mode evaluation of tree’s policy against ground truth.
- Parameters:
tree (PolicyTreeResult) – The fitted policy whose assignments are evaluated.
truth (CalibrationTruth | None) – Ground truth from the simulation path; None in real-data mode, which raises because there is nothing to evaluate against.
- Return type:
TruthComparison
- Returns:
TruthComparison (cate_rmse, policy_accuracy, and the realized-RPV trio with the oracle gap).
- Raises:
RuntimeError – When truth is None (real-data mode).
ValueError – When self.observed is None or the truth lacks the K-appropriate contrast / potential-outcome fields.
- has_credible_segments(threshold=0.8)
Whether any discovered segment reaches the stability threshold.
Runs fit_policy_tree at its defaults (deterministic given the default bootstrap_seed) and checks for a segment with stability_score >= threshold. The 0.80 default matches the default graduation rule’s SHIP-gate stability threshold.
- Parameters:
threshold (float) – Minimum bootstrap-replicability stability score.
- Return type:
bool
- Returns:
True iff at least one discovered segment clears it.