pytyche.validation
Runtime validators for v2 contracts.
These validators make the contract types executable — they enforce invariants at construction time rather than discovering violations downstream.
Used by both generators and loaders. Fail-closed: no silent acceptance of malformed data.
Functions:
- validate_observed_data(data): checks visitor schema, dtypes, invariants.
- validate_alignment(array, data): confirms array length matches visitors.
- validate_rule(rule, data): confirms rule features exist and are type-compatible.
Functions
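validate_alignment(array, data) | Confirm that a per-visitor array is aligned with concatenated visitors.
validate_observed_data(data, *, strict=True) | Validate that all variant DataFrames conform to VISITOR_SCHEMA.
validate_rule(rule, data) | Confirm that a segment rule's features exist in visitor data and have compatible types.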
Exceptions
AlignmentViolation | Raised when a per-visitor array is misaligned with visitor rows.
RuleViolation | Raised when a segment rule references invalid features or types.
SchemaViolation | Raised when observed data violates the visitor schema contract.
- exception pytyche.validation.SchemaViolation
Bases: Exception
Raised when observed data violates the visitor schema contract.
- exception pytyche.validation.AlignmentViolation
Bases: Exception
Raised when a per-visitor array is misaligned with visitor rows.
- exception pytyche.validation.RuleViolation
Bases: Exception
Raised when a segment rule references invalid features or types.
- pytyche.validation.validate_observed_data(data, *, strict=True)
Validate that all variant DataFrames conform to VISITOR_SCHEMA.
Per-variant checks:
  - All required columns are present.
  - Column dtypes are compatible with the schema.
  - revenue >= 0 for all rows.
  - No duplicate visitor_id within a variant.
  - n_visitors, n_conversions, total_revenue match the DataFrame contents.
  - Every row’s variant column matches VariantData.name.
  - Every row’s experiment_id matches data.experiment_id.
Cross-variant checks:
  - No visitor_id appears in more than one variant (a visitor can only be assigned to one arm).
Strict-mode checks (strict=True, the default):
  - All variants have the same set of extra feature columns (beyond VISITOR_SCHEMA).
  - Extra feature columns have consistent dtypes across variants.
Set strict=False when feature-column asymmetry across variants is intentional. Example: a treatment arm collects an extra survey response column that doesn’t exist for control:

    # Treatment adds a post-checkout "why did you buy?" column.
    # Control visitors never see the survey, so the column is absent.
    validate_observed_data(data, strict=False)

- Parameters:
data (ObservedExperimentData) – The observed experiment data to validate.
strict (bool) – If True (default), require feature-column consistency across variants. If False, skip cross-variant column checks.
- Raises:
SchemaViolation – On any violation, with a message identifying the variant and the specific problem.
- Return type:
None
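As a rough illustration of the checks listed for validate_observed_data, the sketch below reimplements three of them (non-negative revenue, no duplicate visitor_id within a variant, no visitor_id shared between variants) over plain Python rows. It is not pytyche's implementation: the local SchemaViolation class, the check_variants helper, and the list-of-dicts row format are stand-ins assumed for this example, whereas the real validator operates on ObservedExperimentData and its DataFrames.

```python
from typing import Dict, List


class SchemaViolation(Exception):
    """Local stand-in for pytyche.validation.SchemaViolation."""


def check_variants(variants: Dict[str, List[dict]]) -> None:
    """Fail closed on the first violation; return None if all checks pass.

    `variants` maps a variant name to its visitor rows (hypothetical format).
    """
    seen_in: Dict[str, str] = {}  # visitor_id -> variant that first claimed it
    for name, rows in variants.items():
        ids_here = set()
        for row in rows:
            vid = row["visitor_id"]
            if row["revenue"] < 0:
                raise SchemaViolation(f"{name}: negative revenue for {vid}")
            if vid in ids_here:
                raise SchemaViolation(f"{name}: duplicate visitor_id {vid}")
            ids_here.add(vid)
            # Cross-variant check: a visitor belongs to exactly one arm.
            if vid in seen_in:
                raise SchemaViolation(
                    f"{name}: visitor_id {vid} already in {seen_in[vid]}"
                )
            seen_in[vid] = name


# Two disjoint arms with non-negative revenue pass silently.
check_variants({
    "control": [{"visitor_id": "a", "revenue": 0.0}],
    "treatment": [{"visitor_id": "b", "revenue": 12.5}],
})
```

Raising on the first problem, with the variant name in the message, mirrors the fail-closed behaviour described above: the caller either gets None or an exception, never a partially accepted dataset.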
- pytyche.validation.validate_alignment(array, data)
Confirm that a per-visitor array is aligned with concatenated visitors.
The expected length is the sum of n_visitors across all variants, which equals the row count of:

    pd.concat([v.visitors for v in data.variants], ignore_index=True)

- Parameters:
array (AlignedVisitorArray) – The aligned array to check.
data (ObservedExperimentData) – The experiment data providing the visitor count.
- Raises:
AlignmentViolation – If the array length does not equal the total visitor count across all variants.
- Return type:
None
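The length rule for validate_alignment can be sketched in a few lines. This is only an illustration of the arithmetic (array length must equal the sum of n_visitors), not the library's code: the Variant dataclass, the check_alignment helper, and the local AlignmentViolation class are hypothetical stand-ins for the real contract types.

```python
from dataclasses import dataclass
from typing import List, Sequence


class AlignmentViolation(Exception):
    """Local stand-in for pytyche.validation.AlignmentViolation."""


@dataclass
class Variant:
    """Hypothetical stand-in carrying only the fields the check needs."""
    name: str
    n_visitors: int


def check_alignment(array: Sequence[float], variants: List[Variant]) -> None:
    # Expected length: total visitors across all variants.
    expected = sum(v.n_visitors for v in variants)
    if len(array) != expected:
        raise AlignmentViolation(
            f"array has {len(array)} entries, expected {expected}"
        )


variants = [Variant("control", 3), Variant("treatment", 2)]
# 3 + 2 = 5 entries, so this passes without raising.
check_alignment([0.1, 0.4, 0.2, 0.9, 0.7], variants)
```

Note that a length check alone cannot detect a correctly sized array whose entries are in the wrong order; the sketch, like the documented check, only compares counts.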
- pytyche.validation.validate_rule(rule, data)
Confirm that a segment rule’s features exist in visitor data and have compatible types.
Checks:
  - Each clause’s feature column exists in at least one variant’s DataFrame.
  - Numeric rules (ComparisonRule, BetweenRule) reference numeric columns.
  - Categorical rules (EqRule, InRule) reference non-numeric columns.
Does NOT enforce allowed categorical values — no domain registry in Phase 1 scope.
- Parameters:
rule (SegmentRule) – The segment rule to validate.
data (ObservedExperimentData) – The experiment data providing the column schema.
- Raises:
RuleViolation – If a feature is missing or type-incompatible.
- Return type:
None
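The existence and type-compatibility checks for validate_rule might look roughly like the sketch below. Everything here is an assumption made for illustration: rule kinds are passed as plain strings rather than the real rule classes, each variant's schema is reduced to a mapping from column name to an "is numeric" flag, and check_clause and the local RuleViolation class are hypothetical. In particular, how the real validator treats a feature whose type differs between variants is not stated in this reference; the sketch simply rejects any incompatible occurrence.

```python
from typing import Dict

NUMERIC_RULES = {"ComparisonRule", "BetweenRule"}
CATEGORICAL_RULES = {"EqRule", "InRule"}


class RuleViolation(Exception):
    """Local stand-in for pytyche.validation.RuleViolation."""


def check_clause(rule_kind: str, feature: str,
                 columns_by_variant: Dict[str, Dict[str, bool]]) -> None:
    """`columns_by_variant[variant][column]` is True when the column is numeric."""
    # The feature only needs to exist in at least one variant.
    found = [cols[feature] for cols in columns_by_variant.values()
             if feature in cols]
    if not found:
        raise RuleViolation(f"feature {feature!r} not found in any variant")
    for is_numeric in found:
        if rule_kind in NUMERIC_RULES and not is_numeric:
            raise RuleViolation(f"{rule_kind} needs a numeric column: {feature!r}")
        if rule_kind in CATEGORICAL_RULES and is_numeric:
            raise RuleViolation(f"{rule_kind} needs a non-numeric column: {feature!r}")


columns = {
    "control": {"age": True, "country": False},
    "treatment": {"age": True, "country": False, "survey": False},
}
check_clause("BetweenRule", "age", columns)  # numeric rule, numeric column
check_clause("EqRule", "survey", columns)    # present only in treatment
```

Consistent with the note above about categorical values, the sketch never inspects which values a categorical column may take; it only checks presence and the numeric/non-numeric split.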