verskyt.benchmarks

Benchmark suites for validating paper results and comparing performance.

Module: xor_suite

XOR benchmark suite for Tversky Neural Networks.

Reproduces XOR experiments from “Tversky Neural Networks: Psychologically Plausible Deep Learning with Differentiable Tversky Similarity” (Doumbouya et al., 2025).

Provides both fast development benchmarks and full paper replication capabilities.

class XORConfig(intersection_methods: List[str | IntersectionReduction] = <factory>, difference_methods: List[str | DifferenceReduction] = <factory>, normalization: List[bool] = <factory>, feature_counts: List[int] = <factory>, prototype_init: List[str] = <factory>, feature_init: List[str] = <factory>, random_seeds: List[int] = <factory>, epochs: int = 1000, learning_rate: float = 0.1, convergence_threshold: float = 1.0)

Bases: object

Configuration for XOR benchmark experiments. The IntersectionReduction and DifferenceReduction types are defined in verskyt.core.similarity.

intersection_methods: List[str | IntersectionReduction]

difference_methods: List[str | DifferenceReduction]

normalization: List[bool]

feature_counts: List[int]

prototype_init: List[str]

feature_init: List[str]

random_seeds: List[int]

epochs: int = 1000

learning_rate: float = 0.1

convergence_threshold: float = 1.0

property total_runs: int: Calculate the total number of experimental runs.

__init__(intersection_methods: List[str | IntersectionReduction] = <factory>, difference_methods: List[str | DifferenceReduction] = <factory>, normalization: List[bool] = <factory>, feature_counts: List[int] = <factory>, prototype_init: List[str] = <factory>, feature_init: List[str] = <factory>, random_seeds: List[int] = <factory>, epochs: int = 1000, learning_rate: float = 0.1, convergence_threshold: float = 1.0) → None
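The <factory> defaults mean each list field is built fresh per instance, and total_runs reports the size of the resulting experiment grid. The sketch below uses a stand-in dataclass, GridConfig, to show one plausible way such a count could be computed, assuming the grid is the full Cartesian product of the list-valued fields. The class name, the placeholder values, and the product rule are assumptions made for illustration and are not taken from the verskyt source.

```python
from dataclasses import dataclass, field
from itertools import product
from math import prod
from typing import Iterator, List, Tuple


@dataclass
class GridConfig:
    """Hypothetical stand-in for XORConfig; every default value is a placeholder."""

    intersection_methods: List[str] = field(
        default_factory=lambda: ["method_a", "method_b"]
    )
    difference_methods: List[str] = field(default_factory=lambda: ["method_c"])
    normalization: List[bool] = field(default_factory=lambda: [False, True])
    feature_counts: List[int] = field(default_factory=lambda: [2, 4])
    prototype_init: List[str] = field(default_factory=lambda: ["init_a"])
    feature_init: List[str] = field(default_factory=lambda: ["init_a"])
    random_seeds: List[int] = field(default_factory=lambda: [0, 1, 2])

    def axes(self) -> List[list]:
        # One list per swept dimension of the grid.
        return [
            self.intersection_methods,
            self.difference_methods,
            self.normalization,
            self.feature_counts,
            self.prototype_init,
            self.feature_init,
            self.random_seeds,
        ]

    @property
    def total_runs(self) -> int:
        # Assumed rule: one run per element of the Cartesian product.
        return prod(len(axis) for axis in self.axes())

    def combinations(self) -> Iterator[Tuple]:
        # Yields one tuple of settings per run, in the same axis order.
        return product(*self.axes())


config = GridConfig()
print(config.total_runs)                       # size of the placeholder grid
print(GridConfig(random_seeds=[0]).total_runs) # fewer seeds, smaller grid
```

Under this assumption, shrinking any one list (for example the seeds) scales the run count proportionally, which is the usual way to trade coverage for speed during development.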

class XORResult(intersection_method: str, difference_method: str, normalize: bool, feature_count: int, prototype_init: str, feature_init: str, seed: int, final_loss: float, final_accuracy: float, converged: bool, training_time: float, loss_history: List[float] | None = None, accuracy_history: List[float] | None = None)

Bases: object

Results from a single XOR training run.

intersection_method: str

difference_method: str

normalize: bool

feature_count: int

prototype_init: str

feature_init: str

seed: int

final_loss: float

final_accuracy: float

converged: bool

training_time: float

loss_history: List[float] | None = None

accuracy_history: List[float] | None = None

__init__(intersection_method: str, difference_method: str, normalize: bool, feature_count: int, prototype_init: str, feature_init: str, seed: int, final_loss: float, final_accuracy: float, converged: bool, training_time: float, loss_history: List[float] | None = None, accuracy_history: List[float] | None = None) → None

class XORBenchmark(config: XORConfig)

Bases: object

XOR benchmark runner for Tversky Neural Networks.

__init__(config: XORConfig)

run_single_experiment(intersection_method: str, difference_method: str, normalize: bool, feature_count: int, prototype_init: str, feature_init: str, seed: int, track_history: bool = False) → XORResult: Run a single XOR training experiment.

run_benchmark(verbose: bool = True, track_history: bool = False) → List[XORResult]: Run the complete benchmark suite.

analyze_results() → Dict[str, float]: Analyze benchmark results and compute convergence rates.
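As an illustration of what computing convergence rates over a list of results can look like, the sketch below aggregates the converged flag overall and per intersection method. RunRecord is a hypothetical stand-in that carries only a subset of the XORResult fields, and the dictionary keys and method names are assumptions; the keys that analyze_results actually returns are not documented here.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass
class RunRecord:
    """Hypothetical stand-in carrying a subset of the XORResult fields."""

    intersection_method: str
    difference_method: str
    converged: bool


def convergence_rates(results: Iterable[RunRecord]) -> Dict[str, float]:
    """Return the fraction of converged runs, overall and per intersection method."""
    totals: Dict[str, int] = defaultdict(int)
    hits: Dict[str, int] = defaultdict(int)
    for run in results:
        # Each run counts toward the overall rate and toward its own method's rate.
        for key in ("overall", f"intersection_{run.intersection_method}"):
            totals[key] += 1
            hits[key] += int(run.converged)
    return {key: hits[key] / totals[key] for key in totals}


# Placeholder method names; real runs would come from run_benchmark().
runs = [
    RunRecord("method_a", "method_c", True),
    RunRecord("method_a", "method_c", False),
    RunRecord("method_b", "method_c", True),
    RunRecord("method_b", "method_c", True),
]
rates = convergence_rates(runs)
print(rates)
```

Grouping by a different field, such as difference_method or feature_count, only requires changing the key expression.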

run_fast_xor_benchmark(verbose: bool = True) → Tuple[List[XORResult], Dict[str, float]]: Run the fast XOR benchmark for development (96 runs, about 60 seconds).

run_full_xor_replication(verbose: bool = True) → Tuple[List[XORResult], Dict[str, float]]: Run the full paper replication (12,960 runs, about 2.2 hours).
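Both entry points return a (results, analysis) tuple, so a caller typically unpacks it and reports on the two parts. The sketch below shows one way to do that. The format_report helper is hypothetical and relies only on the converged and training_time attributes listed for XORResult; the import path inside main() is inferred from the module name on this page, and main() is not called here because it needs verskyt installed and runs the real benchmark.

```python
from types import SimpleNamespace
from typing import Dict, Sequence


def format_report(results: Sequence, analysis: Dict[str, float]) -> str:
    """Build a short text report from a (results, analysis) pair."""
    failed = [run for run in results if not run.converged]
    # Summed in whatever unit training_time is recorded in.
    total_time = sum(run.training_time for run in results)
    lines = [
        f"runs: {len(results)}",
        f"not converged: {len(failed)}",
        f"total training time: {total_time:.1f}",
    ]
    # The analysis keys are printed as returned; none are assumed here.
    lines += [f"{name}: {value:.3f}" for name, value in sorted(analysis.items())]
    return "\n".join(lines)


def main() -> None:
    # Imported lazily so format_report stays usable without verskyt installed.
    from verskyt.benchmarks.xor_suite import run_fast_xor_benchmark

    results, analysis = run_fast_xor_benchmark(verbose=False)
    print(format_report(results, analysis))


# Stand-in data to exercise the helper; values are made up for illustration.
fake_results = [
    SimpleNamespace(converged=True, training_time=0.5),
    SimpleNamespace(converged=False, training_time=1.0),
]
report = format_report(fake_results, {"example_rate": 0.5})
print(report)
```

Swapping run_fast_xor_benchmark for run_full_xor_replication in main() would reuse the same reporting code, since both share the same return type.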

XOR benchmark suite for validating non-linear learning capabilities.