Chapter 1
Collapse Simulation
Run fixed-seed collapse scenarios with tick-level traces and deterministic replay checksums.
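As a rough illustration of what a fixed-seed run with a tick-level trace and a replay checksum might look like, here is a minimal Python sketch. The scenario itself (a `health` value that collapses on a random shock and then recovers), the function names `run_collapse_scenario` and `replay_checksum`, and the 10% collapse probability are all hypothetical; the source does not describe the actual simulator.

```python
import hashlib
import json
import random


def run_collapse_scenario(seed: int, ticks: int = 50) -> list[dict]:
    """Toy scenario (hypothetical): health collapses on a shock, then recovers."""
    rng = random.Random(seed)  # private RNG, so global random state cannot leak in
    health = 1.0
    trace = []
    for tick in range(ticks):
        shock = rng.random()
        if shock < 0.1:
            health = 0.0  # assumed collapse event
        else:
            health = min(1.0, health + 0.05)  # assumed gradual recovery
        # One record per tick; rounding keeps the serialized form stable.
        trace.append({"tick": tick, "shock": round(shock, 6), "health": round(health, 6)})
    return trace


def replay_checksum(trace: list[dict]) -> str:
    """SHA-256 over a canonical JSON encoding of the tick-level trace."""
    canonical = json.dumps(trace, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    first = replay_checksum(run_collapse_scenario(seed=1234))
    replay = replay_checksum(run_collapse_scenario(seed=1234))
    print("replay matches:", first == replay)
```

Hashing a canonical serialization (sorted keys, fixed separators) rather than the in-memory objects is one way to make the checksum comparable across processes and machines.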
Leaderboard rank alone is not enough. A model must pass the collapse recovery, decision integrity, fallback reliability, pass-rate, and evidence-depth gates simultaneously.
Chapter 2
Hard Gate Scoring
Score regen, collapse recovery, decision integrity, fallback rate, and pass-rate against strict thresholds.
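A hard gate differs from a weighted score in that a single failed threshold forces a NO, however strong the other metrics are. The sketch below shows one way that logic could be written. The threshold values, the metric key names, and the assumption that fallback rate is "lower is better" are placeholders for illustration, not the project's real configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gate:
    name: str
    threshold: float
    higher_is_better: bool = True

    def passes(self, value: float) -> bool:
        if self.higher_is_better:
            return value >= self.threshold
        return value <= self.threshold


# Placeholder thresholds (assumed, not taken from the source).
GATES = [
    Gate("regen", 0.80),
    Gate("collapse_recovery", 0.90),
    Gate("decision_integrity", 0.95),
    Gate("fallback_rate", 0.05, higher_is_better=False),  # assumed: lower is better
    Gate("pass_rate", 0.85),
]


def evaluate_gates(metrics: dict[str, float]) -> dict:
    """Return YES only if every gate passes; a missing metric counts as a failure."""
    failed = [
        gate.name
        for gate in GATES
        if gate.name not in metrics or not gate.passes(metrics[gate.name])
    ]
    return {"verdict": "NO" if failed else "YES", "failed_gates": failed}


if __name__ == "__main__":
    print(evaluate_gates({
        "regen": 0.91,
        "collapse_recovery": 0.88,  # below the assumed 0.90 threshold
        "decision_integrity": 0.97,
        "fallback_rate": 0.02,
        "pass_rate": 0.93,
    }))
```

Returning the list of failed gates alongside the verdict keeps the reason for a NO inspectable rather than hiding it behind one aggregate number.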
Chapter 3
Reliability Checks
Run pairwise swap checks, disagreement instability diagnostics, and anti-template filters.
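A pairwise swap check is commonly understood as presenting the same two candidates to a judge in both orders and confirming that the preferred candidate does not change with position. The sketch below follows that reading. The `Judge` signature, the helper names, and the two toy judges are hypothetical, and the source does not say how its own checks are implemented.

```python
from typing import Callable

# Hypothetical judge: returns 0 if the first argument wins, 1 if the second wins.
Judge = Callable[[str, str], int]


def swap_consistent(judge: Judge, a: str, b: str) -> bool:
    """True if the judge picks the same candidate in both presentation orders."""
    winner_original = a if judge(a, b) == 0 else b
    winner_swapped = b if judge(b, a) == 0 else a
    return winner_original == winner_swapped


def instability_rate(judge: Judge, pairs: list[tuple[str, str]]) -> float:
    """Fraction of pairs whose winner flips when the order is swapped."""
    if not pairs:
        return 0.0
    flips = sum(1 for a, b in pairs if not swap_consistent(judge, a, b))
    return flips / len(pairs)


def always_first(x: str, y: str) -> int:
    """Toy judge with pure position bias: the first slot always wins."""
    return 0


def longer_wins(x: str, y: str) -> int:
    """Toy judge that depends only on content (length), not on position."""
    return 0 if len(x) >= len(y) else 1


if __name__ == "__main__":
    pairs = [("a detailed answer", "short"), ("ok", "a much longer reply")]
    print("position-biased judge:", instability_rate(always_first, pairs))
    print("content-based judge:", instability_rate(longer_wins, pairs))
```

A judge with a high flip rate is responding to slot position rather than content, which is the kind of instability such a check is meant to surface.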
Chapter 4
Public Verdict
Publish leaderboard plus `/api/restart-verdict` so anyone can inspect failed gates behind a NO.