Audit by recomputation: a model-risk reader’s guide to calibration evidence
SR 11-7 effectively asks you to defend a model with evidence a validator can independently check. Calibration is the cleanest evidence there is — here’s what to look for, and what to be skeptical of.
Model-risk management has a structural problem with AI: the thing you’re asked to validate is fluent, confident, and opaque. SR 11-7 doesn’t care how impressive it sounds — it asks for evidence a validator can independently check. Calibration evidence is the cleanest way to give it that. Here’s a practitioner’s read.
What calibration evidence actually is
Strip away the branding and it’s simple: the model’s probabilistic outputs, scored against what actually happened, on a stated horizon, with a proper scoring rule (Brier). It answers the one question a validator most needs answered — when this model expresses confidence, is that confidence earned? — in a number rather than a narrative.
Why recomputation beats a memo
A validation memo is an assertion with a letterhead. The stronger artifact is one your team re-derives: hash-locked inputs, a published method, and a result that reproduces bit-for-bit when you run it yourself. The trust then comes from the arithmetic, not from the author. Audit by recomputation is the whole idea — and it’s strictly more defensible in front of a regulator than audit by trust.
Three honesty rules to insist on
- Anti-issuer-pay. If the rated party funds the rating, the evidence is compromised at the source.
- Error-published. The misses are on the record next to the hits. A clean-looking record with no published errors is a red flag, not a green one.
- Method-hash stable. The scoring rules are fixed and hashed, so they can’t quietly change between runs to flatter a result.
What a Calibration Attestation contains
Concretely: a Merkle-chained evidence file over the scored decisions, plus a cover memo that maps the evidence to SR 11-7 language so it drops into your existing validation workflow. Fixed scope ($2,500–$7,500), so it’s a line item, not an open-ended engagement. The attestation page has the detail.
The caveat we’ll state plainly
Today the signature on the evidence is our own (HMAC-SHA256). That makes it tamper-evident and fully recomputable by you — but it is not, yet, countersigned by an independent third party; that anchor is in progress. We’d rather you know exactly what you’re relying on. For a model-risk reader, the recomputability is the part that does the work anyway: you don’t have to trust the signer if you can re-derive the result.
If it helps to see the philosophy before the paperwork, the attest-versus-assert breakdown is the fastest version, and the public AI Trust Index is the same standard applied to frontier models, in the open.