Model risk · For model-risk teams · 6 min read

Audit by recomputation: a model-risk reader’s guide to calibration evidence

SR 11-7 effectively asks you to defend a model with evidence a validator can independently check. Calibration is the cleanest evidence there is — here’s what to look for, and what to be skeptical of.

By AEQUARA · June 24, 2026

Model-risk management has a structural problem with AI: the thing you’re asked to validate is fluent, confident, and opaque. SR 11-7 doesn’t care how impressive a model sounds; it asks for evidence a validator can independently check. Calibration evidence is the cleanest way to supply that. Here’s a practitioner’s read.

What calibration evidence actually is

Strip away the branding and it’s simple: the model’s probabilistic outputs, scored against what actually happened, on a stated horizon, with a proper scoring rule (here, the Brier score). It answers the one question a validator most needs answered, in a number rather than a narrative: when this model expresses confidence, is that confidence earned?
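For readers who want the arithmetic, here is a minimal sketch of the Brier score for binary outcomes: the mean squared difference between each forecast probability and what happened (1 or 0). The function name and the four-forecast record are illustrative assumptions, not anything taken from an actual attestation.

```python
from typing import Sequence


def brier_score(forecasts: Sequence[float], outcomes: Sequence[int]) -> float:
    """Mean squared error between forecast probabilities and 0/1 outcomes."""
    if len(forecasts) != len(outcomes) or len(forecasts) == 0:
        raise ValueError("forecasts and outcomes must be non-empty and equal length")
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)


# Hypothetical record: four stated probabilities and the resolved outcomes.
forecasts = [0.9, 0.8, 0.3, 0.6]
outcomes = [1, 1, 0, 0]

score = brier_score(forecasts, outcomes)

# Reference point: always answering 0.5 scores exactly 0.25.
coin_flip = brier_score([0.5, 0.5], [1, 0])
```

Lower is better: 0 is a perfect record, and a forecaster who always says 50% lands on 0.25. The hypothetical record above works out to roughly 0.125.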

Why recomputation beats a memo

A validation memo is an assertion with a letterhead. The stronger artifact is one your team re-derives: hash-locked inputs, a published method, and a result that reproduces bit-for-bit when you run it yourself. The trust then comes from the arithmetic, not from the author. Audit by recomputation is the whole idea — and it’s strictly more defensible in front of a regulator than audit by trust.
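As a rough illustration of what re-deriving a result can look like, the sketch below hashes the inputs, compares that digest with a published one, and only then recomputes the score. The record layout (`p`, `outcome`), the canonical JSON encoding, and the function names are all assumptions made for this example; a real evidence file would define its own format.

```python
import hashlib
import json


def input_digest(records):
    """SHA-256 over a canonical JSON encoding of the input records."""
    canonical = json.dumps(records, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def brier(records):
    """Brier score over records shaped like {"p": float, "outcome": 0 or 1}."""
    return sum((r["p"] - r["outcome"]) ** 2 for r in records) / len(records)


def recompute_and_check(records, published_digest, published_score):
    """True only if the inputs match the locked hash AND the score reproduces."""
    if input_digest(records) != published_digest:
        return False  # the inputs are not the ones the result was computed on
    return brier(records) == published_score  # exact equality, not "close enough"


# Hypothetical published artifact.
records = [{"p": 0.9, "outcome": 1}, {"p": 0.3, "outcome": 0}]
published_digest = input_digest(records)
published_score = brier(records)

ok = recompute_and_check(records, published_digest, published_score)

# Changing a single forecast breaks the digest check.
tampered = [{"p": 0.95, "outcome": 1}, {"p": 0.3, "outcome": 0}]
tampered_ok = recompute_and_check(tampered, published_digest, published_score)
```

The exact-equality comparison is deliberate: for a bit-for-bit claim, a result that is merely close should count as a failure to reproduce.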

Three honesty rules to insist on

Anti-issuer-pay. If the rated party funds the rating, the evidence is compromised at the source.
Error-published. The misses are on the record next to the hits. A clean-looking record with no published errors is a red flag, not a green one.
Method-hash stable. The scoring rules are fixed and hashed, so they can’t quietly change between runs to flatter a result.
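The third rule is the easiest to check mechanically. A minimal sketch, assuming the scoring method is described by a small specification and each run records the hash of the spec it used; the field names and run identifiers here are hypothetical.

```python
import hashlib
import json


def method_hash(spec):
    """SHA-256 over a canonical JSON encoding of the scoring-method spec."""
    canonical = json.dumps(spec, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def runs_with_changed_method(runs, pinned_hash):
    """Return the ids of runs whose recorded method hash differs from the pinned one."""
    return [r["run_id"] for r in runs if r["method_hash"] != pinned_hash]


# Hypothetical pinned method: scoring rule and horizon fixed up front.
pinned_spec = {"scoring_rule": "brier", "horizon_days": 30, "version": 1}
pinned_hash = method_hash(pinned_spec)

# Hypothetical run log; the second run quietly shortened the horizon.
runs = [
    {"run_id": "2026-05", "method_hash": method_hash(pinned_spec)},
    {"run_id": "2026-06", "method_hash": method_hash({**pinned_spec, "horizon_days": 7})},
]

drifted = runs_with_changed_method(runs, pinned_hash)
```

Any run that shows up in `drifted` was scored under different rules and is not comparable with the others.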

What a Calibration Attestation contains

Concretely: a Merkle-chained evidence file over the scored decisions, plus a cover memo that maps the evidence to SR 11-7 language so it drops into your existing validation workflow. Fixed scope ($2,500–$7,500), so it’s a line item, not an open-ended engagement. The attestation page has the detail.
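The article does not spell out how the evidence file is chained, so the following is only a sketch of one common construction: a binary Merkle tree over the scored decisions, using SHA-256, with the last node duplicated on odd-sized levels and distinct prefixes for leaves and internal nodes. The decision encoding is a hypothetical placeholder.

```python
import hashlib
from typing import List


def _sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(leaves: List[bytes]) -> str:
    """Hex Merkle root over the leaves (assumed construction, see the text above)."""
    if not leaves:
        raise ValueError("need at least one leaf")
    # Prefix leaves with 0x00 and internal nodes with 0x01 so they cannot be confused.
    level = [_sha256(b"\x00" + leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last node on odd-sized levels
        level = [
            _sha256(b"\x01" + level[i] + level[i + 1])
            for i in range(0, len(level), 2)
        ]
    return level[0].hex()


# Hypothetical scored decisions, serialized one per leaf.
decisions = [
    b'{"id":1,"p":0.9,"outcome":1}',
    b'{"id":2,"p":0.8,"outcome":1}',
    b'{"id":3,"p":0.3,"outcome":0}',
]
root = merkle_root(decisions)

# Editing any single decision changes the root.
edited = [decisions[0], b'{"id":2,"p":0.8,"outcome":0}', decisions[2]]
edited_root = merkle_root(edited)
```

The practical point for a validator is that one short root value commits to every scored decision, so a miss cannot be edited out afterwards without the root changing.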

The caveat we’ll state plainly

Today the signature on the evidence is our own (HMAC-SHA256). That makes it tamper-evident and fully recomputable by you — but it is not, yet, countersigned by an independent third party; that anchor is in progress. We’d rather you know exactly what you’re relying on. For a model-risk reader, the recomputability is the part that does the work anyway: you don’t have to trust the signer if you can re-derive the result.
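For concreteness, this is roughly what checking an HMAC-SHA256 tag looks like with Python's standard library. One property worth keeping in mind: HMAC is a symmetric construction, so checking the tag requires the same key that produced it. How (or whether) that key reaches the validator is not described above; the key and payload below are placeholders.

```python
import hashlib
import hmac


def make_tag(key: bytes, payload: bytes) -> str:
    """HMAC-SHA256 tag over the payload, hex-encoded."""
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def verify_tag(key: bytes, payload: bytes, tag: str) -> bool:
    """Constant-time comparison of the recomputed tag with the supplied one."""
    return hmac.compare_digest(make_tag(key, payload), tag)


# Placeholder key and evidence payload, for illustration only.
key = b"example-shared-key"
evidence = b"merkle_root=<hex root>;brier=0.125"

tag = make_tag(key, evidence)
valid = verify_tag(key, evidence, tag)

# Any change to the payload invalidates the tag.
altered_valid = verify_tag(key, evidence + b";edited", tag)
```

This is also why recomputation carries the weight: the tag shows the file has not changed since it was tagged, while re-deriving the score from the inputs is what shows the result is right.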

If it helps to see the philosophy before the paperwork, the attest-versus-assert breakdown is the fastest version, and the public AI Trust Index is the same standard applied to frontier models, in the open.

