The most productive thing you can do with a risk model is try to break it.
Not stress-test it. Not scenario-test it. Actively try to construct the conditions under which it produces a confident wrong answer, and then check whether those conditions are plausible. Most validation processes do not do this. They ask whether the model works under normal conditions, whether the outputs are consistent with expectations, and whether the design is technically sound. These are necessary questions. They are not sufficient ones.
The intellectual discipline of testing positions by seeking their falsification rather than their confirmation has a long history outside financial services. Karl Popper built a philosophy of science around it: a claim is only scientific if it can, in principle, be proven wrong. The adversarial tradition in common law is built on a similar presumption: truth is most reliably reached not by presenting the strongest case for a position, but by subjecting that position to the strongest possible challenge and seeing what survives. Science and law both arrived, independently, at the same structural insight. The starting position should be scepticism, and acceptance should require the active defeat of that scepticism.
I call this adversarial empiricism. It is the most useful single intellectual discipline I have found in fifteen years of risk practice, and it is almost systematically absent from the institutional processes it would most improve.
What confirmation bias looks like at institutional scale
Confirmation bias is well-documented at the individual level. The institutional version is less studied and considerably more consequential.
In most institutional review processes, the default orientation is: this model, decision, or framework is probably sound unless we find a specific reason to reject it. The review begins from a position of mild acceptance and looks for reasons to change that position. Evidence that supports the existing view is absorbed naturally. Evidence that challenges it tends to be explained, contextualised, or noted for follow-up at a later date that does not arrive.
This is not a failure of individual judgment. It is a structural feature of how institutional review processes are designed — one that emerged from a reasonable desire for efficiency rather than from any deliberate choice to favour confirmation. The model is submitted for validation. The validation team reviews it. The default outcome, if no specific problem is identified, is approval. The burden of proof sits with rejection, not with acceptance.
The consequence is that institutional confidence in models, frameworks, and decisions tends to outrun the actual evidence for their reliability. Each review that finds no reason to reject something is treated as mild positive evidence. The accumulated weight of these non-rejections creates a picture of soundness that has not actually been tested. When something fails, the institution is surprised. It should not be. The adversarial test was never run.
The falsificationist question
The question that adversarial empiricism requires you to ask is structurally different from the question most reviews begin with.
Instead of: does the evidence support this conclusion? — the question becomes: what would falsify this conclusion, and have I actively looked for it?
In model validation, this means designing tests specifically intended to find the conditions under which the model produces confidently wrong outputs, not just confirming that it produces correct ones under normal conditions. It means identifying the assumptions the model depends on most heavily, and then asking whether those assumptions are actually true rather than simply plausible.
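The search for confident wrong answers can be made concrete. The sketch below is illustrative only: the toy scoring model, the "true outcome" function standing in for observed defaults, and all thresholds are assumptions invented for this example, not a real credit model. The point is the shape of the test, which sweeps a stress grid looking specifically for cases where the model is both confident and wrong, rather than confirming accuracy on typical cases.

```python
# Adversarial validation sketch: search for inputs where a model is
# confident AND wrong, instead of confirming it works on normal cases.
# Everything here (model, outcome function, thresholds) is illustrative.

def default_probability(debt_to_income: float, years_employed: float) -> float:
    """Toy scoring model: assumes risk rises with leverage, falls with tenure."""
    raw = 0.8 * debt_to_income - 0.1 * years_employed
    return max(0.0, min(1.0, raw))

def true_outcome(debt_to_income: float, years_employed: float) -> bool:
    """Stand-in for observed defaults. Deliberately diverges from the model
    at high leverage plus long tenure, a region the model treats as safe."""
    return debt_to_income > 0.6 and years_employed > 5

def confidently_wrong(cases, threshold=0.8):
    """Return cases where the model's confident call contradicts the outcome."""
    failures = []
    for dti, tenure in cases:
        p = default_probability(dti, tenure)
        if p <= 1 - threshold and true_outcome(dti, tenure):
            failures.append((dti, tenure, p))   # confident "safe", actually defaults
        elif p >= threshold and not true_outcome(dti, tenure):
            failures.append((dti, tenure, p))   # confident "default", actually fine
    return failures

# Stress grid built from the failure perspective: sweep the corners of the
# input space, not just the typical region.
grid = [(d / 10, t) for d in range(0, 11) for t in (0, 2, 5, 10, 20)]
failures = confidently_wrong(grid)
```

Run against the grid, the search surfaces the high-leverage, long-tenure corner where the toy model is confidently wrong. The validation question is then whether that corner is plausible in the real portfolio, which is exactly the question a confirmation-oriented review never asks.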
In governance review, it means asking not just whether the framework was followed, but whether the framework would have caught a failure if one had occurred. A governance framework tested only against decisions that turned out correctly is not a tested framework. It is a documented assumption about how the institution would behave under conditions it has not yet faced.
In credit decisions and product risk assessments, it means treating the absence of evidence against a position as weak evidence for it, rather than as confirmation. The absence of a known failure mode is not evidence that no failure mode exists. It may simply mean the right question has not been asked yet.
Why this is harder than it sounds
The falsificationist discipline is genuinely uncomfortable to apply consistently, for reasons that have nothing to do with intellectual capability.
Organisations reward consistency and decisiveness. A risk leader who revises their position frequently, even in response to good evidence, can appear uncertain or unreliable. The institutional incentive structure pushes toward maintaining positions once taken, qualifying challenges rather than absorbing them, and presenting confidence even when the underlying evidence base is thinner than the presentation suggests.
The adversarial empiricist has to operate against this incentive structure. In practice, what this requires is a willingness to treat your most confident conclusions as the ones most worth attacking — not because confidence is wrong, but because untested confidence and tested confidence look identical from the outside, and only one of them is actually reliable. The discipline means genuinely revising a position when a well-constructed challenge holds up under scrutiny, not explaining the challenge away.
This is not the same as being uncertain. It is the opposite of intellectual cowardice. The position that has survived a serious adversarial test is more defensible, not less, than the position that was simply never challenged. An approval from a risk function that applies this discipline is worth more than one that does not, precisely because the approval means the thing actually survived the test.
The structural application
Adversarial empiricism is not a checklist. It is a reorientation of the question that begins every review.
The checklist version asks: have we found any problems? The adversarial version asks: have we tried hard enough to find problems, and are we confident we would have found them if they existed?
The first question is answerable by a standard process. The second requires deliberate design: building review teams that include people whose role is explicitly to challenge rather than to ratify, constructing test scenarios from the perspective of failure rather than success, and treating the failure to find a problem as a reason to look harder rather than a reason to approve.
The institution that builds this into its review processes is not creating friction. It is making a choice about what its approvals actually mean: efficient sign-off, or genuine evidence of soundness. Only one of those is worth the paper it is written on.
I write about governance, risk, and the decisions institutions find hardest to make at asifahmednoor.com. If this is relevant to a problem you are working through, reach me at aan@asifahmednoor.com.