Underwriting the Out-of-Distribution

Every actuarial method in common use assumes the underlying risk distribution is stationary. The shape of the curve, the parameters that fit it, the tail behavior — all of it presumes that next quarter's exposure will be drawn from the same generative process as last quarter's. This assumption is so foundational that it goes unstated in the textbooks. It is also wrong for every autonomous risk we underwrite.

A model deployed in January is not the same model in April. It has been fine-tuned, prompt-rewritten, given new tools, exposed to new input regimes, and patched in response to incidents that never reached a public docket. The risk surface drifts. Sometimes the drift is announced in a release note. More often it is silent — an emergent property of training-data refresh on a vendor base model the policyholder does not control. Chain-ladder methods cannot see this. Bornhuetter-Ferguson cannot see this. Frequency-severity decomposition cannot see this. They are answering the wrong question.

This memo lays out why the wrong question keeps getting asked, the three specific places prior-art methods break, and the three instruments Castra conditions coverage on instead. It is a working note for underwriters and chief risk officers who already know the actuarial vocabulary and want to understand what changes when the loss distribution refuses to stay put.

Part I · The Stationarity Assumption

Stationarity is the polite word for an extraordinary leap of faith. It says: the process that produced last year's losses is the same process that will produce next year's. Parameters may shift, but the family of distributions does not. Triangles develop. Tails behave. Chain-ladder works because age-to-age link ratios are estimable from history and stable across accident years. Bornhuetter-Ferguson works because the a-priori expected loss ratio can be anchored to that same history. A reserve is a fact about the future, conditioned on the past being a fair sample.

For most lines, the assumption is harmless. Auto frequency drifts with miles driven and the deployment of telematics — slow enough that a five-year reserving horizon absorbs the change. Workers' comp severity drifts with medical inflation — tracked quarterly, indexed for. Even cyber, which surprised the industry in 2017, settled into a tractable curve once attack vectors and ransom dynamics stabilized.

Autonomy refuses to settle. The systems we underwrite are upgraded continuously, deployed across input regimes that did not exist when training stopped, and composed with one another in ways the model vendor cannot predict. A retrieval-augmented generation agent that was safe at bind, because it never saw confidential client data, becomes dangerous on Tuesday when a CRM integration silently widens its retrieval scope to include that data. The actuarial triangle developing on the policy is a record of a system that no longer exists.

Fig. 01 · Drift between true risk and prior-art reserve (series shown: true risk, prior-art reserve)

Schematic. The true risk surface drifts with model upgrades, integration scope, and input-regime expansion. A reserve anchored at bind develops along an inert link ratio. The gap is the under-reserved exposure carried to renewal.

Part II · Three Failure Modes of Prior Art

Prior-art methods fail in three specific places. Each failure is recoverable in principle if you concede that you are pricing a process that moves. None is recoverable while you still pretend the process holds still.

Failure I — Triangles develop against a system that no longer exists

The first failure is the most embarrassing. A claim incurred in Q1 develops on a triangle whose tail factors were calibrated against a portfolio of model versions, integrations, and prompt scaffolds that no longer represent the insured risk by Q3. The loss develops; the system that caused it has been deprecated. The reserve does not adapt because the actuarial cell was defined by an accident year, not by a model generation.

Castra carries reserves on generation cohorts: every bound risk is tagged with the model identity, the system-prompt hash, the integration manifest, and the eval suite revision in force at the moment of bind. When any of those four change materially, a new cohort opens. We do not develop a Q1-2026 triangle — we develop a triangle scoped to a specific configuration of insured operations. When the configuration retires, the triangle closes; the next cohort starts from a calibrated prior. This is more bookkeeping. It is the only honest bookkeeping.
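
The cohort tagging described above can be sketched in a few lines. This is a minimal illustration, not Castra's protocol: the class name `CohortKey`, the helper names, the use of SHA-256, and the rule that any difference counts as "material" are all assumptions made for the example.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class CohortKey:
    """The four bind-time attributes named in the text (field names are hypothetical)."""
    model_id: str
    system_prompt_hash: str
    integration_manifest_hash: str
    eval_suite_revision: str


def sha256_hex(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def cohort_key(model_id, system_prompt, integrations, eval_suite_revision):
    # Sort the manifest so that merely reordering integrations does not open a new cohort.
    manifest = json.dumps(sorted(integrations))
    return CohortKey(
        model_id=model_id,
        system_prompt_hash=sha256_hex(system_prompt),
        integration_manifest_hash=sha256_hex(manifest),
        eval_suite_revision=eval_suite_revision,
    )


def opens_new_cohort(bound: CohortKey, observed: CohortKey) -> bool:
    # Assumption: any difference is treated as material. A production
    # materiality test would presumably be looser than strict inequality.
    return bound != observed


# Hypothetical configuration at bind, then after a CRM integration is added.
at_bind = cohort_key("vendor-model-v1", "You are a support agent.", ["email", "search"], "evals-r7")
after_crm = cohort_key("vendor-model-v1", "You are a support agent.", ["email", "search", "crm"], "evals-r7")
reordered = cohort_key("vendor-model-v1", "You are a support agent.", ["search", "email"], "evals-r7")
```

Here `opens_new_cohort(at_bind, after_crm)` is true because the integration manifest changed, while `reordered` stays in the original cohort. Hashing the prompt and manifest keeps the key small and comparable without storing the policyholder's raw configuration in the reserving system.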

Failure II — Frequency-severity decomposition assumes claim independence

Decomposing expected loss into frequency multiplied by severity is the actuary's first tool. It presumes claims are independent draws from two stable distributions. For autonomy, neither half holds.

Frequency is not stable: a single upstream model upgrade can shift a portfolio's incident rate by an order of magnitude in either direction overnight. Severity is not independent: when a base model exhibits a drift event (an unintended capability gain, a regression in safety behavior, a latent jailbreak vector exposed by a new evaluation), every policyholder running on that base model is exposed simultaneously. Tens of policies fire at once, on correlated losses, against the same underlying cause. The actuarial decomposition treats these as independent attritional events. They are not attritional. They are catastrophic.
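
To see why the independence assumption matters, consider a deliberately simplified model: every policy either suffers one loss of fixed size or none. The function name, the forty-policy book, the 5% event probability, and the severity figure below are illustrative assumptions, not portfolio data.

```python
import math


def aggregate_stats(n_policies, p_event, severity, common_shock):
    """Mean and standard deviation of aggregate loss across a book.

    Each policy loses `severity` with probability `p_event`. With
    `common_shock=True`, a single shared draw decides whether every
    policy loses at once; otherwise the draws are independent.
    """
    mean = n_policies * p_event * severity
    bernoulli_var = p_event * (1.0 - p_event)
    if common_shock:
        # One draw scaled by the whole book: Var = (n * s)^2 * p(1 - p)
        variance = (n_policies * severity) ** 2 * bernoulli_var
    else:
        # Sum of n independent draws: Var = n * s^2 * p(1 - p)
        variance = n_policies * severity ** 2 * bernoulli_var
    return mean, math.sqrt(variance)


indep_mean, indep_sd = aggregate_stats(40, 0.05, 1_000_000, common_shock=False)
shock_mean, shock_sd = aggregate_stats(40, 0.05, 1_000_000, common_shock=True)
```

The expected loss is identical in both cases, so a frequency-times-severity estimate cannot tell them apart. The standard deviation under the common shock is larger by a factor of the square root of the policy count, which is where the attritional view and the catastrophe view diverge.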

We track frequency and severity at the model-vendor level, then aggregate to the policyholder. A drift event at a major base-model provider becomes a single named cat for Castra's portfolio reserving — treated the way a hurricane is treated by a Florida homeowner book. The mistake is to amortize the same event across forty E&O policies as if each were a wind-blown shingle. The right unit of risk is not the policy. It is the model.
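
Treating the base model as the unit of risk amounts to a grouping step before any reserving is done. The following is a rough sketch under assumed field names (`policy_id`, `base_model`, `limit`) and an invented three-policy book; it shows only the aggregation, not a cat pricing method.

```python
from collections import defaultdict


def exposure_by_base_model(policies):
    """Sum policy limits per underlying base model."""
    totals = defaultdict(float)
    for policy in policies:
        totals[policy["base_model"]] += policy["limit"]
    return dict(totals)


def named_cat_exposure(policies, drifted_model):
    """Gross limit exposed to one drift event on a single base model."""
    return exposure_by_base_model(policies).get(drifted_model, 0.0)


# Hypothetical book: two policyholders share a base model, one does not.
book = [
    {"policy_id": "P-001", "base_model": "vendor-a/base-1", "limit": 2_000_000},
    {"policy_id": "P-002", "base_model": "vendor-a/base-1", "limit": 5_000_000},
    {"policy_id": "P-003", "base_model": "vendor-b/base-9", "limit": 1_000_000},
]
```

A drift event on `vendor-a/base-1` exposes the combined limit of P-001 and P-002 as one event, rather than two unrelated claims. Limits are used here as a crude stand-in for exposure; a real cat view would apply a damage function to each policy.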

Failure III — Bornhuetter-Ferguson rests on an a-priori loss ratio that cannot be priced

Bornhuetter-Ferguson is a credible compromise: it averages developed losses against an a-priori expected loss ratio, weighted by how much of the curve has emerged. The method works when the a-priori is anchored to something — industry benchmark, prior accident year, treaty experience. For autonomous risk, the a-priori is fiction. There is no five-year industry curve. There is no prior-year experience for a use case that did not exist eighteen months ago. The reserving model is asked to interpolate from a benchmark that does not exist.
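
For readers who want the dependence on the a-priori made explicit, the standard Bornhuetter-Ferguson ultimate can be written as reported losses plus the unreported share of the a-priori expected losses. The sketch below uses that textbook form; the function and parameter names and the figures are illustrative assumptions.

```python
def bornhuetter_ferguson_ultimate(reported_loss, earned_premium,
                                  a_priori_loss_ratio, pct_reported):
    """Standard BF ultimate: reported + (1 - pct_reported) * a-priori expected loss."""
    if not 0.0 < pct_reported <= 1.0:
        raise ValueError("pct_reported must be in (0, 1]")
    expected_ultimate = earned_premium * a_priori_loss_ratio
    # The unreported portion comes from the a-priori, not from development.
    return reported_loss + (1.0 - pct_reported) * expected_ultimate


def credibility_blend(reported_loss, earned_premium,
                      a_priori_loss_ratio, pct_reported):
    """Equivalent form: a weighted average of the chain-ladder ultimate
    and the a-priori ultimate, with weight pct_reported on chain-ladder."""
    chain_ladder_ultimate = reported_loss / pct_reported
    expected_ultimate = earned_premium * a_priori_loss_ratio
    return (pct_reported * chain_ladder_ultimate
            + (1.0 - pct_reported) * expected_ultimate)


# Illustrative inputs: 40% of losses reported, a-priori loss ratio of 60%.
base = bornhuetter_ferguson_ultimate(300_000, 1_000_000, 0.60, 0.40)
# The same claim data with a different a-priori assumption.
stressed = bornhuetter_ferguson_ultimate(300_000, 1_000_000, 0.90, 0.40)
```

With 60% of the curve still unemerged, moving the a-priori from 0.60 to 0.90 moves the ultimate by 180,000 on unchanged claim data. Early in development the answer is mostly the a-priori, which is why an unanchored a-priori is fatal to the method.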

The honest substitute is to anchor the a-priori to evaluation outcomes, not industry experience. A model that passes a specified eval suite at a specified percentile, on a specified set of capability benchmarks, gives the policy a quantifiable conditional probability of loss in each severity band. The eval becomes the actuarial credibility weight. This is non-trivial, since it requires the underwriter to be conversant in capability evaluation and not just questionnaire underwriting, but it is bindable. We have written it. It clears.

Part III · What We Substitute

Once you concede the underlying process drifts, the underwriting question changes. You are no longer pricing a static risk for a one-year period. You are pricing the rate at which the risk can change, the credibility of the signals that flag the change, and the speed at which the policy can re-rate or be cancelled when the change is adverse. Three instruments do most of the work.

Drift-priced premium. The bind premium is not the only premium. Each policy carries a quarterly re-rate trigger conditioned on observed drift in the insured model: capability evaluations, agent behavioral telemetry, integration-scope changes, and incident reports against the policyholder's deployed system. When drift exceeds a banded threshold, the policy re-rates within the next reserve period. The policyholder consents to this at bind; the alternative is a flat-rated policy that gets cancelled mid-term.
Evaluation-anchored attachment. Coverage attaches at a severity threshold defined in eval terms, not dollar terms. A model performance warranty does not pay out whenever accuracy drops; it pays out when accuracy drops below a contractually specified benchmark percentile, measured against an audited eval suite that the policyholder runs continuously and Castra audits twice a year. Attachment is observable, contestable, and bindable.
Portfolio-level correlation reserving. Reserves are not held only at the policy level. Castra carries a portfolio reserve specifically against base-model correlation risk — the scenario in which a single drift event fires against multiple policyholders on the same vendor model. This reserve is set by a cat methodology, not by attritional triangulation, and is the line item that makes the treaty math defensible to our reinsurance partners.
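
The first two instruments reduce to simple, checkable rules once the thresholds are written into the policy. The sketch below is a hedged illustration only: the drift-score scale, the band edges, the multipliers, the benchmark scores, and the percentile-rank convention are all hypothetical, not terms from any Castra form.

```python
from bisect import bisect_left

# Hypothetical drift bands on an assumed 0..1 drift score.
DRIFT_BAND_EDGES = [0.10, 0.25]
# One premium multiplier per band (illustrative values only).
RERATE_MULTIPLIERS = [1.00, 1.25, 1.75]


def rerate_multiplier(drift_score):
    """Premium multiplier for the band the drift score falls in.

    bisect_left means a score must strictly exceed an edge to move up a band.
    """
    band = bisect_left(DRIFT_BAND_EDGES, drift_score)
    return RERATE_MULTIPLIERS[band]


def percentile_rank(score, benchmark_scores):
    """Percentage of benchmark scores strictly below `score`."""
    below = sum(1 for s in benchmark_scores if s < score)
    return 100.0 * below / len(benchmark_scores)


def coverage_attaches(score, benchmark_scores, contract_percentile):
    """True when the observed score ranks below the contractual percentile."""
    return percentile_rank(score, benchmark_scores) < contract_percentile


# Hypothetical audited benchmark distribution for the eval suite.
benchmark = [0.70, 0.75, 0.80, 0.85, 0.90]
```

With these assumed values, a drift score of 0.05 leaves the premium unchanged and 0.30 lands in the top band. An observed accuracy of 0.72 ranks at the 20th percentile of the benchmark and would attach under a 40th-percentile clause, while 0.88 would not. Both rules depend only on quantities that can be audited after the fact, which is the property the text asks of an attachment point.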

None of these three are exotic. Each one is a familiar pattern from another specialty line: drift pricing exists in marine cargo, evaluation-anchored attachment exists in clinical trials liability, portfolio-correlation reserving exists in earthquake. The novelty is the combination, and the discipline required to operate all three concurrently for a single bound risk. The discipline is the product.

Coda · What This Costs the Insured

The shape of an autonomy policy from Castra looks different from a legacy tower. Premium is not a single number; it is a band with a re-rate clause. Attachment is not a dollar figure; it is an evaluation outcome. The policyholder is required to maintain telemetry hooks the underwriter audits. None of this is friction for the sake of friction — each line of the policy answers a specific failure mode of the conventional form.

What the insured gets in exchange is coverage that responds when the loss happens, instead of coverage that is litigated into denial. The cyber tower will not pay. The E&O tower will not pay. The general liability tower will not pay. We will pay. The price of payment is admitting, in writing, that the risk being insured is one whose underlying distribution moves — and pricing it as such.

That admission is the underwriting product. It is also, increasingly, what differentiates a specialty MGA worth working with from a managing general agency that has put a coat of paint on a 2018 policy form. We are not selling certainty. We are selling a structure in which uncertainty is priced honestly and re-priced quarterly. For risks that move, this is the only structure that pays.

Methodology & Notes

Drift schematic (Fig. 01) is illustrative and not calibrated to any specific portfolio. Cohort tagging methodology is derived from Castra's internal underwriting protocol; redacted detail is available under NDA for treaty counterparties. Where this memo references actuarial methods (chain-ladder, Bornhuetter-Ferguson, frequency-severity decomposition) it assumes the reader is familiar with their standard formulations; introductory treatment is outside scope.

This memo is provided for informational purposes only and does not constitute legal or actuarial advice, an offer to insure, or a binder of coverage. Coverage availability and policy terms vary by jurisdiction and risk. Direct inquiries to underwriting@castrarisk.ai.

Your risk is a moving target.
So is our premium.

More from The Castra Quarterly.

Anatomy of an Autonomous Loss

The EU AI Act and Your Insurance Tower

Modeling Correlated Autonomy Failure
