Air-gapped document AI in 2026

The question moved to the front of the RFP

The sovereignty conversation has been on enterprise slides for at least three years. What changed in 2026 was the place the conversation appears in the procurement document. A regulated buyer running a request for proposal — a central bank, a defence prime, a national health authority, a tax administration, a transmission operator — used to ask the deployment question at the back of the technical annex, after the feature comparison and the integration matrix. The version we now see on our desk puts three questions at the front, before the feature list:

Can your system run 100% on-prem or air-gapped, with no requirement for the document or the inference call to leave our jurisdiction or our perimeter?
How does the model upgrade when the network is off? What is the procedure, what artefacts arrive, who signs them, and what is the rollback if the new model regresses on our corpus?
Which auditable artefacts ship in the box with the deployment? We expect cryptographic provenance for the model weights, signed manifests for the prompts and policies, a structured trace for every decision, and an export our auditor can read without your help.

The order is the news. The buyer in 2025 ranked vendors by feature breadth and then asked the deployment question to rule out the obvious failures. The buyer in 2026 ranks vendors by deployment posture and then asks the feature question among the survivors. The vendors who built for SaaS-only delivery and treated on-prem as a roadmap item spent the second half of 2025 reassuring themselves that regulated buyers would eventually accept the cloud answer. They did not. The RFP shape that crossed our desk in the first half of 2026 simply removes those vendors from the evaluation before the demo slot.

The deployment-side detail of the sovereignty contract — residency clauses, BYOK, jurisdiction-bound compute and the 30–60% premium — sits in the sovereign-AI piece. This post is the operational layer underneath: how a vendor actually builds for the air-gapped deployment shape, what changes in the release process, and what the buyer's auditor will read six months later.

Why the question moved — three forces, not one

Air-gapped is rarely about one regulation or one geopolitical event. It moved because three forces compounded inside the same 12 months.

Regulation tightened the residency wording

Multiple jurisdictions published rules in the first half of 2026 that converted "personal data should remain in jurisdiction" from a recommendation to a prescriptive control with a named penalty. The wording matters more than the headline: the new rules name the inference call, not only the training corpus, as a data-processing activity that falls under residency. A buyer who routes documents to an LLM endpoint outside the jurisdiction is now in scope for the residency rule even when the model itself was trained elsewhere. The shortest path to compliance is a deployment where the document never leaves the perimeter.

Geopolitics raised the cost of cross-border dependency

The same 12 months produced enough sanction lists, export controls, pre-release access regimes and antitrust filings to make any cross-border AI dependency a continuous risk, not a one-time review. A regulator that opens a structural proceeding against a frontier provider — covered in the frontier-provider due-diligence piece — does not give the downstream buyer 18 months of notice. The buyer who cannot operate the system without that provider's API in the loop is the buyer reading the remedy document with a deprecation timer on their largest workload. Air-gapped is the architectural insurance against that exposure.

The compute side made on-prem economically viable again

The third force is the quietest. Open-weight models reached production quality inside a tractable footprint (single-rack inference, 8-GPU servers, edge appliances) at the same time that frontier API prices began the post-IPO unit-pricing trajectory described in the compute-moat piece. The crossover does not need to be exact for the decision to flip: a buyer with a steady-state workload above a few million pages a month who runs an open-weight model on a small cluster in their own datacentre now matches or beats the cloud API on unit cost, with a residency answer that does not require a clause. The CFO conversation stops being "cloud is cheaper" and starts being "what is the landed cost across the contract life, including the sovereignty premium and the audit cost we no longer have to carry."

The frontier model is rented. The deployment is owned. The buyer who confused the two is the buyer whose compliance officer asks, six months from now, where the documents went.

The dual release cadence regulated procurement now expects

The hardest single thing for a SaaS-native vendor to absorb about the air-gapped shape is the release cadence. SaaS shipped weekly, sometimes daily; the air-gapped deployment cannot. The reality that emerged in 2026, after several public-sector procurement cycles, is a dual release cadence — two release trains running in parallel, with a defined gap between them, and a contract that names both.

Release train	What it covers	Cadence	Gap behind SaaS
SaaS / managed	Hosted tenants on the vendor's cloud; features land first, telemetry is rich, the path to ship is short.	Weekly minor, monthly major.	n/a
On-prem / air-gapped	Tenant-installed instance; offline model bundle, signed manifests, structured release notes, regression report against a reference corpus.	Quarterly major, hotfixes every 4–6 weeks.	60–120 days.

The 60–120 day gap is not laziness. It is the time the vendor needs to take a feature that passed the SaaS evaluation set and run it against the additional gates the air-gapped buyer paid for: a reference-corpus regression report from a buyer's-data-shape evaluation set, an archived model card, a signed bundle of weights and prompts with a cryptographic manifest, an installer rehearsal in a reference rack, and a structured release note an auditor can read without the vendor in the room. The buyer who signs an air-gapped contract without naming the gap in the SLA is the buyer who discovers, six months later, that "feature parity" is a phrase the vendor uses differently in two conversations.

Three operational moves let the gap stay civil:

Same trunk, two release artefacts. The SaaS train and the on-prem train build from the same source tree; the difference is the artefact at the end. A vendor who runs two trunks builds two products and ships neither well. The on-prem train fans out from the SaaS commit when the bundle is cut, with the additional gates applied between the cut and the signed bundle.

Feature flags on the SaaS side, version pinning on the on-prem side. SaaS tenants can ride a feature flag; an air-gapped tenant cannot — the network they would phone home through is not there. The on-prem behaviour of a feature is decided by the version pinned in the deployed bundle. Behaviour drift between SaaS and on-prem is the most common cause of an air-gapped audit finding; the discipline that prevents it is "if it is configurable on SaaS, it is signed on the bundle on-prem".

A reference corpus the buyer can replicate. Every air-gapped release ships with a regression report against a reference corpus the buyer can re-run inside their own perimeter — same prompts, same evaluation set, same scoring function. The buyer's auditor does not have to trust the vendor's number; they re-run the report. The cost of building the reference corpus is modest; the cost of not having one shows up in renewal season when the buyer asks for evidence the new model did not regress on their casework and the vendor does not have a number on the buyer's own data.
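The re-runnable regression report can be pictured as a plain comparison of per-metric scores between the previous bundle and the candidate. The sketch below is illustrative only: the metric names, the `max_drop` tolerance and the function name are assumptions, not the format of any shipped report.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetricDelta:
    metric: str
    previous: float
    candidate: float

    @property
    def delta(self) -> float:
        return self.candidate - self.previous


def compare_bundles(
    previous: dict[str, float],
    candidate: dict[str, float],
    max_drop: dict[str, float],
) -> tuple[list[MetricDelta], list[str]]:
    """Return per-metric deltas plus the metrics that regressed.

    A metric regresses when it drops by more than its tolerance in
    `max_drop` (default tolerance 0.0) or is missing from the candidate.
    """
    deltas: list[MetricDelta] = []
    regressions: list[str] = []
    for metric in sorted(previous):
        if metric not in candidate:
            regressions.append(metric)  # a missing score is treated as a failure
            continue
        entry = MetricDelta(metric, previous[metric], candidate[metric])
        deltas.append(entry)
        if entry.delta < -max_drop.get(metric, 0.0):
            regressions.append(metric)
    return deltas, regressions


if __name__ == "__main__":
    # Hypothetical scores on a reference corpus.
    prev = {"field_f1": 0.94, "doc_accuracy": 0.97}
    cand = {"field_f1": 0.95, "doc_accuracy": 0.90}
    _, failed = compare_bundles(prev, cand, {"doc_accuracy": 0.01})
    print("regressed metrics:", failed)
```

Because the comparison takes only two score tables and a tolerance table, the buyer can run the same function inside the perimeter on scores they produced themselves.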

How a model upgrade actually happens without the internet

The single procurement question that exposes the SaaS-native vendor's gap is the upgrade procedure. "How do you patch a model in our datacentre when the network is off?" The answer is either an architecture or a sentence; the buyer can tell which they got within two minutes.

The shape that ships looks like this:

1. The bundle is the unit of release

A model upgrade is not a binary delta or a weight diff the buyer applies in place. It is a self-contained bundle: the model weights, the tokeniser, the prompts, the policy files, the evaluation harness, the regression report, the installer script, the changelog, the model card, and the cryptographic manifest that signs the whole. The bundle sits on physical media or on a one-way ingress lane the buyer's security team approves. The buyer mounts the bundle, the installer verifies the signature against the vendor's offline-pinned public key, and the rest of the procedure runs on the buyer's compute, never reaching the vendor.
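As a rough illustration of the integrity half of that check, the sketch below recomputes SHA-256 digests for every file in a mounted bundle and compares them with a manifest. The file name `manifest.json` and the manifest layout are assumptions made for this example. Verifying the signature over the manifest against the offline-pinned public key is a separate step that is assumed to have happened already and is not shown.

```python
import hashlib
import json
from pathlib import Path

MANIFEST_NAME = "manifest.json"  # hypothetical file name, not a vendor format


def sha256_file(path: Path) -> str:
    """Stream a file through SHA-256 so large weight files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(bundle_dir: Path) -> dict[str, str]:
    """Map each file's relative path to its digest, skipping the manifest itself."""
    manifest_path = bundle_dir / MANIFEST_NAME
    return {
        path.relative_to(bundle_dir).as_posix(): sha256_file(path)
        for path in sorted(bundle_dir.rglob("*"))
        if path.is_file() and path != manifest_path
    }


def verify_manifest(bundle_dir: Path) -> list[str]:
    """Return a list of problems; an empty list means the bundle matches its manifest."""
    expected = json.loads((bundle_dir / MANIFEST_NAME).read_text())
    actual = build_manifest(bundle_dir)
    problems: list[str] = []
    for name in sorted(set(expected) | set(actual)):
        if name not in actual:
            problems.append(f"missing: {name}")
        elif name not in expected:
            problems.append(f"unlisted: {name}")
        elif expected[name] != actual[name]:
            problems.append(f"modified: {name}")
    return problems
```

Everything here runs on local files, which matches the constraint that the procedure never reaches the vendor.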

2. The registry on-prem is the source of truth

Inside the buyer's perimeter the vendor's appliance runs a small model registry — a local catalogue of every bundle the appliance has ever loaded, with timestamps, signatures, regression scores and the operator who promoted each one. The registry is the artefact that survives the upgrade. A rollback is a single command against the registry; a forensic question six months later — "what model produced this decision on this document on this date" — is one lookup against the same registry. The vendor that ships an appliance without a registry ships a deployment that cannot be audited at the granularity the regulated buyer will eventually need.
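A minimal sketch of such a registry, assuming an append-only promotion history held in memory, might look like the following. The class and field names are hypothetical, and a real appliance would persist the history and store signatures and regression scores alongside each entry.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class Promotion:
    bundle_id: str
    promoted_at: datetime
    operator: str


class Registry:
    """Append-only history of which bundle was made active, when, and by whom."""

    def __init__(self) -> None:
        self._history: list[Promotion] = []

    def promote(self, bundle_id: str, operator: str, at: datetime) -> Promotion:
        if self._history and at < self._history[-1].promoted_at:
            raise ValueError("promotions must be recorded in time order")
        entry = Promotion(bundle_id, at, operator)
        self._history.append(entry)
        return entry

    def active(self) -> Optional[Promotion]:
        return self._history[-1] if self._history else None

    def rollback(self, operator: str, at: datetime) -> Promotion:
        """Re-promote the bundle that was active immediately before the current one.

        Nothing is deleted: the rollback is itself a new history entry.
        """
        if len(self._history) < 2:
            raise ValueError("no earlier bundle to roll back to")
        return self.promote(self._history[-2].bundle_id, operator, at)

    def bundle_at(self, when: datetime) -> Optional[str]:
        """Answer 'which bundle was active at this moment' with one lookup."""
        for entry in reversed(self._history):
            if entry.promoted_at <= when:
                return entry.bundle_id
        return None

    def history(self) -> list[Promotion]:
        return list(self._history)
```

Keeping rollbacks as new entries rather than deletions is a design choice of this sketch: it means the forensic lookup still returns the bundle that was actually serving decisions during the window before the rollback.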

3. The promotion is a four-step gate, not a button

A new bundle does not become the default the moment the buyer mounts it. It becomes the default after a four-step gate the appliance enforces: the bundle signature must verify; the regression report must clear the buyer's named thresholds on the buyer's reference corpus; the operator promoting the bundle must hold the role; and the previous bundle must remain installed for a defined rollback window. Each step writes an event the auditor will read. The appliance refuses to promote a bundle that fails any one, and the refusal is itself an audit event.
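The four checks can be sketched as a single function that evaluates each step, records an event per step, and records the final decision whether or not the bundle is promoted. The role name `bundle-promoter`, the event fields and the request shape are assumptions for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PromotionRequest:
    bundle_id: str
    signature_valid: bool                # result of the signature check
    regression_scores: dict[str, float]  # scores on the buyer's reference corpus
    operator_roles: frozenset[str]       # roles held by the promoting operator
    previous_bundle_installed: bool      # rollback target still present


def evaluate_gate(
    request: PromotionRequest,
    thresholds: dict[str, float],
    required_role: str = "bundle-promoter",  # hypothetical role name
) -> tuple[bool, list[dict]]:
    """Run all four checks and return (promoted, audit_events).

    Every check is evaluated even after a failure, so the audit trail
    shows the full picture rather than only the first problem.
    """
    regression_ok = all(
        request.regression_scores.get(metric, float("-inf")) >= minimum
        for metric, minimum in thresholds.items()
    )
    checks = [
        ("signature", request.signature_valid),
        ("regression", regression_ok),
        ("operator_role", required_role in request.operator_roles),
        ("rollback_window", request.previous_bundle_installed),
    ]
    events = [
        {"bundle": request.bundle_id, "step": name, "passed": passed}
        for name, passed in checks
    ]
    promoted = all(passed for _, passed in checks)
    # The final decision is an event too, so a refusal is visible to the auditor.
    events.append({"bundle": request.bundle_id, "step": "promotion", "passed": promoted})
    return promoted, events
```

A metric that is missing from the regression scores is treated as failing its threshold, which is the conservative reading of "must clear the buyer's named thresholds".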

4. Telemetry that does not phone home

The hardest engineering discipline in an air-gapped appliance is the telemetry boundary. The temptation to beacon "anonymous diagnostics" home is large; the cost of getting caught is the contract. The shape that passes diligence is the opposite: the appliance produces a structured telemetry archive — usage, errors, latency, per-tenant cost, model performance — that the buyer exports on a cadence the buyer chooses, with a one-way boundary the buyer's network team controls. The vendor gets the archive only if the buyer hands it over. Most regulated buyers will hand it over once the appliance has been in production long enough for the trust to compound; none will hand it over while the appliance is also calling home in the background. The tracing piece covers the structure of the trace the archive carries; the air-gapped delta is the boundary, not the content.
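One way to picture the boundary is an export routine that only ever writes to a directory the buyer chooses and never opens a network connection. The sketch below is a simplified illustration: the file naming, the record fields and the digest sidecar are assumptions, not a description of any shipped archive format.

```python
import hashlib
import json
from pathlib import Path


def export_archive(records: list[dict], export_dir: Path, period: str) -> Path:
    """Write telemetry records for one period to a local, buyer-chosen directory.

    A SHA-256 sidecar file lets whoever receives the archive later confirm
    it was not altered after export. No network call is made anywhere.
    """
    payload = json.dumps(
        {"period": period, "records": records}, sort_keys=True, indent=2
    ).encode("utf-8")
    archive = export_dir / f"telemetry-{period}.json"
    archive.write_bytes(payload)
    sidecar = export_dir / f"telemetry-{period}.json.sha256"
    sidecar.write_text(hashlib.sha256(payload).hexdigest() + "\n")
    return archive
```

Whether and when the resulting files cross the perimeter stays a decision for the buyer's network team, which is the point of the boundary.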

The signed evidence package — what ships in the box

The third question on the RFP — which auditable artefacts arrive — is where the procurement conversation shifts from architecture to paperwork. The answer that survives diligence is a named package, not a promise. The shape that crossed our desk in the first half of 2026 is seven artefacts, each with a defined format and a defined owner.

Artefact	What it carries	Who reads it
Signed model bundle	Weights, tokeniser, prompts, policy files, evaluation harness — cryptographically signed by the vendor with an offline-pinned key.	The buyer's security team verifies the signature; the appliance verifies it again on every load.
Model card	Lineage, training data class, known limitations, jurisdictional posture, pre-release regulatory disclosures.	The buyer's compliance officer and the auditor in the annual review.
Regression report	Performance on the buyer's reference corpus against the previous bundle, on the same metrics, with the deltas and the failure cases.	The buyer's operations lead before promotion, the auditor after.
Structured release note	Machine-readable changelog with the prompts, policies and thresholds that moved, plus the rationale for each.	The buyer's risk team during change advisory; the auditor on the next file.
Per-tenant trace export	For every decision in the period, the inputs, the trace steps, the prompts version, the model version, the operator (if any), and the outcome.	The buyer's auditor in any matter that revisits a past decision.
Cryptographic provenance trail	Hash chain linking every promoted bundle, every prompt version and every policy change to the operator who signed it.	The buyer's forensic team if a decision is challenged after the fact.
Installer rehearsal record	Evidence that the installer was rehearsed on a reference rack matching the buyer's environment before the bundle was cut.	The buyer's infra team during change advisory; the auditor if an upgrade fails.

The package is what makes the air-gapped deployment defensible months later. The buyer who signs without the package has bought an isolated appliance and an act of faith. The buyer who signs with the package has bought an isolated appliance and the evidence trail that survives an auditor's challenge — the same trail covered, on the cloud side, in the AI governance and audit-evidence piece; the air-gapped delta is that the trail lives entirely on the buyer's side of the perimeter.
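The provenance trail in the table is described as a hash chain. A minimal sketch of that idea, assuming JSON-serialisable payloads and SHA-256, is shown below. The payload fields are hypothetical, and the operator signatures the table mentions would sit on top of this structure and are not shown.

```python
import hashlib
import json
from typing import Optional

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(prev_hash: str, payload: dict) -> str:
    """Hash the previous entry's hash together with a canonical form of the payload."""
    body = json.dumps(
        {"prev": prev_hash, "payload": payload},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_entry(chain: list[dict], payload: dict) -> dict:
    """Append a new entry linked to the current tail of the chain."""
    prev = chain[-1]["hash"] if chain else GENESIS
    entry = {"prev": prev, "payload": payload, "hash": entry_hash(prev, payload)}
    chain.append(entry)
    return entry


def verify_chain(chain: list[dict]) -> Optional[int]:
    """Return the index of the first broken entry, or None if the chain is intact."""
    prev = GENESIS
    for index, entry in enumerate(chain):
        if entry["prev"] != prev or entry["hash"] != entry_hash(entry["prev"], entry["payload"]):
            return index
        prev = entry["hash"]
    return None
```

Because each entry's hash covers the previous entry's hash, editing any past record changes every hash after it, so a forensic team can detect tampering by re-running `verify_chain` on their own side of the perimeter.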

The four anti-patterns that disqualify before the technical review

The procurement conversations we sit in are mostly civil, but they end quickly when one of four anti-patterns shows up in the vendor's answer. None of them is stated outright; each is inferred from a single answer that lands wrong.

"Air-gapped is on the roadmap." A SaaS-native vendor under pressure to answer the deployment question sometimes promises an air-gapped variant in two quarters. The buyer reads two quarters as "this vendor does not have an answer today and will not have one before the contract starts." For deals above roughly US$ 500k ARR in regulated sectors, the buyer's procurement team has explicit instructions to not accept roadmap answers on the deployment posture. The vendor that wants a chance to compete ships the air-gapped variant first and writes about the SaaS variant second.

Telemetry that "phones home for diagnostics". Any sentence in the vendor's architecture deck that describes background telemetry to the vendor's infrastructure disqualifies the vendor in roughly half of the regulated RFPs we see. The fix is the boundary described above: the telemetry is structured, the export is the buyer's decision, the cadence is the buyer's cadence, the destination is the buyer's archive. The vendor's analytics dashboard is fed from what the buyer exports, not from what the appliance beacons. The diagnostics use case can be served entirely by a buyer-driven export; the buyer-driven export is the artefact that keeps the contract intact.

"We bring our own model" — without naming which one. The model that runs in the air-gapped deployment is a contractual object. The buyer wants to know which family, which size, which licence, which jurisdiction of training, which path to upgrade, and which fallback when the model is deprecated. A vendor who answers "we use the best model for the job" is a vendor who has not built the bundle described above. Multi-vendor model portability — covered in the provider-risk piece — is the buyer's hedge on the SaaS side; on the air-gapped side the equivalent is a named model, a named upgrade path, and a named alternative for the day the chosen family is deprecated.

Audit trail as a folder of logs. A vendor who answers the auditable-artefacts question with "we keep detailed logs" is a vendor who has not built the structured trail the buyer's auditor will actually read. Logs are not evidence. The evidence is the structured trace per decision, the signed bundle on the registry, the regression report on the reference corpus, and the cryptographic provenance trail tying it all together. The non-deterministic shape of the underlying model makes the unstructured log even less useful than it was on deterministic systems — covered in the non-deterministic audit-trail piece; the air-gapped delta is that the structure has to ship inside the deployment, not be reconstructible from a cloud-side record.

The economics — and why the premium holds

The 30–60% premium on the air-gapped deployment is the number a regulated buyer can defend to their CFO and the number a vendor can defend to their board. The premium exists because three concrete cost lines exist on top of the SaaS price.

The vendor carries the cost of the second release train, with the additional gates between the SaaS cut and the signed bundle. The vendor carries the cost of an installer rehearsal capability — a reference rack, a rehearsal cadence and the engineering hours to keep the installer current. The vendor carries the cost of an evidence-package engineering line — structured release notes, regression reports against the buyer's reference corpus, signed manifests, and the registry on the appliance side. None of these are dominant in isolation; together they are sizeable, and the buyer who tries to negotiate the sovereignty premium to zero is the buyer who quietly signs up for a vendor cutting corners on one of the three.

The buyer carries cost on their side too — the operations team that runs the appliance, the change advisory that approves bundles, the compute footprint that runs inference at a steady-state cost the cloud bill would have smoothed. The CFO conversation that holds is the landed-cost conversation across the contract life, including the sovereignty premium, the buyer's operational cost, the audit-cost reduction relative to running the same workload through a cross-border SaaS boundary, and the optionality value of not depending on a provider whose roadmap moves under antitrust or geopolitical pressure. In the regulated segments — banking, defence, health, fiscal, energy — the landed-cost case for air-gapped is closing roughly 30–60% above the SaaS price with a defensible margin on the buyer's side. In the rest of the market it is not.
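The landed-cost comparison can be written down as simple arithmetic. The sketch below is a deliberately simplified model: the cost lines follow the paragraph above, but the function names and every figure in the usage example are hypothetical, and the optionality value is left out because it does not reduce to a single number.

```python
def air_gapped_landed_cost(
    saas_annual_price: float,
    premium: float,               # e.g. 0.30 to 0.60 for the sovereignty premium
    buyer_ops_annual: float,      # operations team and change advisory
    buyer_compute_annual: float,  # on-prem inference footprint
    years: int,
) -> float:
    """Total cost over the contract life for the air-gapped deployment."""
    if premium < 0:
        raise ValueError("premium must be non-negative")
    vendor_annual = saas_annual_price * (1.0 + premium)
    return years * (vendor_annual + buyer_ops_annual + buyer_compute_annual)


def saas_landed_cost(
    saas_annual_price: float,
    cross_border_audit_annual: float,  # audit cost carried on the SaaS path
    years: int,
) -> float:
    """Total cost over the contract life for the cross-border SaaS path."""
    return years * (saas_annual_price + cross_border_audit_annual)


if __name__ == "__main__":
    # Hypothetical figures, chosen only to show the shape of the comparison.
    air_gapped = air_gapped_landed_cost(1_000_000, 0.5, 200_000, 150_000, years=3)
    saas = saas_landed_cost(1_000_000, 400_000, years=3)
    print(f"air-gapped: {air_gapped:,.0f}  saas: {saas:,.0f}")
```

With these made-up inputs the air-gapped path still costs more on paper; the case closes only when the audit-cost reduction and the unquantified optionality value are large enough, which is the CFO conversation described above.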

What this means for document AI specifically

Document AI is one of the workloads where the air-gapped shape has the cleanest business case. The volumes are high enough to amortise the on-prem compute, the workload is steady-state enough to make the utilisation curve favourable, and the documents are sensitive enough that the residency argument carries on its own. Three concrete consequences are already visible in our 2026 buyer conversations.

The RFP first-filter is the air-gapped answer. The same RFP that asked about extraction accuracy and per-page price in 2024 asks about on-prem deployment, the upgrade procedure and the evidence package in 2026. The accuracy and price questions are still in the document; they are no longer the first filter. We answer the first-filter questions in writing, we ship the bundle spec and the regression report against the buyer's reference corpus, and we run the upgrade rehearsal on a non-production slice before the contract is signed. The buyer's procurement team is reading for two things — does the vendor have an answer today, and does the evidence survive the security review.

The bundle is the product. A document AI platform that ships only as a hosted endpoint is a platform that will lose the regulated segment in the next 18 months. The bundle — model, prompts, policies, evaluation harness, installer, registry — is the unit the buyer pays for. The bundle is also the unit the vendor's product team optimises: a feature that ships on SaaS but cannot be packaged into the bundle is a feature the regulated segment will not see, and over time that is the feature that does not survive in the codebase either.

Evidence outlives the deployment. The evidence package is the artefact that survives a vendor change, a model deprecation and a regulatory shift. The buyer who is asked, four years from now, why a document was classified the way it was, why a payment was authorised or why a claim was paid will not be served by "the model decided"; they will be served by the per-tenant trace export, the structured release note, the bundle signature and the provenance trail. Cogneris ships those artefacts as part of the deployment, not as an export-on-request, because the deployment is the contract and the contract is what gets audited.

A 90-day plan — buyer side and vendor side

The work to go from "we should consider air-gapped" to "we have a defensible posture" fits inside a quarter on either side of the conversation. The deliverables are documents and a rehearsal, not a year-long programme.

For the regulated buyer

Days 1–30: the deployment-posture review. Take every active AI workload above a defined sensitivity threshold and mark each one against three columns: SaaS-only, on-prem capable but running on SaaS, and on-prem deployed. For the first two columns, record the contractual residency posture, the sub-processor map and the next regulatory event the team is tracking in the relevant jurisdiction. The output is a one-page concentration view the audit committee can read in a single meeting.

Days 31–60: the RFP refresh. Rewrite the deployment-posture section of the standard procurement template to put the three first-filter questions at the top, with named artefact requirements — bundle signature scheme, regression-report format, evidence-package contents, telemetry boundary, upgrade procedure. Apply the refreshed template to one active procurement to find out which vendors clear the bar and which do not. The result is the calibrated version of the template the team will use on every subsequent regulated RFP.

Days 61–90: the rehearsal. Pick one workload where the air-gapped case is strongest and run a full bundle-and-upgrade rehearsal on a non-production slice — install the bundle, verify the signature, run the regression report against the reference corpus, promote the bundle, roll it back, export the trace archive. The rehearsal is the artefact the team takes to the next steering meeting. It is also the artefact that proves the contracted procedure works before the contract depends on it.

For the document AI vendor

Days 1–30: the bundle definition. Name the bundle. Decide what is in it, what is out of it, which key signs it and how the buyer verifies the signature. Write the spec in language a buyer's security team will accept without the engineering team in the room. The output is a one-page bundle definition that is defensible in a diligence conversation.

Days 31–60: the second release train. Pull the trunk that ships SaaS and build the additional gates that produce the signed bundle — the regression harness against a reference corpus, the model card, the structured release note, the installer rehearsal on a reference rack. The gates do not have to be fully automated; they have to exist and be repeatable enough that a release engineer can run them in a quarter. The output is the first signed bundle, with all the artefacts, ready to ship.

Days 61–90: the deployment rehearsal at a buyer. Pick a friendly regulated buyer and run the deployment end-to-end on a non-production environment in the buyer's perimeter. Install the bundle, verify the signature on the buyer's hardware, run the regression report against the buyer's reference corpus, exercise the upgrade and the rollback, export the telemetry archive on the buyer's cadence. The buyer's operations team writes a one-page report that is the proof the vendor can put on the next RFP cover page. The rehearsal is the evidence that the bundle spec, the second train and the evidence package all hold together in the only place that matters: the buyer's datacentre under audit conditions.

Closing thought

The buyer's question moved one page. The vendor's engineering moved several quarters. The 2024 conversation was about whether the model was good enough; the 2026 conversation is about whether the deployment is shaped to survive a regulatory event, a geopolitical event and a vendor event, all of which are now baseline operating conditions for any document AI workload that crosses a regulated buyer's desk. Air-gapped is not the only deployment shape that ships; it is the deployment shape that the regulated buyer reads first.

At Cogneris we build document AI as a bundle — model, prompts, policies, evaluation harness, installer and registry — with a signed evidence package as part of every release, because the deployment shape we ship in the regulated segment is the one the auditor will read six months later. If you are sizing the air-gapped side of your document AI programme, see our product page, the trust pillar, or talk to our team. The model is what the press writes about; the bundle is what the regulated buyer signs for.

Air-gapped is the new RFP filter.