When private cloud document AI matters
Most document AI pilots can run in a standard cloud tenant. Regulated production workflows often need more: dedicated infrastructure, VPC peering, customer-managed keys, region pinning, strict retention limits (including zero data retention for source documents), and audit evidence that shows where every inference ran.
Private VPC
Deploy the document processing API in a dedicated network boundary with tenant-scoped ingress, egress, logs, and secrets.
Retention controls
Set document data retention by workflow, including zero data retention beyond request processing when the use case requires it.
Audit evidence
Export evidence of model version, processing region, reviewer activity, validation results, and webhook delivery for compliance review.
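As a rough illustration of per-workflow retention, the sketch below models a policy table in Python. Everything in it is an assumption made for the example, not a documented configuration format: the `RetentionPolicy` type, the workflow names, and the durations are all hypothetical. It only shows the idea that document retention is set per workflow, that zero data retention is one of the possible settings, and that audit retention is tracked separately.

```python
from dataclasses import dataclass
from datetime import timedelta
from typing import Optional


@dataclass(frozen=True)
class RetentionPolicy:
    # None means zero data retention: content is dropped once the request completes.
    document_ttl: Optional[timedelta]
    # Audit metadata follows its own retention period.
    audit_ttl: timedelta


# Hypothetical workflow names and durations, for illustration only.
POLICIES = {
    "kyc-onboarding": RetentionPolicy(document_ttl=None, audit_ttl=timedelta(days=7 * 365)),
    "invoice-capture": RetentionPolicy(document_ttl=timedelta(days=30), audit_ttl=timedelta(days=365)),
}

# Unknown workflows fall back to the strictest setting (assumed design choice).
DEFAULT_POLICY = RetentionPolicy(document_ttl=None, audit_ttl=timedelta(days=365))


def policy_for(workflow: str) -> RetentionPolicy:
    """Return the retention policy for a workflow, failing closed to zero retention."""
    return POLICIES.get(workflow, DEFAULT_POLICY)


def should_persist_document(workflow: str) -> bool:
    """True only when the workflow's policy allows storing document content."""
    return policy_for(workflow).document_ttl is not None
```

Defaulting unknown workflows to zero retention is one possible choice; a real deployment might reject unconfigured workflows instead.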
Deployment options
| Option | Best fit | Controls |
|---|---|---|
| Shared cloud tenant | Standard API and portal workflows | Tenant isolation, encryption, regional residency, configurable retention |
| Dedicated tenant | Enterprise procurement and higher-volume workflows | Dedicated keys, rate limits, stricter access controls, custom DPA terms |
| Private VPC deployment | Banks, insurers, healthcare, government-adjacent, and regulated SaaS platforms | Private networking, BYOK/CMEK, pinned sub-processors, zero data retention mode |
| Controlled single-tenant cloud | Customers that need operational isolation without full on-prem ownership | Single-tenant compute, regional processing, customer-specific observability boundaries |
Zero data retention and audit retention are different
Zero data retention means source documents and extracted content are discarded once request processing completes. Audit metadata is a separate record: request ID, timestamps, model version, validation status, reviewer activity, and export decisions can be retained under your contract, so compliance teams can still reconstruct what happened.
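One way to picture the split is a function that separates a processed result into an audit record and an optional content record. This is a minimal sketch under stated assumptions: the field names, the shape of the `result` dictionary, and the `split_for_storage` helper are hypothetical and do not describe any particular vendor's API.

```python
from typing import Optional, Tuple

# Hypothetical audit fields, based on the metadata listed above.
AUDIT_FIELDS = (
    "request_id",
    "received_at",
    "completed_at",
    "model_version",
    "region",
    "validation_status",
)


def split_for_storage(result: dict, zero_retention: bool) -> Tuple[dict, Optional[dict]]:
    """Split a processed result into (audit_record, content_record).

    The audit record never contains document bodies or extracted values.
    Under zero data retention the content record is None, so nothing
    derived from the document is handed to storage.
    """
    audit = {field: result[field] for field in AUDIT_FIELDS}
    if zero_retention:
        return audit, None
    content = {
        "request_id": result["request_id"],
        "source_document": result["source_document"],
        "extracted": result["extracted"],
    }
    return audit, content


# Example input with made-up values.
example = {
    "request_id": "req-001",
    "received_at": "2024-01-01T10:00:00Z",
    "completed_at": "2024-01-01T10:00:02Z",
    "model_version": "example-model-1",
    "region": "eu-west",
    "validation_status": "passed",
    "source_document": b"%PDF-...",
    "extracted": {"total": "120.00"},
}

audit_record, content_record = split_for_storage(example, zero_retention=True)
```

With `zero_retention=True`, `content_record` is `None` while `audit_record` still carries the request ID, timestamps, model version, region, and validation status.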
Private cloud evaluation checklist
Ask every document AI vendor where inference runs, whether embeddings leave the selected region, whether document bodies appear in logs, whether LLM sub-processors use zero-retention APIs, how BYOK revocation works, and whether the audit trail records deployment region and model version per document.
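For teams that track vendor responses in a script or spreadsheet export, the checklist above could be captured as structured data. The sketch below is illustrative only: the keys and the `open_questions` helper are hypothetical names, and the questions simply restate the checklist.

```python
from typing import Dict, List

# Checklist questions restated from the paragraph above; the keys are made-up identifiers.
CHECKLIST: Dict[str, str] = {
    "inference_location": "Where does inference run?",
    "embeddings_in_region": "Do embeddings leave the selected region?",
    "document_bodies_in_logs": "Do document bodies appear in logs?",
    "subprocessor_zero_retention": "Do LLM sub-processors use zero-retention APIs?",
    "byok_revocation": "How does BYOK revocation work?",
    "audit_region_and_model": "Does the audit trail record deployment region and model version per document?",
}


def open_questions(answers: Dict[str, str]) -> List[str]:
    """Return the checklist questions a vendor has not yet answered."""
    return [
        question
        for key, question in CHECKLIST.items()
        if not answers.get(key, "").strip()
    ]


# Example: a vendor that has answered two of the six questions.
vendor_answers = {
    "inference_location": "Dedicated VPC in the customer-selected region",
    "document_bodies_in_logs": "No, logs carry request IDs only",
}
remaining = open_questions(vendor_answers)
```

Here `remaining` holds the four questions still waiting on an answer, in checklist order.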