Documentation · Document AI

Classification. Know what you got, before you parse it.

One endpoint covering 40+ shipped document classes plus your tenant-defined types, with confidence scores, alternate labels and per-page detection for mixed-content uploads.

Overview

Detect the document type, then decide what to do with it.

  • Single-call detection — returns a primary label plus up to 4 alternates with calibrated scores.
  • Per-page mode — a single PDF that's an invoice followed by a packing slip returns two segments.
  • Tenant catalogue — register your own classes; the model learns from 20+ exemplars per class.
  • Routing-friendly — pair with the extraction template that matches the label and skip the manual triage layer.

Endpoint

POST /v1/classify Sync · ≤100 pages

Request

POST /v1/classify
# curl — single document, per-page detection
curl -X POST https://api.cogneris.ai/v1/classify \
  -H "Authorization: Bearer $COGNERIS_KEY" \
  -F "file=@./inbox-batch.pdf" \
  -F "mode=per_page" \
  -F "max_alternates=3"

Parameters

  • file or source.url — the document to classify.
  • mode — document (default, one verdict) or per_page (segmented).
  • max_alternates — how many runner-up labels to surface (0–4, default 2).
  • catalog — slug of a tenant catalogue; defaults to the shipped global catalogue.
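
The parameter rules above can be checked on the client before anything is uploaded. The following is a minimal sketch under stated assumptions: the helper name build_classify_fields is hypothetical, and it only assembles the form fields documented in this section. The file itself would be attached separately by whichever HTTP client you use.

```python
from typing import Optional

VALID_MODES = {"document", "per_page"}


def build_classify_fields(
    mode: str = "document",
    max_alternates: int = 2,
    catalog: Optional[str] = None,
) -> dict:
    """Validate the documented options and return them as form fields."""
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {sorted(VALID_MODES)}, got {mode!r}")
    if not 0 <= max_alternates <= 4:
        raise ValueError("max_alternates must be between 0 and 4")
    fields = {"mode": mode, "max_alternates": str(max_alternates)}
    # Leaving catalog out keeps the server on the shipped global catalogue.
    if catalog is not None:
        fields["catalog"] = catalog
    return fields
```

With the third-party requests library, a dictionary like this would typically be passed as data= next to files={"file": ...}, which mirrors the -F flags in the curl example.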

Response

200 OK
{
  "data": {
    "label":       "invoice",
    "confidence":  0.97,
    "alternates": [
      { "label": "purchase_order", "confidence": 0.02 },
      { "label": "credit_note",    "confidence": 0.01 }
    ],
    "segments": [
      { "pages": "1-3", "label": "invoice",       "confidence": 0.97 },
      { "pages": "4-4", "label": "packing_slip",  "confidence": 0.91 }
    ]
  },
  "meta": {
    "job_id":   "cls_01J9MT2P…",
    "model":    "flx-classify-2026-03",
    "pages":    4,
    "audit_url": "https://app.cogneris.ai/audit/cls_01J9MT2P"
  },
  "has_errors": false
}
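
One possible way to act on a response like this is to route automatically when the primary label is confident and queue everything else for review. This is a sketch, not a prescribed workflow: the function name triage, the action names and the 0.90 threshold are assumptions, while the field names come from the sample response.

```python
from typing import Any, Dict

REVIEW_THRESHOLD = 0.90  # assumption: tune this against your own traffic


def triage(response: Dict[str, Any], threshold: float = REVIEW_THRESHOLD) -> Dict[str, Any]:
    """Decide whether to auto-route a classification or queue it for review."""
    if response.get("has_errors"):
        return {"action": "review", "reason": "has_errors", "candidates": []}
    data = response["data"]
    if data["confidence"] >= threshold:
        return {"action": "route", "label": data["label"]}
    # Low confidence: show the primary label and the alternates to a human.
    candidates = [data["label"]] + [alt["label"] for alt in data.get("alternates", [])]
    return {"action": "review", "reason": "low_confidence", "candidates": candidates}
```

Applied to the sample above, the invoice verdict at 0.97 clears the assumed threshold and would be routed without review.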

Supported classes

40+ shipped types across finance, legal, identity, insurance, logistics and tax.

Finance
invoice · credit_note · purchase_order · receipt · bank_statement · payslip
Legal
nda · msa · sow · employment_contract · lease · power_of_attorney
Identity (KYC)
passport · drivers_license · national_id · utility_bill · proof_of_address
Insurance
claim_form · policy_schedule · medical_report · police_report · damage_photo
Logistics
bill_of_lading · packing_slip · customs_declaration · delivery_note
Tax
w2 · w9 · 1099 · vat_return · sales_tax_filing

Full catalogue lives at GET /v1/classify/catalog.

Custom classes

Register a tenant catalogue when none of the shipped labels fit.

  • Bootstrap — 20+ representative samples per class is the floor; 200+ is the sweet spot.
  • Iterate — push corrections via POST /v1/classify/feedback; the next training cycle picks them up.
  • Shadow mode — run a new catalogue alongside production for 1 business day before swapping the alias.
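
Before registering a catalogue, it can help to check sample counts against the floor and sweet spot quoted above. The sketch below is illustrative only: the function name readiness and the three status strings are assumptions, and only the 20 and 200 figures come from this page.

```python
from typing import Dict

FLOOR = 20        # documented minimum samples per class
SWEET_SPOT = 200  # documented recommended samples per class


def readiness(sample_counts: Dict[str, int]) -> Dict[str, str]:
    """Label each class as 'below_floor', 'usable' or 'sweet_spot'."""
    report = {}
    for name, count in sample_counts.items():
        if count < FLOOR:
            report[name] = "below_floor"
        elif count < SWEET_SPOT:
            report[name] = "usable"
        else:
            report[name] = "sweet_spot"
    return report
```

Any class reported as below_floor would need more exemplars before bootstrapping is worthwhile.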

Multi-doc splitting

When mode=per_page returns multiple segments, hand each segment to extraction with the matching template. Each segments[].pages value is an inclusive page range, as in the sample response where "4-4" covers page 4 only, and can be sent verbatim as the page_range option to extraction.
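
The hand-off described above can be sketched as a small planning step that turns segments into extraction jobs. The TEMPLATES mapping, the template names and the job dictionary shape are all hypothetical; only the segments[].pages format and the page_range option come from this page.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical mapping from class label to extraction template name.
TEMPLATES = {"invoice": "tpl_invoice", "packing_slip": "tpl_packing_slip"}


def parse_pages(pages: str) -> Tuple[int, int]:
    """Split a 'first-last' range such as '1-3' into two integers."""
    first, last = pages.split("-")
    return int(first), int(last)


def plan_extraction(segments: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Build one extraction job per segment, passing page_range through verbatim."""
    jobs = []
    for seg in segments:
        first, last = parse_pages(seg["pages"])
        jobs.append({
            "template": TEMPLATES.get(seg["label"]),  # None: no template registered
            "page_range": seg["pages"],
            "page_count": last - first + 1,  # ranges are inclusive
        })
    return jobs
```

A job whose template is None signals a label with no registered template, which is a natural place to fall back to manual handling.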

Errors

Code  Meaning
400   Unknown catalog slug or invalid mode.
409   Catalogue still training; retry after the retry_after hint.
422   Document unreadable or zero confidence across every label.
429   Rate-limited; honour Retry-After.
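
A client might turn these codes into a retry decision along the following lines. This is a sketch with loudly stated assumptions: the function name next_step and the 5-second fallback are invented here, and the location of the retry_after hint (a number of seconds in the JSON body) is a guess, since this page does not specify it.

```python
from typing import Any, Dict, Mapping, Optional

DEFAULT_WAIT = 5.0  # assumption: fallback delay in seconds when no hint is given


def next_step(
    status: int,
    headers: Mapping[str, str],
    body: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Map a /v1/classify status code to 'ok', 'retry' (with a delay) or 'fail'."""
    body = body or {}
    if status == 200:
        return {"action": "ok"}
    if status == 429:
        # Retry-After may also be an HTTP date; this sketch handles seconds only.
        raw = headers.get("Retry-After", "")
        wait = float(raw) if raw.isdigit() else DEFAULT_WAIT
        return {"action": "retry", "wait": wait}
    if status == 409:
        # Assumption: the retry_after hint is a number of seconds in the JSON body.
        return {"action": "retry", "wait": float(body.get("retry_after", DEFAULT_WAIT))}
    # 400 and 422 will not succeed on retry without changing the request or file.
    return {"action": "fail", "status": status}
```

The header lookup is case-sensitive when headers is a plain dict, so pass your HTTP client's case-insensitive header mapping where one is available.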

Next: Ask the PDF

Conversational Q&A with page-level citations.
