Why invoices are hard
Every vendor designs their invoice differently. The header might call the customer "Bill To", "Sold To", or "Customer"; the line-item table can have 4 columns or 12; tax can be a single line, a per-row column, or a separate breakdown. Add EU VAT, US sales tax, GST, and reverse-charge invoices and the format space explodes. OCR-only parsers hit 80–85% line-item accuracy and miss the table boundaries on multi-page documents — which means AP teams spend their time fixing extraction errors, not deciding what to pay.
How Cogneris does it
Cogneris parses invoices semantically, not by template. The classifier recognizes invoice variants (standard, credit memo, recurring, prepayment, statement, EU VAT-style, freight bill); the extractor walks the document to find header, line items, taxes, totals, and payment terms; the validator runs arithmetic checks (line-item sum + tax = total) and surfaces discrepancies before the data hits your ERP. Multi-page invoices are stitched automatically; bundled vendor statements are split into individual invoices.
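The arithmetic check named above (line-item sum + tax = total) can be sketched in a few lines. This is a minimal illustration, not Cogneris's validator: the function name `check_totals`, the one-cent tolerance, and the sample amounts are all assumptions made for the example.

```python
from decimal import Decimal

def check_totals(line_amounts, tax, total, tolerance=Decimal("0.01")):
    """Check that sum(line items) + tax equals total, within a tolerance.

    Returns (ok, difference), where difference = total - (lines + tax).
    Amounts are passed as strings so Decimal keeps them exact.
    """
    expected = sum((Decimal(a) for a in line_amounts), Decimal("0")) + Decimal(tax)
    difference = Decimal(total) - expected
    return abs(difference) <= tolerance, difference

# Illustrative values only (not taken from a real invoice).
print(check_totals(["100.00", "50.00"], "12.00", "162.00"))  # (True, Decimal('0.00'))
print(check_totals(["100.00", "50.00"], "12.00", "165.00"))  # (False, Decimal('3.00'))
```

Returning the signed difference rather than a bare boolean lets a reviewer see how far off the invoice is, which is what "surfacing a discrepancy" requires.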
What you get out of the box
Every invoice format
PDF, scan, photo, email attachment, EDI 810. US, EU VAT, UK, GST, multi-language. Standard, credit memo, recurring, prepayment, statement.
Line-item confidence
Each line is returned with a confidence score, source page, and bounding box. Low-confidence rows route to AP review; the rest auto-post.
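The routing rule described here could look roughly like the sketch below. The 0.90 threshold, the `confidence` key on each line item, and the function name `route_lines` are assumptions for illustration; the actual field names and cutoffs may differ.

```python
REVIEW_THRESHOLD = 0.90  # hypothetical cutoff; tune per vendor mix

def route_lines(line_items, threshold=REVIEW_THRESHOLD):
    """Split line items into (auto_post, needs_review) by confidence."""
    auto_post, needs_review = [], []
    for item in line_items:
        # A missing confidence is treated as 0.0, so the row is reviewed
        # rather than posted silently.
        confidence = item.get("confidence", 0.0)
        target = auto_post if confidence >= threshold else needs_review
        target.append(item)
    return auto_post, needs_review

# Illustrative rows; only the "confidence" key matters to the router.
rows = [
    {"description": "Freight", "confidence": 0.97},
    {"description": "Fuel surcharge", "confidence": 0.62},
    {"description": "Handling"},
]
auto, review = route_lines(rows)
print([r["description"] for r in auto])    # ['Freight']
print([r["description"] for r in review])  # ['Fuel surcharge', 'Handling']
```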
Three-way match
Pass PO and GRN with the invoice; Cogneris returns the match decision and any line-level discrepancies with configurable tolerance bands.
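To make the tolerance bands concrete, here is a minimal sketch of a line-level three-way match. It assumes the PO, GRN, and invoice lines are already paired up; the key names (`quantity`, `unit_price`), the 2% price band, and the decision labels other than `partial_match` are hypothetical.

```python
from decimal import Decimal

def match_line(po, grn, inv, qty_tol=Decimal("0"), price_tol_pct=Decimal("2")):
    """Compare one invoice line with its PO and goods-receipt (GRN) lines."""
    issues = []
    # Invoiced quantity should equal received quantity, within qty_tol.
    if abs(Decimal(inv["quantity"]) - Decimal(grn["quantity"])) > qty_tol:
        issues.append("quantity_mismatch")
    # Invoiced unit price should be within price_tol_pct percent of the PO price.
    po_price = Decimal(po["unit_price"])
    allowed = po_price * price_tol_pct / Decimal("100")
    if abs(Decimal(inv["unit_price"]) - po_price) > allowed:
        issues.append("price_mismatch")
    return issues

def three_way_match(triples):
    """Return (decision, per-line issues) for a list of (po, grn, inv) triples."""
    results = [match_line(po, grn, inv) for po, grn, inv in triples]
    failed = sum(1 for issues in results if issues)
    if failed == 0:
        decision = "match"
    elif failed < len(results):
        decision = "partial_match"
    else:
        decision = "no_match"
    return decision, results

po = {"quantity": "12", "unit_price": "100.00"}
grn = {"quantity": "12"}
within_band = {"quantity": "12", "unit_price": "101.00"}   # 1% over PO price
outside_band = {"quantity": "12", "unit_price": "110.00"}  # 10% over PO price
print(three_way_match([(po, grn, within_band), (po, grn, outside_band)]))
# ('partial_match', [[], ['price_mismatch']])
```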
GL-code suggestion
Vendor, line description, and history are mapped to your chart of accounts. Each suggestion ships with a confidence score; AP keeps final approval.
Invoice fields and JSON shape
Buyers comparing invoice APIs such as Veryfi and Mindee usually ask which fields are returned. Cogneris returns the standard AP fields plus workflow fields: vendor identity, remittance data, invoice numbers, purchase orders, tax details, line items, duplicate status, match status, validation errors, citations, and reviewer state.
{
  "document_type": "invoice",
  "fields": {
    "vendor_name": { "value": "Acme Logistics LLC", "confidence": 0.99 },
    "invoice_number": { "value": "ACL-2026-04812", "confidence": 0.98 },
    "po_number": { "value": "PO-58213", "confidence": 0.96 },
    "total": { "value": 19985.70, "currency": "USD", "confidence": 0.99 }
  },
  "line_items": [{ "description": "Freight", "quantity": 12, "amount": 18420.00 }],
  "validation": { "status": "passed", "three_way_match": "partial_match" }
}
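A consumer of this payload might read it as in the sketch below, which uses only the Python standard library and the sample above. The helper `low_confidence_fields` and the 0.97 threshold are assumptions for illustration, not part of any Cogneris SDK.

```python
import json
from decimal import Decimal

RAW = """
{
  "document_type": "invoice",
  "fields": {
    "vendor_name": { "value": "Acme Logistics LLC", "confidence": 0.99 },
    "invoice_number": { "value": "ACL-2026-04812", "confidence": 0.98 },
    "po_number": { "value": "PO-58213", "confidence": 0.96 },
    "total": { "value": 19985.70, "currency": "USD", "confidence": 0.99 }
  },
  "line_items": [{ "description": "Freight", "quantity": 12, "amount": 18420.00 }],
  "validation": { "status": "passed", "three_way_match": "partial_match" }
}
"""

# parse_float=Decimal keeps monetary amounts exact instead of binary floats.
payload = json.loads(RAW, parse_float=Decimal)

def low_confidence_fields(doc, threshold):
    """Names of header fields whose confidence is below `threshold`."""
    return sorted(
        name
        for name, field in doc["fields"].items()
        if float(field["confidence"]) < threshold
    )

total = payload["fields"]["total"]
print(total["value"], total["currency"])     # 19985.70 USD
print(low_confidence_fields(payload, 0.97))  # ['po_number']
```

Parsing amounts as `Decimal` is a design choice for the example: it avoids rounding surprises when totals are later summed or compared.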
SDK links for developers
Start with the Node.js SDK, Python SDK, or REST API reference. For platform teams, the same invoice schema can be used in a document portal, monitored inbox, or AP workflow webhook.
Integration patterns
Cogneris integrates directly with the ERPs AP teams actually use:
- NetSuite: bi-directional sync; vendor records, GL accounts, and cost centers pulled in; approved invoices pushed as Vendor Bills with attachments.
- SAP S/4HANA: IDoc INVOIC02 mapping, with master-data lookups against Business Partner.
- Oracle Cloud ERP, Workday Financials, QuickBooks Online, Xero: pre-built connectors with field-mapping templates.
- Custom ERPs: drop normalized JSON into the REST API or use webhooks for event-driven flows.
Compliance & trust
Invoices contain vendor banking details, customer information, and tax identifiers. Cogneris masks bank-account numbers in audit metadata by default, retains documents encrypted at rest with per-tenant keys, and offers configurable retention from 0 to 7 years to meet local tax-law obligations. See our trust page for the full posture: encryption, tenant isolation, sub-processors, GDPR DPA, CCPA, SOC 2 Type II in progress, and HIPAA BAA on Enterprise.
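As an illustration of what masking a bank-account number can mean in practice, the sketch below keeps only the last four characters visible. The function name, the four-character window, and the choice to strip spaces and punctuation are assumptions; the actual masking format used in audit metadata is not specified here.

```python
def mask_account(number, visible=4, mask_char="*"):
    """Mask all but the last `visible` characters of an account number."""
    # Drop spaces and punctuation so grouped formats (e.g. IBANs) mask cleanly.
    compact = "".join(ch for ch in number if ch.isalnum())
    if len(compact) <= visible:
        # Too short to reveal anything safely: mask the whole value.
        return mask_char * len(compact)
    return mask_char * (len(compact) - visible) + compact[-visible:]

print(mask_account("1234567890"))  # ******7890
print(mask_account("123"))         # ***
```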
Get started
Pay-per-page pricing means you can start an evaluation today without an annual commit. Most teams ship their first invoice extraction into production within a week and reach steady-state accuracy on their vendor mix in under 30 days.
Related extractors
Cogneris extracts dozens of structured document types. The closest neighbors to invoice extraction:
- Receipt extraction — itemized receipts for expense reports, T&E reconciliation, and reimbursement workflows.
- Contract extraction — parties, term, governing law, liability caps, and named clauses from MSAs, NDAs, and SaaS agreements.
- Bank statement extraction — cash-flow parsing and transaction-level extraction from US and Canadian statement PDFs.
For broader context, see the IDP buyer's guide, the 2026 State of Document AI report, or estimate ROI at your volume.