40+ fields extracted per invoice
Any language — zero config
~2s average response time
From €0.02/invoice
99%+ OCR confidence
Invoice extraction API · Live

Any invoice.
Structured JSON.
In seconds.

Upload any PDF, scan or photo. Get back vendor, buyer, line items, taxes and payment details — structured, validated, ready to use. No templates. No training. From €0.02 per invoice.

By the numbers
1,865+ invoices processed
99% avg OCR confidence
PDF · JPG · PNG formats supported
EN · ES · FR · DE · IT · NL + any language

Use cases

Built for everyone
who touches invoices.

Integrate invoice parsing in minutes, not months.

One REST endpoint. Multipart form upload. Structured JSON response with confidence scores and field-level warnings. Webhooks included.

No template setup or model training — zero-shot LLM extraction
Batch processing, idempotency keys, webhook callbacks
Vision fallback for low-quality scans and photos
Amount cross-validation — subtotal + tax = total, always checked
example.py
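import requests

# Upload the invoice as a multipart form; the with-block closes the file afterwards.
with open("inv.pdf", "rb") as f:
    response = requests.post(
        "https://api.facturax.app/extract",
        headers={"X-API-Key": "fct_..."},
        files={"file": f},
        timeout=30,
    )

response.raise_for_status()  # fail on 4xx/5xx instead of parsing an error body
data = response.json()
# data["vendor_name"] → "Acme Corp"
# data["total"] → 1452.80
# data["line_items"] → [{...}, ...]
# data["confidence_score"] → 0.97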
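Once the JSON is back, a caller will usually want to decide whether a result can be used automatically or needs a human look. The sketch below shows one way to do that, assuming the response carries the confidence_score and warnings fields mentioned on this page. The needs_review helper and the 0.90 threshold are illustrative choices, not part of the API.

```python
from typing import Any

# Hypothetical threshold: tune it against your own invoices.
REVIEW_THRESHOLD = 0.90


def needs_review(data: dict[str, Any], threshold: float = REVIEW_THRESHOLD) -> bool:
    """Return True when an extraction result should be checked by a person."""
    # Treat a missing score as zero confidence rather than guessing.
    score = float(data.get("confidence_score", 0.0))
    # Assumed shape: "warnings" is a list of messages, empty when all is well.
    warnings = data.get("warnings") or []
    return score < threshold or len(warnings) > 0


result = {
    "vendor_name": "Acme Corp",
    "total": 1452.80,
    "confidence_score": 0.97,
    "warnings": [],
}
print(needs_review(result))  # False
```

A stricter threshold sends more invoices to manual review, so it is worth calibrating on a sample of real documents.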

Stop keying invoices manually. Forever.

Your team spends hours a week entering invoice data into your ERP or accounting software. Parse does it in about 2 seconds per invoice on average, with higher accuracy than manual entry.

Upload via dashboard — no technical knowledge required
Duplicate invoice detection — never pay twice
Amount validation catches OCR errors before they reach your books
Export JSON or integrate directly with your ERP via API
invoice_Q2_batch.json
// 3 invoices processed in 6.2s

{
  "results": [
    { "vendor": "Telefónica S.A.", "total": 243.60 },  // ok
    { "vendor": "AWS EMEA SARL", "total": 1820.44 },   // ok
    { "vendor": "Office Depot", "total": 87.30,
      "warning": "Possible duplicate (INV-2024-0091)" }
  ]
}
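The duplicate warning above comes from the service. For readers who want a similar guard in their own ledger code, here is a minimal client-side sketch. It assumes two invoices are duplicates when the vendor VAT number and invoice number match after normalisation. The service's actual rule is not documented on this page, and invoice_key and find_duplicates are hypothetical helper names.

```python
from typing import Any, Iterable


def invoice_key(invoice: dict[str, Any]) -> tuple[str, str]:
    """Build a comparison key from vendor_vat and invoice_number."""
    # Normalise case and spacing so differently formatted copies compare equal.
    vat = str(invoice.get("vendor_vat", "")).replace(" ", "").upper()
    number = str(invoice.get("invoice_number", "")).strip().upper()
    return vat, number


def find_duplicates(invoices: Iterable[dict[str, Any]]) -> list[dict[str, Any]]:
    """Return every invoice whose key was already seen earlier in the input."""
    seen: set[tuple[str, str]] = set()
    duplicates = []
    for invoice in invoices:
        key = invoice_key(invoice)
        if key in seen:
            duplicates.append(invoice)
        else:
            seen.add(key)
    return duplicates


# Made-up sample data: the same invoice keyed in twice with different formatting.
batch = [
    {"vendor_vat": "ESB12345678", "invoice_number": "INV-2024-0091"},
    {"vendor_vat": "es b12345678", "invoice_number": "inv-2024-0091 "},
]
print(len(find_duplicates(batch)))  # 1
```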

Upload it. Read it. Done.

Got an invoice you can't read clearly, or need to pull out the totals fast? Upload it, get the data in seconds. No spreadsheets, no squinting at PDFs.

Works with photos from your phone — no scanner needed
Invoices in any language — French, German, Italian, Dutch, English…
2 free extractions to try — no card required
Credits never expire — buy once, use whenever
factura_proveedor_FR.pdf
"vendor_name": "EDF Entreprises",
"vendor_country": "FRA",
"invoice_date": "2026-05-31",
"subtotal": 840.00,
"tax": 168.00, // TVA 20%
"total": 1008.00,
"iban": "FR76 3000 6000 0112 3456 7890 189",
"confidence_score": 0.95

How it works

One POST. Everything you need.

No config, no model training, no field mapping. Send the file, receive structured data.

01 — Upload

Any format, any quality

PDF (native or scanned), JPG, PNG, smartphone photo. Up to 20MB. Multi-page fully supported.
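A client can reject files the API would not accept before spending a request on them. The sketch below checks the extension and size against the limits stated above (PDF, JPG, PNG, up to 20MB). It is an assumption that the limit means 20 × 1024 × 1024 bytes and that .jpeg is accepted alongside .jpg. check_upload is a hypothetical helper.

```python
from pathlib import Path

ALLOWED_SUFFIXES = {".pdf", ".jpg", ".jpeg", ".png"}  # .jpeg is an assumption
MAX_BYTES = 20 * 1024 * 1024  # assumes "20MB" means 20 MiB


def check_upload(name: str, size_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    suffix = Path(name).suffix.lower()
    if suffix not in ALLOWED_SUFFIXES:
        problems.append(f"unsupported format: {suffix or 'no extension'}")
    if size_bytes > MAX_BYTES:
        problems.append(f"file too large: {size_bytes} bytes")
    return problems


print(check_upload("inv.pdf", 350_000))   # []
print(check_upload("inv.docx", 350_000))  # ['unsupported format: .docx']
```

For a file on disk, the size can be read with Path(name).stat().st_size before calling the helper.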

02 — Extract

OCR + LLM pipeline

Tesseract handles the text layer. GPT-4o understands layout, fiscal context and edge cases — even on low-quality scans.

03 — Validate

Structured + checked JSON

Every response includes a confidence_score, amount cross-check and duplicate detection.
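The amount cross-check can be followed by hand on the EDF example earlier on this page: 840.00 + 168.00 = 1008.00, and 168.00 / 840.00 = 20%. A small sketch of the same check follows, using Decimal to avoid floating-point drift. The amounts_match helper and the one-cent tolerance are assumptions for illustration, not the service's own validation logic.

```python
from decimal import Decimal

# Assumed tolerance: one cent, to absorb per-line rounding on the invoice.
TOLERANCE = Decimal("0.01")


def amounts_match(subtotal: str, tax: str, total: str) -> bool:
    """Check that subtotal + tax equals total within TOLERANCE."""
    difference = Decimal(subtotal) + Decimal(tax) - Decimal(total)
    return abs(difference) <= TOLERANCE


def tax_rate_percent(subtotal: str, tax: str) -> Decimal:
    """Tax as a percentage of the pre-tax amount."""
    return Decimal(tax) / Decimal(subtotal) * 100


print(amounts_match("840.00", "168.00", "1008.00"))              # True
print(tax_rate_percent("840.00", "168.00") == Decimal("20"))     # True
```

Passing amounts as strings keeps the exact digits printed on the invoice; converting from float first would reintroduce binary rounding.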

Extracted fields

40+ fields. Out of the box.

Every field extracted on every invoice, with no configuration required.

vendor_name
Issuer / seller
vendor_vat
Tax ID / CIF / VAT number
invoice_number
Invoice reference
invoice_date
Issue date (YYYY-MM-DD)
total
Invoice total amount
subtotal
Pre-tax base amount
tax / tax_rate
Tax amount and rate (%)
iban
Payment IBAN
due_date
Payment due date
buyer_name / buyer_vat
Recipient details
vendor_address / city / zip
Full issuer address
line_items[ ]
Description · qty · price · tax
currency
ISO 4217 (EUR, USD, GBP…)
payment_method
Transfer, direct debit, card…
confidence_score
0–1 extraction quality
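For typed code, the fields listed above can be mirrored in a small data class. This is a sketch covering only a handful of the documented field names. The Python types, the optional markers and the shape of each line item are assumptions, since the page does not give a schema.

```python
from dataclasses import dataclass, field, fields
from typing import Any, Optional


@dataclass
class Invoice:
    vendor_name: Optional[str] = None
    vendor_vat: Optional[str] = None
    invoice_number: Optional[str] = None
    invoice_date: Optional[str] = None  # YYYY-MM-DD, per the field list
    subtotal: Optional[float] = None
    tax: Optional[float] = None
    total: Optional[float] = None
    currency: Optional[str] = None      # ISO 4217 code
    confidence_score: Optional[float] = None
    line_items: list[dict[str, Any]] = field(default_factory=list)

    @classmethod
    def from_response(cls, data: dict[str, Any]) -> "Invoice":
        """Keep only the keys this class knows about and ignore the rest."""
        known = {f.name for f in fields(cls)}
        return cls(**{k: v for k, v in data.items() if k in known})


invoice = Invoice.from_response(
    {"vendor_name": "EDF Entreprises", "total": 1008.00, "payment_method": "Transfer"}
)
print(invoice.vendor_name, invoice.total)  # EDF Entreprises 1008.0
```

Ignoring unknown keys means the class keeps working if a response carries fields that are not modelled here.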

Comparison

Invoices are not generic documents.
We treat them differently.

Feature comparison: parse.facturax vs Mindee vs Google Document AI vs Azure Form Recognizer
Feature · parse.facturax · Mindee · Google Doc AI · Azure Form Rec.
LLM-powered (GPT-4o): ✗ Custom model
Zero-shot, no templates: ✗ Training needed · ✗ Training needed
Full line item extraction: ✓ All fields · Partial · Partial · Partial
Amount cross-validation: ✓ Built-in
Duplicate detection
Low-quality scan fallback: ✓ Vision fallback · Limited
Price per invoice: From €0.05 · From $0.08 · From $0.65 · From $0.50
Credits never expire
Spanish fiscal stack (Facturae + XAdES): ✓ via facturax.app

Pricing

Pay per extraction.
Credits never expire.

Start with 2 free extractions — no card required. Scale as you grow.

Starter
€9.90
15 extractions · one-time
€0.66 / invoice
40+ fields
Any language
REST API
No expiry
Get started
Pro — Most popular
€19.90
40 extractions · one-time
€0.50 / invoice
Everything in Starter
Webhooks
Batch processing
Duplicate detection
Get started
Scale
€39.90
100 extractions · one-time
€0.40 / invoice
Everything in Pro
Priority processing
Usage dashboard
No expiry
Get started
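The per-invoice figures on the packs follow from dividing the pack price by the number of extractions and rounding to the cent. The arithmetic can be reproduced as below. The per_invoice helper is illustrative only.

```python
from decimal import Decimal, ROUND_HALF_UP

# Pack prices (EUR) and sizes as listed above.
PACKS = {
    "Starter": (Decimal("9.90"), 15),
    "Pro": (Decimal("19.90"), 40),
    "Scale": (Decimal("39.90"), 100),
}


def per_invoice(price: Decimal, extractions: int) -> Decimal:
    """Price per extraction, rounded half-up to two decimals."""
    return (price / extractions).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


for name, (price, count) in PACKS.items():
    print(name, per_invoice(price, count))
# Starter 0.66
# Pro 0.50
# Scale 0.40
```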
Volume — API-only · Monthly subscription

Built for integrations
that run at scale

This plan is for teams that have already integrated the API and process invoices programmatically every month. You integrate once — and the price reflects that commitment.

API access only · not available via dashboard
First 500 / month: €0.06 / invoice
501 – 2,500 / month: €0.055 / invoice
2,501 – 10,000 / month: €0.05 / invoice
500 invoices/month · Total: €20.00/month
Subscribe for €20.00/month →
✓ API key + webhooks ✓ Batch processing ✓ Duplicate detection ✓ Amount validation ✓ Usage dashboard ✓ Cancel anytime
€20 / month · €0.04/invoice
vs. one-time packs: save ~€180/month at 500 invoices

Need more than 10,000 invoices/month?

Contact us →

2 free extractions included on every account · No credit card required

FAQ

Common questions

What file formats does the API support?

PDF (native text and scanned), JPG, PNG and smartphone photos up to 20MB. Multi-page PDFs are fully supported — each page is processed and merged into a single response.

What languages can it handle?

Any language. The API has been tested with English, Spanish, French, German, Italian, Dutch and Portuguese, but the LLM handles any script without configuration. Just send the file.

How many fields are extracted?

40+ structured fields per invoice: vendor_name, vendor_vat, vendor_address, iban, invoice_number, invoice_date, due_date, subtotal, tax, total, line_items (description, quantity, unit price, tax rate) and more. Each response also includes a confidence_score and warnings array.

Do credits expire?

No. Credits never expire. You buy a pack once and use the extractions whenever you need them — next week, next year. There's no time pressure to consume them.

How is this different from Mindee or Google Document AI?

parse.facturax.app uses GPT-4o for zero-shot extraction — no template setup or model training. It includes built-in amount cross-validation (subtotal + tax = total, always verified), duplicate invoice detection, and a vision fallback for unreadable scans. Pricing starts 4× lower than Google Document AI.

Is there a sandbox or test mode?

Yes. Add the header X-Sandbox: true to any request to run it without consuming quota. Sandbox responses are structurally identical to live responses so you can test your integration end-to-end.
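A sandbox call differs from a live one only in the extra header. The sketch below reuses the endpoint and X-API-Key header from the example earlier on this page. The build_headers and extract helpers and the 30-second timeout are assumptions, and requests is a third-party package.

```python
API_URL = "https://api.facturax.app/extract"


def build_headers(api_key: str, sandbox: bool = False) -> dict[str, str]:
    """Headers for an extraction request; sandbox mode adds X-Sandbox: true."""
    headers = {"X-API-Key": api_key}
    if sandbox:
        headers["X-Sandbox"] = "true"
    return headers


def extract(path: str, api_key: str, sandbox: bool = False) -> dict:
    """Upload one file and return the parsed JSON response."""
    import requests  # third-party; imported here so build_headers works without it

    with open(path, "rb") as fh:
        response = requests.post(
            API_URL,
            headers=build_headers(api_key, sandbox),
            files={"file": fh},
            timeout=30,  # assumed value
        )
    response.raise_for_status()
    return response.json()


print(build_headers("fct_...", sandbox=True))
# {'X-API-Key': 'fct_...', 'X-Sandbox': 'true'}
```

Keeping the flag as a function argument lets the same integration code run in tests with sandbox=True and in production without it.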

Where is my invoice data stored?

Invoice files are processed in memory and discarded immediately — they are never stored on disk. Only the extracted JSON result and metadata are stored, associated with your API key. The server runs in Hetzner Frankfurt (EU). See our privacy policy for full details.


Processing Spanish invoices? FacturaX.app adds Facturae XML 3.2.2 generation and XAdES-BES digital signing — the full stack required by Spanish Public Administration and Ley Crea y Crece 2027.

FacturaX.app →