ParseFlow

Supported Formats

ParseFlow processes native and scanned PDFs plus common image formats, and extracts structured data from five document types. This page lists each supported format, its plan requirement, and an example of the JSON output.

File Formats: 6+
Document Types: 5
OCR Languages: 50+
Max File Size: 100 MB

File Formats

Each format has different capabilities and plan requirements.


PDF (Native Text)

Full Support

.pdf

Native PDF files with embedded text. Fastest processing and highest accuracy of the supported formats. Available on all plans.

Example JSON output:
{
  "documentType": "invoice",
  "confidence": 0.95,
  "data": {
    "invoiceNumber": "INV-2026-0142",
    "vendor": { "name": "Acme Corp" },
    "total": 8118.75,
    "currency": "USD"
  }
}

PDF (Scanned / Image-based)

Pro Plan (OCR)

.pdf

Scanned documents saved as PDF. Requires OCR processing. Available on Pro and Enterprise plans.

Example JSON output:
{
  "documentType": "receipt",
  "confidence": 0.87,
  "data": {
    "merchant": "Coffee Shop",
    "items": [{ "name": "Latte", "price": 4.50 }],
    "total": 4.50,
    "ocrApplied": true
  }
}

JPEG / JPG

Pro Plan (OCR)

.jpg, .jpeg

Photos of documents, scanned invoices, receipts. OCR extracts text from the image before parsing.

Example JSON output:
{
  "documentType": "id_document",
  "confidence": 0.89,
  "data": {
    "type": "passport",
    "fullName": "John Smith",
    "documentNumber": "AB1234567",
    "expiryDate": "2030-12-31",
    "ocrApplied": true
  }
}

PNG

Pro Plan (OCR)

.png

PNG images of documents. Supports transparency. Best for screenshots and digital scans.

Example JSON output:
{
  "documentType": "bank_statement",
  "confidence": 0.85,
  "data": {
    "accountHolder": "Jane Doe",
    "bankName": "First National",
    "transactions": [
      { "date": "2026-03-01", "description": "Direct Deposit", "amount": 3500.00 }
    ],
    "ocrApplied": true
  }
}

WebP

Pro Plan (OCR)

.webp

Modern image format with smaller file sizes. Great for web-captured document images.

Example JSON output:
{
  "documentType": "receipt",
  "confidence": 0.88,
  "data": {
    "merchant": "Electronics Store",
    "total": 299.99,
    "paymentMethod": "Credit Card",
    "ocrApplied": true
  }
}

TIFF

Pro Plan (OCR)

.tiff, .tif

High-quality scanned documents. Common in enterprise and legal environments.

Example JSON output:
{
  "documentType": "contract",
  "confidence": 0.83,
  "data": {
    "parties": ["Acme Corp", "Widget Inc"],
    "effectiveDate": "2026-01-15",
    "ocrApplied": true
  }
}

CSV / Excel

Coming Soon

.csv, .xlsx, .xls

Spreadsheet files with structured data. Planned support covers automatic detection of column headers and extraction of structured rows.

Example JSON output:
{
  "documentType": "spreadsheet",
  "status": "coming_soon",
  "data": {
    "headers": ["Date", "Description", "Amount"],
    "rows": 150,
    "sheets": 1
  }
}
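
The plan requirements and size limit above can be checked on the client before uploading. The following is a minimal sketch, not part of ParseFlow itself: the function name precheck, the ocr_plan flag, and the returned status strings are all hypothetical, and it assumes "100MB" means 100 * 1024 * 1024 bytes.

```python
from pathlib import Path

# Assumption: the documented "100MB" limit is read as 100 MiB.
MAX_FILE_SIZE_BYTES = 100 * 1024 * 1024

# Extensions taken from the format list on this page.
OCR_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp", ".tiff", ".tif"}
COMING_SOON_EXTENSIONS = {".csv", ".xlsx", ".xls"}


def precheck(filename: str, size_bytes: int, ocr_plan: bool) -> str:
    """Return a hypothetical status string for a file before upload.

    ocr_plan is True when the account is on Pro or Enterprise.
    """
    ext = Path(filename).suffix.lower()
    if size_bytes > MAX_FILE_SIZE_BYTES:
        return "too_large"
    if ext in COMING_SOON_EXTENSIONS:
        return "coming_soon"
    if ext in OCR_EXTENSIONS:
        return "ok" if ocr_plan else "needs_ocr_plan"
    if ext == ".pdf":
        # Native-text PDFs work on all plans; scanned PDFs need OCR.
        # The extension alone cannot tell the two apart.
        return "ok" if ocr_plan else "ok_if_native_text"
    return "unsupported"
```

A caller without an OCR plan would treat "ok_if_native_text" as a warning rather than a hard failure, since only the service can tell whether a given PDF contains embedded text.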

Document Types

ParseFlow auto-detects the document type, or you can specify it via the document_type parameter.
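
As a rough illustration of the two modes, the sketch below builds the parameters for a request. This page does not say how document_type is transmitted (query string, form field, or JSON body) and names no endpoint, so the helper build_parse_params is hypothetical and only prepares a plain dictionary. The accepted values are the identifiers listed in this section.

```python
from typing import Optional

# Identifiers listed under Document Types on this page.
DOCUMENT_TYPES = {"invoice", "receipt", "contract", "id_document", "bank_statement"}


def build_parse_params(document_type: Optional[str] = None) -> dict:
    """Build request parameters (hypothetical helper).

    Passing None omits document_type, leaving detection to ParseFlow.
    """
    if document_type is None:
        return {}
    if document_type not in DOCUMENT_TYPES:
        raise ValueError(f"unknown document_type: {document_type!r}")
    return {"document_type": document_type}
```

Validating the value locally catches typos such as "bank-statement" before a request is sent.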

Invoice

invoice
Fields: invoiceNumber, invoiceDate, dueDate, vendor, lineItems, subtotal, taxAmount, total, currency

Receipt

receipt
Fields: merchant, date, items, subtotal, tax, total, paymentMethod, currency

Contract

contract
Fields: parties, effectiveDate, expirationDate, terms, obligations, governingLaw, signatures

ID Document

id_document
Fields: type, fullName, dateOfBirth, documentNumber, issuingCountry, expiryDate, nationality

Bank Statement

bank_statement
Fields: accountHolder, bankName, accountNumber, statementPeriod, openingBalance, closingBalance, transactions
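
The field names listed for each document type can be used to check which fields a response actually contains. The sketch below is one way to do that, assuming the response has the documentType and data keys shown in the example outputs on this page. The helper missing_fields is hypothetical, and this page does not state whether fields that could not be extracted are omitted or returned as null.

```python
# Field names per document type, copied from the lists above.
EXPECTED_FIELDS = {
    "invoice": ["invoiceNumber", "invoiceDate", "dueDate", "vendor", "lineItems",
                "subtotal", "taxAmount", "total", "currency"],
    "receipt": ["merchant", "date", "items", "subtotal", "tax", "total",
                "paymentMethod", "currency"],
    "contract": ["parties", "effectiveDate", "expirationDate", "terms",
                 "obligations", "governingLaw", "signatures"],
    "id_document": ["type", "fullName", "dateOfBirth", "documentNumber",
                    "issuingCountry", "expiryDate", "nationality"],
    "bank_statement": ["accountHolder", "bankName", "accountNumber",
                       "statementPeriod", "openingBalance", "closingBalance",
                       "transactions"],
}


def missing_fields(response: dict) -> list:
    """Return the expected field names that are absent from response["data"]."""
    expected = EXPECTED_FIELDS.get(response.get("documentType"), [])
    data = response.get("data", {})
    return [name for name in expected if name not in data]


# The invoice example from this page.
invoice = {
    "documentType": "invoice",
    "confidence": 0.95,
    "data": {
        "invoiceNumber": "INV-2026-0142",
        "vendor": {"name": "Acme Corp"},
        "total": 8118.75,
        "currency": "USD",
    },
}
print(missing_fields(invoice))
```

An unknown documentType yields an empty list here, so callers that need strict validation would check the type separately.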

Need OCR for Images?

Scanned PDFs and images (JPG, PNG, WebP, TIFF) require Optical Character Recognition (OCR). Our OCR engine supports 50+ languages and is available on Pro and Enterprise plans.

  • 50+ language support with automatic detection
  • Handles skewed, rotated, and low-quality scans
  • Multi-page document support
  • Confidence scores per field for quality assurance
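
Confidence scores lend themselves to a simple review gate. The example outputs on this page show only a document-level confidence value, and the shape of per-field scores is not shown, so this sketch uses the top-level value only. The function name needs_review and the 0.9 threshold are illustrative assumptions, not ParseFlow defaults.

```python
def needs_review(response: dict, threshold: float = 0.9) -> bool:
    """Return True when a result should be routed to a manual check.

    A response without a "confidence" key is treated as needing review.
    """
    confidence = response.get("confidence")
    if confidence is None:
        return True
    return confidence < threshold


# Confidence values taken from the example outputs on this page.
native_invoice = {"documentType": "invoice", "confidence": 0.95}
scanned_contract = {"documentType": "contract", "confidence": 0.83}

review_queue = [r["documentType"]
                for r in (native_invoice, scanned_contract)
                if needs_review(r)]
print(review_queue)
```

The right threshold depends on the document type and on how costly an extraction error is, so it is worth tuning against a sample of real documents.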

