Supported Formats
ParseFlow processes native and scanned PDFs and common image formats (JPEG, PNG, WebP, TIFF); CSV and Excel support is coming soon. This page lists each format, its plan requirements, and the document types we can detect.
File Formats
Each format has different capabilities and plan requirements.
PDF (Native Text)
Full Support
Native PDF files with embedded text. Fastest processing, highest accuracy. Available on all plans.
Example JSON output:
{
  "documentType": "invoice",
  "confidence": 0.95,
  "data": {
    "invoiceNumber": "INV-2026-0142",
    "vendor": { "name": "Acme Corp" },
    "total": 8118.75,
    "currency": "USD"
  }
}
PDF (Scanned / Image-based)
Pro Plan (OCR)
Scanned documents saved as PDF. Requires OCR processing. Available on Pro and Enterprise plans.
Example JSON output:
{
  "documentType": "receipt",
  "confidence": 0.87,
  "data": {
    "merchant": "Coffee Shop",
    "items": [{ "name": "Latte", "price": 4.50 }],
    "total": 4.50,
    "ocrApplied": true
  }
}
JPEG / JPG
Pro Plan (OCR)
Extensions: .jpg, .jpeg
Photos of documents, such as scanned invoices and receipts. OCR extracts text from the image before parsing.
Example JSON output:
{
  "documentType": "id_document",
  "confidence": 0.89,
  "data": {
    "type": "passport",
    "fullName": "John Smith",
    "documentNumber": "AB1234567",
    "expiryDate": "2030-12-31",
    "ocrApplied": true
  }
}
PNG
Pro Plan (OCR)
Extensions: .png
PNG images of documents. Supports transparency. Best for screenshots and digital scans.
Example JSON output:
{
  "documentType": "bank_statement",
  "confidence": 0.85,
  "data": {
    "accountHolder": "Jane Doe",
    "bankName": "First National",
    "transactions": [
      { "date": "2026-03-01", "description": "Direct Deposit", "amount": 3500.00 }
    ],
    "ocrApplied": true
  }
}
WebP
Pro Plan (OCR)
Extensions: .webp
Modern image format with smaller file sizes. Great for web-captured document images.
Example JSON output:
{
  "documentType": "receipt",
  "confidence": 0.88,
  "data": {
    "merchant": "Electronics Store",
    "total": 299.99,
    "paymentMethod": "Credit Card",
    "ocrApplied": true
  }
}
TIFF
Pro Plan (OCR)
Extensions: .tiff, .tif
High-quality scanned documents. Common in enterprise and legal environments.
Example JSON output:
{
  "documentType": "contract",
  "confidence": 0.83,
  "data": {
    "parties": ["Acme Corp", "Widget Inc"],
    "effectiveDate": "2026-01-15",
    "ocrApplied": true
  }
}
CSV / Excel
Coming Soon
Extensions: .csv, .xlsx, .xls
Spreadsheet files with structured data. Planned: automatic detection of column headers and extraction of structured rows.
Example JSON output:
{
  "documentType": "spreadsheet",
  "status": "coming_soon",
  "data": {
    "headers": ["Date", "Description", "Amount"],
    "rows": 150,
    "sheets": 1
  }
}
Document Types
ParseFlow auto-detects the document type, or you can specify it via the document_type parameter.
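As a rough illustration, a client could validate the document_type value before sending a request. This is a minimal sketch, not ParseFlow's own code: the helper name build_parse_options is hypothetical, the allowed values are the five identifiers listed in this section, and it assumes that omitting the parameter leaves ParseFlow to auto-detect the type.

```python
from typing import Optional

# Identifiers transcribed from the Document Types list on this page.
DOCUMENT_TYPES = frozenset(
    {"invoice", "receipt", "contract", "id_document", "bank_statement"}
)


def build_parse_options(document_type: Optional[str] = None) -> dict:
    """Build request options (hypothetical helper, not part of ParseFlow)."""
    if document_type is None:
        # Assumption: leaving the parameter out lets ParseFlow auto-detect.
        return {}
    if document_type not in DOCUMENT_TYPES:
        raise ValueError(f"unknown document_type: {document_type!r}")
    return {"document_type": document_type}
```

Rejecting unknown values locally catches typos such as "id-document" before an upload is attempted.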
- Invoice: invoice
- Receipt: receipt
- Contract: contract
- ID Document: id_document
- Bank Statement: bank_statement
Need OCR for Images?
Scanned PDFs and images (JPG, PNG, WebP, TIFF) require Optical Character Recognition (OCR). Our OCR engine supports 50+ languages and is available on Pro and Enterprise plans.
- 50+ language support with automatic detection
- Handles skewed, rotated, and low-quality scans
- Multi-page document support
- Confidence scores per field for quality assurance
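The format rules on this page can be sketched as a small pre-upload check. The function names classify_upload and needs_review are hypothetical, the 0.9 threshold is an arbitrary example value, and the review check reads the top-level confidence field shown in the example outputs, since the shape of per-field scores is not shown here.

```python
from pathlib import Path

# Extensions transcribed from the format list on this page.
OCR_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp", ".tiff", ".tif"}
COMING_SOON_EXTENSIONS = {".csv", ".xlsx", ".xls"}


def classify_upload(filename: str) -> str:
    """Classify a file by extension (hypothetical helper)."""
    ext = Path(filename).suffix.lower()
    if ext in OCR_EXTENSIONS:
        return "ocr_required"  # images need OCR: Pro or Enterprise plan
    if ext == ".pdf":
        # Native-text PDFs work on all plans; scanned PDFs need OCR.
        # The file name alone cannot tell the two apart.
        return "pdf"
    if ext in COMING_SOON_EXTENSIONS:
        return "coming_soon"
    return "unsupported"


def needs_review(result: dict, threshold: float = 0.9) -> bool:
    """Flag a parsed result whose top-level confidence is below threshold."""
    # A missing confidence value is treated as 0.0, so it is always flagged.
    return result.get("confidence", 0.0) < threshold
```

For instance, classify_upload("scan.TIFF") falls in the OCR group, and a result with confidence 0.87 would be flagged for manual review at the example threshold.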