We use cookies to improve your experience and analyze site traffic. See our Privacy Policy for details.
ParseFlow supports a wide range of file formats and document types. Here is everything you need to know about what we can process.
Each format has different capabilities and plan requirements.
Native PDF files with embedded text. Fastest processing, highest accuracy. All plans.
{
"documentType": "invoice",
"confidence": 0.95,
"data": {
"invoiceNumber": "INV-2026-0142",
"vendor": { "name": "Acme Corp" },
"total": 8118.75,
"currency": "USD"
}
}Scanned documents saved as PDF. Requires OCR processing. Available on Pro and Enterprise plans.
{
"documentType": "receipt",
"confidence": 0.87,
"data": {
"merchant": "Coffee Shop",
"items": [{ "name": "Latte", "price": 4.50 }],
"total": 4.50,
"ocrApplied": true
}
}.jpg, .jpeg
Photos of documents, scanned invoices, receipts. OCR extracts text from the image before parsing.
{
"documentType": "id_document",
"confidence": 0.89,
"data": {
"type": "passport",
"fullName": "John Smith",
"documentNumber": "AB1234567",
"expiryDate": "2030-12-31",
"ocrApplied": true
}
}.png
PNG images of documents. Supports transparency. Best for screenshots and digital scans.
{
"documentType": "bank_statement",
"confidence": 0.85,
"data": {
"accountHolder": "Jane Doe",
"bankName": "First National",
"transactions": [
{ "date": "2026-03-01", "description": "Direct Deposit", "amount": 3500.00 }
],
"ocrApplied": true
}
}.webp
Modern image format with smaller file sizes. Great for web-captured document images.
{
"documentType": "receipt",
"confidence": 0.88,
"data": {
"merchant": "Electronics Store",
"total": 299.99,
"paymentMethod": "Credit Card",
"ocrApplied": true
}
}.tiff, .tif
High-quality scanned documents. Common in enterprise and legal environments.
{
"documentType": "contract",
"confidence": 0.83,
"data": {
"parties": ["Acme Corp", "Widget Inc"],
"effectiveDate": "2026-01-15",
"ocrApplied": true
}
}.csv, .xlsx, .xls
Spreadsheet files with structured data. Auto-detect column headers and extract structured rows.
{
"documentType": "spreadsheet",
"status": "coming_soon",
"data": {
"headers": ["Date", "Description", "Amount"],
"rows": 150,
"sheets": 1
}
}ParseFlow auto-detects the document type, or you can specify it via the document_type parameter.
invoicereceiptcontractid_documentbank_statementScanned PDFs and images (JPG, PNG, WebP, TIFF) require Optical Character Recognition. Our OCR engine supports 50+ languages with high accuracy, available on Pro and Enterprise plans.
One of the most common use cases is invoices — see the dedicated invoice extraction API for the full field list and example responses.
Ready to start processing documents?