
Extract tables from PDF to Excel/CSV

Extract tables from any PDF straight into XLSX or CSV. Works on multi-page tables and merged cells. Powered by ML-based table detection.

Extract Tables pulls structured tabular data out of PDFs straight into Excel or CSV — preserving rows, columns, and merged cells. PDFs store tables as positioned text, not as semantic table structures, which is why copy-pasting a PDF table into Excel usually produces a single-column mess. Our table-detection ML identifies table boundaries, infers column alignment, and outputs clean spreadsheet rows. Best results come from PDFs with clearly-defined tables (financial reports, scientific papers, government datasets); fully-merged or visually irregular tables may need light cleanup.
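To make the "positioned text" point concrete, here is a minimal Python sketch of what column inference has to do. It is a toy heuristic, not the ML detection described above: the `Word` tuple, the `gap` threshold, and the sample coordinates are all assumptions made up for illustration.

```python
from collections import defaultdict
from typing import NamedTuple


class Word(NamedTuple):
    """A text fragment as a PDF stores it: a string plus a page position."""
    text: str
    x: float  # left edge, in points (hypothetical units)
    y: float  # baseline, in points


def infer_columns(words: list[Word], gap: float = 20.0) -> list[float]:
    """Cluster left edges into column anchors; a jump larger than `gap` starts a new column."""
    anchors: list[float] = []
    for x in sorted(w.x for w in words):
        if not anchors or x - anchors[-1] > gap:
            anchors.append(x)
    return anchors


def to_rows(words: list[Word], gap: float = 20.0) -> list[list[str]]:
    """Group words by baseline into rows, then place each word in its nearest column."""
    anchors = infer_columns(words, gap)
    lines: dict[int, list[Word]] = defaultdict(list)
    for w in words:
        lines[round(w.y)].append(w)
    rows = []
    for y in sorted(lines):  # assumes y grows down the page
        row = [""] * len(anchors)
        for w in lines[y]:
            col = min(range(len(anchors)), key=lambda i: abs(anchors[i] - w.x))
            row[col] = w.text
        rows.append(row)
    return rows


# Six loose fragments: nothing in the data says "this is a 2x3 table".
words = [
    Word("Item", 50, 100), Word("Qty", 200, 100), Word("Price", 300, 100),
    Word("Widget", 50, 120), Word("2", 202, 120), Word("9.99", 301, 120),
]
rows = to_rows(words)
```

Even this tidy example needs a tuned threshold to recover the grid, and a fixed gap breaks down quickly on right-aligned numbers, wrapped cells, or merged headers.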

How to extract tables step by step

  1. Upload the PDF containing tables

    Drop the PDF. We accept any standard PDF up to 100 MB on the free plan or 200 MB on Pro. Both digital and scanned PDFs work; OCR runs automatically on scans.

  2. We detect tables across all pages

    ML-based table detection identifies table regions (headers, body, footer rows). Multi-page tables are stitched: if your bank statement runs across 4 pages, the output is one continuous spreadsheet.

  3. Pick output format and download

    Choose XLSX (preserves merged cells, multi-sheet for multiple tables) or CSV (one file per table, simpler for scripting). Download the result, open in Excel/Google Sheets, and start analyzing.

Why extract tables on PDFOnly

ML-based detection, not regex tricks

Most online PDF-to-CSV tools use heuristics like 'split on tabs' that fail on complex tables. We use trained models that actually understand table structure.

Multi-page table stitching

When a table spans pages, we detect the continuation and produce one unified output rather than fragmenting it per page.
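One simple way to picture stitching: treat each page's fragment as a list of rows and drop the header when a continuation page repeats it. The sketch below works under that assumption only. How the service actually detects continuations is not described here, and `stitch_pages` is a hypothetical helper.

```python
def stitch_pages(pages: list[list[list[str]]]) -> list[list[str]]:
    """Concatenate per-page table fragments into one table.

    Assumes the first row of the first fragment is the header and that
    continuation pages may repeat it verbatim as their first row.
    """
    if not pages or not pages[0]:
        return []
    header = pages[0][0]
    stitched = [header]
    for page in pages:
        # Skip a leading row only when it is an exact copy of the header.
        start = 1 if page and page[0] == header else 0
        stitched.extend(page[start:])
    return stitched


page1 = [["Date", "Amount"], ["2024-01-02", "-40.00"]]
page2 = [["Date", "Amount"], ["2024-01-05", "120.00"]]  # header repeated
page3 = [["2024-01-09", "-15.50"]]                      # no header
table = stitch_pages([page1, page2, page3])
```

The result is a single header row followed by the three data rows in page order.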

Handles scanned PDFs

Most tools fail silently on scans. We auto-OCR first, then extract — getting you working data even from old paper documents.

What people use Extract Tables for

A few common scenarios. If your workflow looks like one of these, this tool is a good fit.

Convert bank statements to a spreadsheet

Months of bank-statement PDFs become an analyzable dataset. Pivot, sum, categorize — all the things you can't do while the data is locked in a PDF.

Pull financial figures from quarterly reports

Income statements, balance sheets, and cash flow tables from 10-Q filings extract cleanly into Excel for your own modeling.

Migrate legacy lab data

Decades of scientific reports stored as PDFs become workable datasets. Scanned reports are OCR'd automatically before extraction.

Aggregate competitor pricing pages

Public price lists in PDF format become a comparison spreadsheet. Useful for B2B competitive analysis.

What you get

  • Output as XLSX (Excel) or CSV — pick what your downstream tool wants
  • Multi-page tables stitched together automatically
  • Merged cells preserved correctly in XLSX output
  • Numeric formatting (currency, percentages, dates) inferred from context
  • Works on both digital and scanned PDFs (OCR runs first if needed)
  • Files auto-deleted within an hour, never used to train AI
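If a downstream script needs plain numbers rather than formatted strings (for example when reading the CSV output), a small normalizer along these lines may help. The accepted formats (a leading `$`, thousands commas, a trailing `%`, parentheses for negatives) are assumptions about typical financial tables, not a description of what the tool infers, and `parse_number` is a hypothetical helper.

```python
from decimal import Decimal, InvalidOperation
from typing import Optional


def parse_number(cell: str) -> Optional[Decimal]:
    """Convert a formatted cell such as '$1,234.50' or '12.5%' to a Decimal.

    Returns None for cells that are not numeric under the assumed formats.
    """
    text = cell.strip()
    # Accounting style: (45.00) means -45.00
    negative = text.startswith("(") and text.endswith(")")
    if negative:
        text = text[1:-1]
    percent = text.endswith("%")
    text = text.rstrip("%").lstrip("$").replace(",", "")
    try:
        value = Decimal(text)
    except InvalidOperation:
        return None
    if not value.is_finite():  # reject 'NaN' and 'Infinity'
        return None
    if percent:
        value /= 100
    return -value if negative else value


samples = ["$1,234.50", "(45.00)", "12.5%", "n/a"]
parsed = [parse_number(s) for s in samples]
```

Using `Decimal` instead of `float` keeps currency values exact, which matters when summing statement lines.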

Frequently asked questions

How accurate is table extraction?

On clean digital PDFs with well-defined tables (most modern reports): 95%+ accuracy. On scanned PDFs: 80-90%. On complex tables with merged cells, multi-row headers, or visual-only borders: 70-85%, and the output may need spot fixes in Excel.

Can it handle merged cells?

Yes for XLSX output — merged cells in the source PDF stay merged. CSV doesn't have a merged-cells concept, so the merged value gets repeated across the cells it spans.
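The CSV behavior can be sketched in a few lines of Python. The `MergedCell` structure and `flatten_merges` helper are hypothetical names for illustration, not the tool's internals; the sketch only shows the "repeat the value across the span" rule described above.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class MergedCell:
    """A cell anchored at (row, col) that may span several rows or columns."""
    value: str
    row: int
    col: int
    row_span: int = 1
    col_span: int = 1


def flatten_merges(cells: list[MergedCell], n_rows: int, n_cols: int) -> list[list[str]]:
    """Build a plain grid, copying each merged value into every cell it covers."""
    grid = [[""] * n_cols for _ in range(n_rows)]
    for cell in cells:
        for r in range(cell.row, cell.row + cell.row_span):
            for c in range(cell.col, cell.col + cell.col_span):
                grid[r][c] = cell.value
    return grid


cells = [
    MergedCell("Q1", row=0, col=0, col_span=2),  # header merged across two columns
    MergedCell("Jan", row=1, col=0),
    MergedCell("Feb", row=1, col=1),
]
grid = flatten_merges(cells, n_rows=2, n_cols=2)

# Write the flattened grid as CSV text.
buffer = io.StringIO()
csv.writer(buffer).writerows(grid)
csv_text = buffer.getvalue()
```

Here the merged "Q1" header appears in both columns of the first CSV row, so every row keeps the same number of fields.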

What if my PDF has multiple tables?

Each detected table becomes its own sheet (XLSX) or its own file (CSV). For CSV, the download is a ZIP containing one file per table, named table_01.csv, table_02.csv, and so on.
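For scripting against the CSV output, a sketch like the following loads every table from the ZIP using only the Python standard library. The UTF-8 encoding and the `load_tables` helper are assumptions; only the table_01.csv, table_02.csv naming comes from the answer above.

```python
import csv
import io
import zipfile
from typing import BinaryIO, Union


def load_tables(archive: Union[str, BinaryIO]) -> dict[str, list[list[str]]]:
    """Return {member name: rows} for every CSV in the archive, in name order."""
    tables = {}
    with zipfile.ZipFile(archive) as zf:
        for name in sorted(zf.namelist()):
            if not name.lower().endswith(".csv"):
                continue
            with zf.open(name) as raw:
                # Assumed encoding; adjust if the files turn out to differ.
                text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
                tables[name] = list(csv.reader(text))
    return tables


# Build a tiny in-memory archive so the sketch runs without a real download.
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as zf:
    zf.writestr("table_01.csv", "Date,Amount\r\n2024-01-02,-40.00\r\n")
    zf.writestr("table_02.csv", "Fee,Total\r\nWire,25.00\r\n")
demo.seek(0)

tables = load_tables(demo)
```

With a real download, pass the path to the ZIP file instead of the in-memory buffer.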

Does it work for financial PDFs like 10-K reports?

Yes. Financial reports are one of our best-performing input types: income statements, balance sheets, and cash flow statements all extract cleanly into Excel.

Can I extract a specific table only?

Specify page ranges in the options panel to limit extraction to those pages. Every table detected on those pages is extracted, and the rest of the document is ignored.

What about non-table data on the same pages?

Non-table content (paragraphs, headers, footers) is ignored. We only extract regions identified as tables. If the detection misses a table, try cropping to that page first.

Why is this on the Pro plan?

Table detection uses ML inference on every page, which has real compute cost. Free tools either use weak heuristics that fail on real-world PDFs or aggressively rate-limit. We charge transparently and deliver consistent quality.

Ready to extract tables?

Free to use for the basics. Files are auto-deleted within an hour and never used to train AI.
