Question 1

Does it preserve formatting?

Accepted Answer

It preserves text content and reading order. Formatting (bold, italic, font sizes) is not preserved, because the output is plain text. For rich-format conversion, use PDF to Word or PDF to Markdown.

Question 2

Why is the output empty or weird?

Accepted Answer

Most likely your PDF is a scan with no text layer. Run OCR PDF first, then come back here.
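If you want to check for a missing text layer yourself before reaching for OCR, you can look at what the extractor returned. This is a minimal sketch, assuming the text has already been extracted into a Python string. The helper name `looks_like_scan` and the 20-character threshold are hypothetical choices, not part of this tool. It relies on Poppler's `pdftotext` separating pages with form feed characters by default, so a scan tends to produce little more than whitespace and form feeds.

```python
def looks_like_scan(extracted_text: str, min_chars: int = 20) -> bool:
    """Guess whether a PDF had no text layer, based on its extracted text.

    Hypothetical helper: counts characters that are not whitespace
    (form feeds count as whitespace for str.isspace) and flags the
    document when fewer than `min_chars` visible characters remain.
    """
    visible = sum(1 for ch in extracted_text if not ch.isspace())
    return visible < min_chars
```

The threshold is a judgment call: a nearly blank cover page with a real text layer would also be flagged, so treat the result as a hint rather than a verdict.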

Question 3

Does it handle multi-column papers?

Accepted Answer

Poppler's text extraction handles standard 2-column academic layouts well. The output reads in the correct column order.
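To try the same kind of extraction locally, here is a rough sketch that calls Poppler's `pdftotext` command-line tool from Python. It assumes `pdftotext` is installed and on your PATH. The function names and the `paper.pdf` filename are illustrative, and this is not necessarily how this site runs the conversion. Without `-layout` or `-raw`, `pdftotext` undoes the physical layout and emits text in reading order, which is the mode that suits two-column papers.

```python
import subprocess
from typing import List, Optional


def build_pdftotext_cmd(pdf_path: str,
                        first_page: Optional[int] = None,
                        last_page: Optional[int] = None) -> List[str]:
    """Build a pdftotext command line (illustrative helper, not this site's code)."""
    # -enc UTF-8 asks for UTF-8 output; omitting -layout and -raw keeps
    # the default reading-order mode.
    cmd = ["pdftotext", "-enc", "UTF-8"]
    if first_page is not None:
        cmd += ["-f", str(first_page)]  # first page to convert
    if last_page is not None:
        cmd += ["-l", str(last_page)]   # last page to convert
    # A text-file argument of "-" sends the extracted text to stdout.
    cmd += [pdf_path, "-"]
    return cmd


def extract_text(pdf_path: str) -> str:
    """Run pdftotext and return the extracted text as a string."""
    result = subprocess.run(build_pdftotext_cmd(pdf_path),
                            capture_output=True, check=True)
    return result.stdout.decode("utf-8", errors="replace")
```

Usage would look like `text = extract_text("paper.pdf")`. If the columns come out interleaved for an unusual layout, comparing against a run with `-layout` or `-raw` added can help show where the ordering goes wrong.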

Extract text from PDF online — free

Frequently asked questions