Question 1

Does it handle vertical Japanese text?

Accepted Answer

Yes, for modern documents printed vertically. For scanned old books or magazines with right-to-left page order, you may also need to use our Rotate PDF tool to correct page orientation before OCR.
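For readers running OCR locally, here is a minimal sketch of how vertical text might be requested from the Tesseract command-line tool, which the accuracy answer on this page mentions. It assumes the `jpn_vert` and `jpn` language data are installed. The function name `build_tesseract_command` and the file names are hypothetical, and the settings this site's tool actually uses are not documented here.

```python
def build_tesseract_command(image_path, output_base, vertical=True):
    """Build an argument list for the Tesseract CLI (does not run it).

    Assumption: the 'jpn_vert' (vertical) and 'jpn' (horizontal)
    traineddata files are installed alongside Tesseract.
    """
    lang = "jpn_vert" if vertical else "jpn"
    # Page segmentation mode 5: a single uniform block of vertically
    # aligned text. Mode 6: a single uniform block of text.
    psm = "5" if vertical else "6"
    return ["tesseract", image_path, output_base, "-l", lang, "--psm", psm]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(" ".join(build_tesseract_command("page-001.png", "page-001")))
```

The list can be passed to `subprocess.run` once Tesseract is installed. Multi-column pages would likely need a different segmentation mode.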

Question 2

How accurate is Japanese OCR?

Accepted Answer

On clean printed scans at 300 DPI, expect 90-95% character accuracy. Handwritten Japanese is much harder and beyond the scope of standard Tesseract. For handwritten documents, dedicated tools like Google Document AI work better.
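To check accuracy on your own scans, one common approach is to compare the OCR output with a hand-corrected transcript and compute one minus the character error rate. The sketch below uses Levenshtein edit distance. This is an assumed definition of "character accuracy" and not necessarily how the 90-95% figure above was measured. The function names are illustrative.

```python
def levenshtein(a, b):
    """Edit distance between two strings (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def character_accuracy(reference, ocr_output):
    """Return 1 - (edit distance / reference length), floored at 0."""
    if not reference:
        raise ValueError("reference text must not be empty")
    error_rate = levenshtein(reference, ocr_output) / len(reference)
    return max(0.0, 1.0 - error_rate)


if __name__ == "__main__":
    reference = "日本語のテキストです"
    ocr_output = "日本語のテキス卜です"  # katakana ト misread as kanji 卜
    print(f"{character_accuracy(reference, ocr_output):.2%}")
```

Visually similar characters, such as katakana ト and kanji 卜 in the example, are a typical source of substitution errors, so a per-character measure like this is more informative than a word-level one for Japanese.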

Question 3

Will furigana (small phonetic characters above Kanji) be recognized?

Accepted Answer

Yes, but furigana appears as inline characters in the OCR output, which can interrupt the reading flow. For text intended for Japanese learners, this is usually fine. For clean text extraction, you may want to post-process the OCR output to filter out furigana or merge it with its parent Kanji.
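One possible filtering heuristic, sketched below, relies on furigana being printed much smaller than body text. It assumes you have per-token bounding-box heights from the OCR engine (for example, word-level output with a height column). The `(text, height)` tuple format, the function name `drop_small_tokens`, and the 0.6 ratio are illustrative assumptions, not part of this tool.

```python
from statistics import median


def drop_small_tokens(tokens, ratio=0.6):
    """Remove tokens whose box height is well below the median height.

    tokens: list of (text, height_in_pixels) pairs in reading order.
    ratio:  tokens shorter than ratio * median height are treated as
            furigana and dropped (assumed threshold; tune per document).
    """
    if not tokens:
        return ""
    threshold = ratio * median(height for _, height in tokens)
    return "".join(text for text, height in tokens if height >= threshold)


if __name__ == "__main__":
    # Hypothetical OCR tokens: small readings interleaved with body text.
    tokens = [("漢", 40), ("かん", 14), ("字", 41), ("じ", 15),
              ("を", 38), ("読", 40), ("む", 39)]
    print(drop_small_tokens(tokens))
```

This heuristic will also drop other small glyphs, such as superscripts or footnote markers, so it is worth spot-checking the result against the original page.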

OCR Japanese PDF — Kanji, Hiragana, Katakana

Frequently asked questions