Question 1

Will pre-1918 Russian spelling be recognized?

Accepted Answer

Reasonably well for the most common cases, namely the obsolete letters ѣ, і, ѳ, ѵ. Standard Tesseract handles these basics; for systematic processing of pre-revolutionary documents, you may want a specialized historical Russian OCR pipeline.
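If you want to check whether recognized text actually contains pre-1918 spelling (for example, to decide whether a batch deserves a specialized pipeline), a rough heuristic is to count the obsolete letters in the OCR output. The sketch below is only an illustration: the function names and the `min_hits` threshold are hypothetical, and because і is also a regular letter in Ukrainian and Belarusian, it only makes sense for text you already know to be Russian.

```python
from collections import Counter

# The four obsolete letters named above, in lowercase:
# yat (U+0463), decimal i (U+0456), fita (U+0473), izhitsa (U+0475).
OBSOLETE_LETTERS = "\u0463\u0456\u0473\u0475"


def count_obsolete_letters(text: str) -> Counter:
    """Count occurrences of each obsolete letter, ignoring case."""
    return Counter(ch for ch in text.lower() if ch in OBSOLETE_LETTERS)


def looks_pre_reform(text: str, min_hits: int = 3) -> bool:
    """Heuristic: True if the text has at least min_hits obsolete letters.

    min_hits is an arbitrary illustrative threshold, not a tuned value.
    """
    return sum(count_obsolete_letters(text).values()) >= min_hits
```

Running `looks_pre_reform` on a page of OCR output gives a quick yes/no signal; it says nothing about how accurately those letters were recognized.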

Question 2

Can I OCR a mixed Russian + English document?

Accepted Answer

Yes. Select both Russian and English in the language picker. This is useful for scientific papers, technical documentation that contains English code, and documents that keep proper names in Latin characters.
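If you run Tesseract yourself instead of using the picker, the equivalent is to join the language codes with `+` in the `-l` option (for example `-l rus+eng`). The sketch below only builds that command line; the helper names and file names are hypothetical, and it assumes the `rus` and `eng` traineddata files are installed.

```python
from typing import List


def tesseract_lang_arg(langs: List[str]) -> str:
    """Join language codes the way Tesseract's -l option expects: 'rus+eng'."""
    if not langs:
        raise ValueError("at least one language code is required")
    return "+".join(langs)


def build_command(image_path: str, output_base: str, langs: List[str]) -> List[str]:
    """Build the argument list for: tesseract IMAGE OUTPUTBASE -l LANGS."""
    return ["tesseract", image_path, output_base, "-l", tesseract_lang_arg(langs)]


# Hypothetical file names, for illustration only.
command = build_command("scan.png", "scan_text", ["rus", "eng"])
```

The resulting list can be passed to `subprocess.run(command, check=True)`. Putting the dominant language of the document first is a reasonable default.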

Question 3

What about Ukrainian, Bulgarian, or other Cyrillic-script languages?

Accepted Answer

We support those as separate language packs (Ukrainian, Bulgarian, Serbian, Belarusian, Macedonian, Mongolian). Pick the language that matches your source document: using the wrong Cyrillic language pack reduces accuracy significantly, because letter frequencies and letter combinations differ between languages.
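For anyone scripting this with Tesseract directly, each of these languages has its own traineddata code. The mapping below uses the codes from Tesseract's tessdata repository; whether the packs listed above correspond exactly to these files is an assumption, and the helper name is hypothetical.

```python
# Tesseract traineddata codes for the Cyrillic-script languages listed above.
# Assumption: the tool's language packs map one-to-one onto these codes.
CYRILLIC_PACKS = {
    "russian": "rus",
    "ukrainian": "ukr",
    "bulgarian": "bul",
    "serbian": "srp",  # Cyrillic-script Serbian
    "belarusian": "bel",
    "macedonian": "mkd",
    "mongolian": "mon",
}


def pack_for(language: str) -> str:
    """Return the traineddata code for a language name, or raise ValueError."""
    key = language.strip().lower()
    if key not in CYRILLIC_PACKS:
        raise ValueError(f"no Cyrillic language pack known for {language!r}")
    return CYRILLIC_PACKS[key]
```

Failing loudly on an unknown name, rather than falling back to `rus`, avoids silently running the wrong language pack on a document.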

OCR Russian PDF — Cyrillic recognition

Frequently asked questions