
Which languages are supported in pdfOCR?

pdfOCR relies on Tesseract 4.1, so it supports every language Tesseract does. You can always retrieve the latest data models for more than 130 languages and more than 35 scripts from the Tesseract GitHub repository.

If you require fonts to render advanced typography on a separate layer of your OCR document, please check out pdfCalligraph.

Don't forget to specify the path to your Tesseract data in your code with the Tesseract4OcrEngineProperties (Java/.NET) class.

Don't forget to specify the languages you want to OCR with the Tesseract4OcrEngineProperties (Java/.NET) class.
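
Both reminders depend on the language data actually being present in the folder you point to. The following is a minimal sketch in plain Java, not part of pdfOCR, that checks this before you configure the engine. The class name `TessDataCheck` and the method `missingLanguages` are hypothetical helpers introduced for illustration; the only assumption about Tesseract is its usual file naming, where each language model is stored as `<language code>.traineddata` (for example `eng.traineddata`).

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of pdfOCR or Tesseract.
final class TessDataCheck {
    private TessDataCheck() {
    }

    /**
     * Returns the requested language codes that have no
     * "<code>.traineddata" file directly inside tessDataDir.
     * An empty result means every requested language was found.
     */
    static List<String> missingLanguages(Path tessDataDir, List<String> languages) {
        List<String> missing = new ArrayList<>();
        for (String lang : languages) {
            // Assumption: Tesseract's usual naming, e.g. "eng.traineddata".
            Path model = tessDataDir.resolve(lang + ".traineddata");
            if (!Files.isRegularFile(model)) {
                missing.add(lang);
            }
        }
        return missing;
    }
}
```

If the returned list is empty, the same folder and language list can then be handed to `Tesseract4OcrEngineProperties`. In the Java version this is typically done with `setPathToTessData(...)` and `setLanguages(...)`, but check the API documentation of your pdfOCR release for the exact signatures.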
