
pdfOCR module - Onnx

With the release of iText Suite 9.6.0 (pdfOCR 5.0.0), we also released new pdfOCR modules, called pdfocr-onnx-abstract and pdfocr-onnx-cpu, which enable the use of Open Neural Network Exchange (ONNX) compatible models with iText.

The new modules add support for more ONNX models and optional GPU acceleration, along with other improvements. As a result, they replace the now-deprecated pdfOCR onnxTR module from earlier releases.

Getting started is straightforward, especially if you are already familiar with the pdfOCR API (Java/.NET).

If you haven’t installed pdfOCR yet, you can find the installation instructions for Java here and for .NET here.

Java

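// DETECTION and RECOGNITION stand for the downloaded detection and recognition
// model files; PATH_TO_OUTPUT_PDF is the output location (none are defined in this snippet).
IDetectionPredictor detectionPredictor = OnnxDetectionPredictor.paddleOcr(DETECTION);
IRecognitionPredictor recognitionPredictor = OnnxRecognitionPredictor.paddleOcr(RECOGNITION);

// The engine is closeable; try-with-resources releases it when OCR is done.
try (OnnxOcrEngine ocrEngine = new OnnxOcrEngine(detectionPredictor, recognitionPredictor)) {
    OcrPdfCreator ocrPdfCreator = new OcrPdfCreator(ocrEngine);
    try (PdfWriter writer = new PdfWriter(PATH_TO_OUTPUT_PDF)) {
        String imagePath = "src/images/image.png";
        // Run OCR on the image and write the resulting PDF through the writer.
        PdfDocument pdf = ocrPdfCreator.createPdf(Collections.singletonList(new File(imagePath)), writer);
        pdf.close();
    }
}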

Note that the OnnxOcrEngine (Java/.NET) constructor takes two arguments:

  • Detection - the predictor that identifies where text appears in the document.

  • Recognition - the predictor that identifies what the text is at the location detected by the detection predictor.
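
The division of labor between the two predictors can be sketched in plain Java. This is a minimal, library-free illustration and not the iText API: the Detector, Recognizer, Box, and TextBlock types are hypothetical names invented here, and the stand-in lambdas return canned values instead of running ONNX models.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical types for illustration only; none of these are iText classes.
class TwoStageOcrSketch {

    /** Axis-aligned region of the image where text was found. */
    static final class Box {
        final int x, y, width, height;

        Box(int x, int y, int width, int height) {
            this.x = x;
            this.y = y;
            this.width = width;
            this.height = height;
        }
    }

    /** A detected region paired with the text read from it. */
    static final class TextBlock {
        final Box box;
        final String text;

        TextBlock(Box box, String text) {
            this.box = box;
            this.text = text;
        }
    }

    /** Stage 1: answers "where is the text?". */
    interface Detector {
        List<Box> detect(String imageName);
    }

    /** Stage 2: answers "what does the text in this region say?". */
    interface Recognizer {
        String recognize(String imageName, Box box);
    }

    /** Runs detection first, then recognition on each detected region. */
    static List<TextBlock> run(Detector detector, Recognizer recognizer, String imageName) {
        List<TextBlock> result = new ArrayList<>();
        for (Box box : detector.detect(imageName)) {
            result.add(new TextBlock(box, recognizer.recognize(imageName, box)));
        }
        return result;
    }

    public static void main(String[] args) {
        // Stand-in predictors with canned output; real ones would run ONNX models.
        Detector detector = image -> Arrays.asList(new Box(10, 10, 80, 20), new Box(10, 40, 120, 20));
        Recognizer recognizer = (image, box) -> box.y < 30 ? "Hello" : "world";

        for (TextBlock block : run(detector, recognizer, "image.png")) {
            System.out.println("(" + block.box.x + ", " + block.box.y + ") -> " + block.text);
        }
    }
}
```

The point of the split is that either stage can be swapped independently, which is why the engine constructor accepts the two predictors as separate arguments.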

Because pdfOCR works with ONNX models, it can support multiple OCR engines (currently docTR, PaddleOCR, and EasyOCR). Download the ONNX model(s) you want to use, then pass them to OnnxOcrEngine through the detection and recognition predictors.

You can find a wide range of compatible PaddleOCR/EasyOCR models in the following Hugging Face repository: https://huggingface.co/itextresearch

For docTR, we currently recommend the following models:

Download the model .onnx files and pass them to OnnxDetectionPredictor.fast() and OnnxRecognitionPredictor.crnnVgg16() respectively (Java/.NET).

More examples can be found on our GitHub for Java and .NET:

PdfOcrOnnxExample

Performs OCR using the provided OnnxOcrEngine for the given list of input images and saves output to a PDF file using the provided path.

PdfOcrOnnxMultilingualExample

Performs OCR using the onnxtr-parseq-multilingual-v1.onnx recognition model for the given list of input images containing text in different Latin-script languages.

This example also demonstrates how to make the recognition result visible by using OcrPdfCreatorProperties to set the color of the recognized text.

PdfOcrOnnxPdfAsInputExample

Performs OCR of all images in an input PDF file and generates a searchable PDF using the provided OnnxOcrEngine.

PdfOcrOnnxTextPositioningExample

Specifies how text is retrieved from the OCR engine output by setting TextPositioning (to collect text by lines or by words) in OnnxEngineProperties, then performs OCR on the given images using the provided OnnxOcrEngine and saves the output to a PDF file.
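
To illustrate the difference between collecting text by words and by lines, here is a small library-free sketch. It is not how pdfOCR implements TextPositioning: the Word type and the byWords and byLines helpers are hypothetical, and the sketch assumes that words on the same line share exactly the same baseline y-coordinate.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration; not part of the pdfOCR API.
class TextPositioningSketch {

    /** A recognized word with its left edge and baseline position. */
    static final class Word {
        final String text;
        final int x;
        final int baselineY;

        Word(String text, int x, int baselineY) {
            this.text = text;
            this.x = x;
            this.baselineY = baselineY;
        }
    }

    /** "By words": every recognized word stays a separate text item. */
    static List<String> byWords(List<Word> words) {
        List<String> items = new ArrayList<>();
        for (Word word : words) {
            items.add(word.text);
        }
        return items;
    }

    /** "By lines": words sharing a baseline are joined left to right into one item. */
    static List<String> byLines(List<Word> words) {
        // TreeMap keeps the lines ordered by ascending baseline coordinate.
        Map<Integer, List<Word>> lines = new TreeMap<>();
        for (Word word : words) {
            lines.computeIfAbsent(word.baselineY, key -> new ArrayList<>()).add(word);
        }
        List<String> items = new ArrayList<>();
        for (List<Word> line : lines.values()) {
            line.sort(Comparator.comparingInt((Word word) -> word.x));
            StringBuilder joined = new StringBuilder();
            for (Word word : line) {
                if (joined.length() > 0) {
                    joined.append(' ');
                }
                joined.append(word.text);
            }
            items.add(joined.toString());
        }
        return items;
    }

    public static void main(String[] args) {
        List<Word> words = Arrays.asList(
                new Word("world", 60, 10),
                new Word("Hello", 10, 10),
                new Word("again", 10, 40));
        System.out.println("By words: " + byWords(words));
        System.out.println("By lines: " + byLines(words));
    }
}
```

Line-level items tend to give more natural text selection in the output PDF, while word-level items keep each word at its own detected position.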

PdfOcrOnnxTxtFileExample

Performs OCR using the provided OnnxOcrEngine for the given list of input images and saves the output to a text file using the provided path.

CustomOnnxRuntimeSessionOptionsExample

Shows how to provide a custom ai.onnxruntime.OrtSession.SessionOptions, which is used to construct the OrtSession that wraps an ONNX model and allows inference calls. This lets you specify whether to run OCR on GPU or CPU, as well as the execution mode, optimization level, and other options.

To run models on a GPU, add the pdfocr-onnx-abstract and onnxruntime_gpu dependencies. com.itextpdf.pdfocr.onnx.DefaultOrtSessionOptionsCreator supports GPU mode by default, so no additional changes are required unless you want to set custom options.

PdfOcrOnnxPaddleOcrExample

Shows how to perform OCR using OnnxOcrEngine and PaddleOCR ML-models for the given list of input images, and save output to a PDF file using the provided path.

PaddleOCR models converted to ONNX format can be found at https://huggingface.co/itextresearch.

PdfOcrOnnxEasyOcrExample

Shows how to perform OCR using OnnxOcrEngine and EasyOCR ML-models for the given list of input images, and save output to a PDF file using the provided path.

EasyOCR models converted to ONNX format can be found at https://huggingface.co/itextresearch.

PdfOcrOnnxDisableArbitraryRotationExample

Shows how to disable arbitrary rotation of the OCR result for the given list of input images. In this example, only text rotations of 0, 90, 180 and 270 degrees are used.
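
What restricting rotation to right angles means can be shown with a small, library-free sketch. It is an illustration only and not the algorithm pdfOCR uses: the snapToRightAngle helper is a hypothetical name, and it simply rounds an arbitrary angle in degrees to the nearest of 0, 90, 180 and 270.

```java
// Hypothetical illustration; not part of the pdfOCR API.
class RotationSnapSketch {

    /** Rounds an angle in degrees to the nearest multiple of 90, normalized to 0, 90, 180 or 270. */
    static int snapToRightAngle(double degrees) {
        // Normalize into [0, 360) so negative and large angles are handled.
        double normalized = ((degrees % 360.0) + 360.0) % 360.0;
        // Round to the nearest quarter turn; four quarter turns wrap back to 0.
        int quarterTurns = (int) Math.round(normalized / 90.0) % 4;
        return quarterTurns * 90;
    }

    public static void main(String[] args) {
        double[] detectedAngles = {3.0, 47.0, 182.5, -85.0, 359.0};
        for (double angle : detectedAngles) {
            System.out.println(angle + " degrees -> " + snapToRightAngle(angle));
        }
    }
}
```

With arbitrary rotation enabled, slightly skewed text keeps its detected angle; with it disabled, the text is placed at one of the four right-angle orientations.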

EasyOcrDisableTextBoxMergerExample

Shows how to perform OCR using OnnxOcrEngine and EasyOCR ML-models for the given list of input images, while disabling the text box merging algorithm for EasyOCR’s detection post-processor.
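
As a rough idea of what a text box merging step does, the following library-free sketch joins boxes on one line when the horizontal gap between them is small. It is a one-dimensional simplification under stated assumptions, not the algorithm used by EasyOCR or pdfOCR: the Span type, the merge helper, and the maxGap parameter are hypothetical names invented here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration; not part of the pdfOCR or EasyOCR API.
class TextBoxMergeSketch {

    /** Horizontal extent of a detected text box on a single line. */
    static final class Span {
        final int left;
        final int right;

        Span(int left, int right) {
            this.left = left;
            this.right = right;
        }
    }

    /** Merges spans whose horizontal gap to the previous span is at most maxGap. */
    static List<Span> merge(List<Span> spans, int maxGap) {
        // Work on a sorted copy so the input list is left untouched.
        List<Span> sorted = new ArrayList<>(spans);
        sorted.sort(Comparator.comparingInt((Span span) -> span.left));

        List<Span> merged = new ArrayList<>();
        for (Span span : sorted) {
            if (!merged.isEmpty()) {
                Span last = merged.get(merged.size() - 1);
                if (span.left - last.right <= maxGap) {
                    // Close enough: extend the previous span instead of adding a new one.
                    merged.set(merged.size() - 1,
                            new Span(last.left, Math.max(last.right, span.right)));
                    continue;
                }
            }
            merged.add(span);
        }
        return merged;
    }

    public static void main(String[] args) {
        List<Span> detected = Arrays.asList(new Span(50, 60), new Span(0, 10), new Span(12, 20));
        for (Span span : merge(detected, 5)) {
            System.out.println("[" + span.left + ", " + span.right + "]");
        }
    }
}
```

Disabling merging corresponds to keeping every detected box as is, which can be preferable when neighboring boxes belong to unrelated text, such as adjacent table cells.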

For the complete functional tests, see the Java and .NET tests in our GitHub repository.
