
pdfOCR module - onnxTR

With the release of iText Suite 9.3.0 (pdfOCR 4.1.0), we have also released a new pdfOCR module, called pdfocr-onnxtr, which enables us to use Open Neural Network Exchange (ONNX) compatible models with iText.

Using it is straightforward, especially if you are already familiar with the pdfOCR API (Java/.NET).

If you haven’t installed it, you can find the Java installation instructions here and for .NET here.

Java

// FAST and CRNNVGG16 are constants holding the paths to the downloaded .onnx model files.
IDetectionPredictor detectionPredictor = OnnxDetectionPredictor.fast(FAST);
IRecognitionPredictor recognitionPredictor = OnnxRecognitionPredictor.crnnVgg16(CRNNVGG16);

// The OCR engine is closed automatically at the end of the try block.
try (OnnxTrOcrEngine ocrEngine = new OnnxTrOcrEngine(detectionPredictor, recognitionPredictor)) {
    OcrPdfCreator ocrPdfCreator = new OcrPdfCreator(ocrEngine);
    try (PdfWriter writer = new PdfWriter(PATH_TO_OUTPUT_PDF)) {
        String imagePath = "src/images/rotatedBy90Degrees.png";
        // Run OCR on the image and write the result to the output PDF.
        PdfDocument pdf = ocrPdfCreator.createPdf(Collections.singletonList(new File(imagePath)), writer);
        pdf.close();
    }
}

Note that the OnnxTrOcrEngine (Java/.NET) constructor takes two arguments:

  • Detection - this is the predictor that identifies where text is present in the document.

  • Recognition - this is the predictor that identifies which text is present in the regions found by the detection predictor.
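To make the split between the two predictors concrete, here is a toy sketch of a two-stage pipeline. It is only an illustration of the data flow: the Detector and Recognizer interfaces, the use of a plain string as the "page", and all method names are hypothetical and are not part of the pdfOCR API.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy two-stage OCR pipeline. None of these types are iText classes. */
final class TwoStageOcrSketch {

    /** Stage 1: finds where text is. A "region" here is a [start, end) span in a string. */
    interface Detector {
        List<int[]> detect(String page);
    }

    /** Stage 2: reads the text inside one detected region. */
    interface Recognizer {
        String recognize(String page, int[] region);
    }

    /** Runs detection first, then recognition on each detected region only. */
    static List<String> run(String page, Detector detector, Recognizer recognizer) {
        List<String> texts = new ArrayList<>();
        for (int[] region : detector.detect(page)) {
            texts.add(recognizer.recognize(page, region));
        }
        return texts;
    }

    /** Stand-in detector: every run of non-space characters counts as a text region. */
    static List<int[]> detectWords(String page) {
        List<int[]> regions = new ArrayList<>();
        int start = -1;
        for (int i = 0; i <= page.length(); i++) {
            boolean inWord = i < page.length() && page.charAt(i) != ' ';
            if (inWord && start < 0) {
                start = i;                          // a region begins
            } else if (!inWord && start >= 0) {
                regions.add(new int[] {start, i});  // a region ends
                start = -1;
            }
        }
        return regions;
    }

    public static void main(String[] args) {
        // Stand-in recognizer: simply copies the characters inside the region.
        Recognizer reader = (page, region) -> page.substring(region[0], region[1]);
        System.out.println(run("  hello   OCR world ", TwoStageOcrSketch::detectWords, reader));
    }
}
```

The recognizer never sees the whole page, only the regions handed over by the detector, which is why both predictors have to be supplied to the engine.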

Although ONNX support means that, in theory, multiple engines can be used, the ones we currently recommend are the following:

You will have to download the model .onnx files and pass them to OnnxDetectionPredictor.fast() and OnnxRecognitionPredictor.crnnVgg16() respectively (Java/.NET).
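Because the models are downloaded separately, it can help to check that the files are actually in place before creating the predictors. The following is a minimal sketch of such a check using only the Java standard library. The OnnxModelFiles class and the requireModel method are hypothetical helpers, not part of iText.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Locale;

/** Pre-flight check for downloaded model files. A helper sketch, not part of iText. */
final class OnnxModelFiles {

    /**
     * Returns the absolute path of the model file as a String.
     * Throws IllegalArgumentException if the name does not end in .onnx
     * or if no regular file exists at that location.
     */
    static String requireModel(String modelPath) {
        if (!modelPath.toLowerCase(Locale.ROOT).endsWith(".onnx")) {
            throw new IllegalArgumentException("Not an .onnx file: " + modelPath);
        }
        Path path = Paths.get(modelPath).toAbsolutePath();
        if (!Files.isRegularFile(path)) {
            throw new IllegalArgumentException("Model file not found: " + path);
        }
        return path.toString();
    }

    public static void main(String[] args) {
        // Pass the paths of the downloaded model files as command-line arguments.
        for (String arg : args) {
            try {
                System.out.println("OK: " + requireModel(arg));
            } catch (IllegalArgumentException e) {
                System.out.println("Problem: " + e.getMessage());
            }
        }
    }
}
```

Assuming the factory methods accept the model path as a String, as the FAST and CRNNVGG16 constants in the Java example suggest, the checked paths can then be passed to OnnxDetectionPredictor.fast() and OnnxRecognitionPredictor.crnnVgg16().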

More examples can be found on our GitHub for Java and .NET:

PdfOcrOnnxTrExample

Performs OCR using provided OnnxTrOcrEngine for the given list of input images and saves output to a PDF file using provided path.

PdfOcrOnnxTrMultilingualExample

Performs OCR using the onnxtr-parseq-multilingual-v1.onnx recognition model for the given list of input images in different Latin languages.

This example also demonstrates how to make the recognition result visible by using OcrPdfCreatorProperties to set the color of the recognized text.

PdfOcrOnnxTrPdfAsInputExample

Performs OCR of all images in an input PDF file and generates searchable PDF using provided OnnxTrOcrEngine.

PdfOcrOnnxTrTextPositioningExample

Performs OCR using the provided OnnxTrOcrEngine for the given images and saves the output to a PDF file. The TextPositioning setting in OnnxTrEngineProperties defines how text is retrieved from the OCR engine output: collected by lines or by words.
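The difference between collecting text by lines and by words can be pictured with a small toy example. This sketch only illustrates the two groupings. The Word class and both methods are hypothetical, and the code does not reflect how pdfOCR implements TextPositioning internally.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy illustration of "by words" versus "by lines" grouping. Not iText code. */
final class TextPositioningSketch {

    /** A recognized word plus the index of the text line it was found on. */
    static final class Word {
        final String text;
        final int line;

        Word(String text, int line) {
            this.text = text;
            this.line = line;
        }
    }

    /** By words: every recognized word stays a separate text item. */
    static List<String> byWords(List<Word> words) {
        List<String> items = new ArrayList<>();
        for (Word word : words) {
            items.add(word.text);
        }
        return items;
    }

    /** By lines: words with the same line index are joined into one item, in input order. */
    static List<String> byLines(List<Word> words) {
        Map<Integer, StringBuilder> lines = new LinkedHashMap<>();
        for (Word word : words) {
            StringBuilder line = lines.computeIfAbsent(word.line, k -> new StringBuilder());
            if (line.length() > 0) {
                line.append(' ');
            }
            line.append(word.text);
        }
        List<String> items = new ArrayList<>();
        for (StringBuilder line : lines.values()) {
            items.add(line.toString());
        }
        return items;
    }

    public static void main(String[] args) {
        List<Word> words = Arrays.asList(
                new Word("Searchable", 0), new Word("PDF", 0), new Word("output", 1));
        System.out.println("By words: " + byWords(words));
        System.out.println("By lines: " + byLines(words));
    }
}
```

With word-level items each word carries its own position, while line-level items keep the words of one line together as a single string.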

PdfOcrOnnxTrTxtFileExample

Performs OCR using provided OnnxTrOcrEngine for the given list of input images and saves output to a text file using provided path.

For the complete functional tests, see the Java and .NET tests in our GitHub repository.
