pdfOCR module - onnxTR

With the release of iText Suite 9.3.0 (pdfOCR 4.1.0), we have also released a new pdfOCR module, pdfocr-onnxtr, which lets you use Open Neural Network Exchange (ONNX) compatible models with iText.

It is straightforward to use, especially if you are already familiar with the pdfOCR API (Java/.NET).

If you haven't installed it yet, you can find the installation instructions for Java here and for .NET here.

Java

// FAST and CRNNVGG16 point to the downloaded .onnx model files.
IDetectionPredictor detectionPredictor = OnnxDetectionPredictor.fast(FAST);
IRecognitionPredictor recognitionPredictor = OnnxRecognitionPredictor.crnnVgg16(CRNNVGG16);

// The engine is closeable, so create it in a try-with-resources block.
try (OnnxTrOcrEngine ocrEngine = new OnnxTrOcrEngine(detectionPredictor, recognitionPredictor)) {
    OcrPdfCreator ocrPdfCreator = new OcrPdfCreator(ocrEngine);
    // PATH_TO_OUTPUT_PDF is the location of the PDF to be created.
    try (PdfWriter writer = new PdfWriter(PATH_TO_OUTPUT_PDF)) {
        String imagePath = "src/images/rotatedBy90Degrees.png";
        // Run OCR on the input image and write the result to the output PDF.
        PdfDocument pdf = ocrPdfCreator.createPdf(Collections.singletonList(new File(imagePath)), writer);
        pdf.close();
    }
}

Note that the OnnxTrOcrEngine (Java/.NET) constructor takes two arguments:

Detection - the predictor that identifies where text is present in the document.
Recognition - the predictor that identifies which text is present in the regions found by the detection predictor.
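
To make the division of labour between the two predictors concrete, here is a minimal, library-independent sketch of a two-stage pipeline. All types in it (`Box`, `TextInBox`, `Detector`, `Recognizer`) are hypothetical names invented for illustration. They are not the pdfOCR interfaces, and the stub stages only stand in for real models.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustration only: hypothetical types, NOT the pdfOCR API. */
final class TwoStageOcrSketch {

    /** An axis-aligned image region in which text was detected. */
    record Box(int x, int y, int width, int height) {}

    /** A recognized piece of text together with the region it came from. */
    record TextInBox(Box box, String text) {}

    /** Stage 1: finds where text is located in an image. */
    interface Detector {
        List<Box> detect(String imageName);
    }

    /** Stage 2: reads the text inside one detected region. */
    interface Recognizer {
        String recognize(String imageName, Box box);
    }

    /** Runs detection first, then recognition on each detected region, in order. */
    static List<TextInBox> run(Detector detector, Recognizer recognizer, String imageName) {
        List<TextInBox> result = new ArrayList<>();
        for (Box box : detector.detect(imageName)) {
            result.add(new TextInBox(box, recognizer.recognize(imageName, box)));
        }
        return result;
    }

    public static void main(String[] args) {
        // Stub stages standing in for real detection and recognition models.
        Detector detector = image -> List.of(new Box(10, 10, 80, 20), new Box(10, 40, 120, 20));
        Recognizer recognizer = (image, box) -> box.y() < 30 ? "Hello" : "World";

        for (TextInBox item : run(detector, recognizer, "page.png")) {
            System.out.println(item.box() + " -> " + item.text());
        }
    }
}
```

The point of the sketch is only the ordering: recognition never sees the whole page, just the regions that detection produced, which is why the engine needs both predictors.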

Although ONNX support means that, in theory, multiple models can be used, we currently recommend the following:

Felix92/onnxtr-fast-tiny for detection
Felix92/doctr-dummy-torch-crnn-vgg16-bn for recognition

You will have to download the model .onnx files and pass them to OnnxDetectionPredictor.fast() and OnnxRecognitionPredictor.crnnVgg16() respectively (Java/.NET).
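
Since the models are downloaded separately, it can help to check that the files are actually in place before creating the predictors. The following is a minimal sketch using only the Java standard library. The `OnnxModelFiles` class and its `requireOnnxFile` method are hypothetical helpers, not part of pdfOCR, and the assumption that the predictor factory methods accept a file path is taken from the example above.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Locale;

/** Hypothetical helper, not part of pdfOCR: fails fast if a model file is missing. */
final class OnnxModelFiles {

    /**
     * Returns the absolute path of the given model file, or throws
     * IllegalArgumentException if it is not an existing .onnx file.
     */
    static String requireOnnxFile(String location) {
        Path path = Paths.get(location).toAbsolutePath().normalize();
        Path name = path.getFileName(); // null for a root path such as "/"
        if (name == null || !name.toString().toLowerCase(Locale.ROOT).endsWith(".onnx")) {
            throw new IllegalArgumentException("Not an .onnx file: " + path);
        }
        if (!Files.isRegularFile(path)) {
            throw new IllegalArgumentException("Model file not found: " + path);
        }
        return path.toString();
    }

    public static void main(String[] args) {
        // Pass the locations of the downloaded model files as program arguments.
        for (String arg : args) {
            System.out.println("OK: " + requireOnnxFile(arg));
        }
        // The validated paths would then be handed to the predictors, e.g.
        // OnnxDetectionPredictor.fast(...) and OnnxRecognitionPredictor.crnnVgg16(...).
    }
}
```

Failing early with a clear message is usually easier to diagnose than an error raised later, when the model is first loaded.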

More examples can be found on our GitHub for Java and .NET:

`PdfOcrOnnxTrExample`	Performs OCR on the given list of input images using the provided `OnnxTrOcrEngine` and saves the output to a PDF file at the provided path.
`PdfOcrOnnxTrMultilingualExample`	Performs OCR on the given list of input images in different Latin languages using the onnxtr-parseq-multilingual-v1.onnx recognition model. This example also demonstrates how to display the recognition result by using `OcrPdfCreatorProperties` to set the color of the recognized text.
`PdfOcrOnnxTrPdfAsInputExample`	Performs OCR on all images in an input PDF file and generates a searchable PDF using the provided `OnnxTrOcrEngine`.
`PdfOcrOnnxTrTextPositioningExample`	Specifies `TextPositioning` in `OnnxTrEngineProperties` to define how text is retrieved from the OCR engine output (collected by lines or by words), then performs OCR on the given images using the provided `OnnxTrOcrEngine` and saves the output to a PDF file.
`PdfOcrOnnxTrTxtFileExample`	Performs OCR on the given list of input images using the provided `OnnxTrOcrEngine` and saves the output to a text file at the provided path.

For the complete functional tests, be sure to check our GitHub repositories for the Java and .NET tests.