pdfOCR module - Onnx
iText Suite 9.6.0 (pdfOCR 5.0.0) introduces two new pdfOCR modules, pdfocr-onnx-abstract and pdfocr-onnx-cpu, which enable the use of Open Neural Network Exchange (ONNX) compatible models with iText.
The new modules bring additional ONNX model support and optional GPU acceleration, along with other improvements. They therefore replace the pdfOCR module - onnxTR - [Deprecated] from earlier releases.
Using the new modules is straightforward, especially if you are already familiar with the pdfOCR API (Java/.NET).
Java
// DETECTION and RECOGNITION identify the ONNX models to use
IDetectionPredictor detectionPredictor = OnnxDetectionPredictor.paddleOcr(DETECTION);
IRecognitionPredictor recognitionPredictor = OnnxRecognitionPredictor.paddleOcr(RECOGNITION);
// try-with-resources closes the engine once OCR is finished
try (OnnxOcrEngine ocrEngine = new OnnxOcrEngine(detectionPredictor, recognitionPredictor)) {
    OcrPdfCreator ocrPdfCreator = new OcrPdfCreator(ocrEngine);
    try (PdfWriter writer = new PdfWriter(PATH_TO_OUTPUT_PDF)) {
        String imagePath = "src/images/image.png";
        // Run OCR on the image and write the resulting PDF
        PdfDocument pdf = ocrPdfCreator.createPdf(Collections.singletonList(new File(imagePath)), writer);
        pdf.close();
    }
}
Note that the OnnxOcrEngine (Java/.NET) constructor takes two arguments:
Detection - the predictor that identifies where text appears in the document.
Recognition - the predictor that identifies what the text is at the location detected by the detection predictor.
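To make the split between the two stages concrete, here is a minimal toy sketch of a detect-then-recognize pipeline. It does not use the iText API: the class TwoStageOcrSketch, the Box type, and the Detector and Recognizer interfaces are hypothetical names invented for illustration, and the "page" is just an array of text rows rather than an image.

```java
import java.util.ArrayList;
import java.util.List;

// Toy illustration of a detect-then-recognize pipeline; NOT the iText API.
class TwoStageOcrSketch {

    // Hypothetical region on the page: row y, starting column x, given width.
    static final class Box {
        final int x;
        final int y;
        final int width;

        Box(int x, int y, int width) {
            this.x = x;
            this.y = y;
            this.width = width;
        }
    }

    // Hypothetical stage 1: answers "where is the text?"
    interface Detector {
        List<Box> detect(String[] page);
    }

    // Hypothetical stage 2: answers "what does the text at this location say?"
    interface Recognizer {
        String recognize(String[] page, Box box);
    }

    // Stand-in detector: one box per non-blank row, covering its trimmed content.
    static final Detector ROW_DETECTOR = page -> {
        List<Box> boxes = new ArrayList<>();
        for (int y = 0; y < page.length; y++) {
            String trimmed = page[y].trim();
            if (trimmed.isEmpty()) {
                continue; // nothing to detect on this row
            }
            boxes.add(new Box(page[y].indexOf(trimmed), y, trimmed.length()));
        }
        return boxes;
    };

    // Stand-in recognizer: reads the characters inside the detected box.
    static final Recognizer SUBSTRING_RECOGNIZER =
            (page, box) -> page[box.y].substring(box.x, box.x + box.width);

    // The recognizer only ever sees regions that the detector reported.
    static List<String> run(String[] page, Detector detector, Recognizer recognizer) {
        List<String> results = new ArrayList<>();
        for (Box box : detector.detect(page)) {
            results.add(recognizer.recognize(page, box));
        }
        return results;
    }

    public static void main(String[] args) {
        String[] page = {"  Hello  ", "", " world"};
        System.out.println(run(page, ROW_DETECTOR, SUBSTRING_RECOGNIZER));
    }
}
```

The point of the sketch is the dependency between the stages: recognition quality is bounded by what detection finds, which is why the two predictors are supplied separately.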
Because the modules work with ONNX models, they can support multiple OCR engines (currently docTR, PaddleOCR, and EasyOCR). You need to download the ONNX model(s) you want to use yourself, and then specify them when creating the predictors that are passed to OnnxOcrEngine.
You can find a wide range of compatible PaddleOCR/EasyOCR models in the following Hugging Face repository: https://huggingface.co/itextresearch
For docTR, we currently recommend the following models:
Felix92/onnxtr-fast-tiny for detection
Felix92/doctr-dummy-torch-crnn-vgg16-bn for recognition
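Since the model files are downloaded separately, a wrong or missing path is an easy mistake to make. The following is a minimal sketch of a check you might run before handing model locations to the predictors. It uses only java.nio and no iText API; the class ModelFileCheck and the method requireOnnxModel are hypothetical names, and the assumption that models are stored locally as files with an .onnx extension is ours.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Locale;

// Hypothetical helper; not part of iText.
class ModelFileCheck {

    // Returns the path if it points to a readable, non-empty .onnx file.
    // Throws IllegalArgumentException for a wrong extension and IOException
    // if the file is missing, unreadable, or empty.
    static Path requireOnnxModel(String location) throws IOException {
        Path path = Paths.get(location);
        Path fileName = path.getFileName(); // null for a root path such as "/"
        if (fileName == null
                || !fileName.toString().toLowerCase(Locale.ROOT).endsWith(".onnx")) {
            throw new IllegalArgumentException("Not an .onnx file: " + location);
        }
        if (!Files.isRegularFile(path) || !Files.isReadable(path)) {
            throw new IOException("Model file not found or not readable: " + location);
        }
        if (Files.size(path) == 0) {
            throw new IOException("Model file is empty: " + location);
        }
        return path;
    }
}
```

Failing early with a clear message tends to be easier to diagnose than an error raised later while the model is being loaded.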
More examples can be found on our GitHub for Java and .NET:
| Performs OCR using the provided |
| Performs OCR using the onnxtr-parseq-multilingual-v1.onnx recognition model for the given list of input images in different Latin-script languages. This example also demonstrates how to show the recognition result using |
| Performs OCR of all images in an input PDF file and generates a searchable PDF using the provided |
| Defines the way text is retrieved from OCR engine output specifying |
| Performs OCR using provided |
| Shows how to provide custom In order to run models on GPU, add |
| Shows how to perform OCR using PaddleOCR. Models converted to ONNX format can be found at https://huggingface.co/itextresearch. |
| Shows how to perform OCR using EasyOCR. Models converted to ONNX format can be found at https://huggingface.co/itextresearch. |
| Shows how to disable arbitrary rotation of the OCR result for the given list of input images. As a result of this particular example, only |
| Shows how to perform OCR using |