pdfOCR: How to run ONNX models on GPU
The release of pdfOCR 5.0.0 introduced optional GPU acceleration for the pdfOCR ONNX engine, which frees the CPU for other tasks and can also deliver major performance gains.
ONNX Runtime supports multiple execution providers for hardware acceleration, although not all are ready for production. At present, we have only tested pdfOCR with NVIDIA CUDA-enabled GPUs, so refer to ONNX Runtime’s official docs on execution providers for other hardware.
Requirements
Windows builds require the latest Visual C++ runtime. For Linux and macOS, a compatible OpenCvSharp4.runtime.* dependency is required.
The ONNX Runtime GPU package requires CUDA and cuDNN to be installed. See Install ONNX Runtime for the complete list of requirements and installation guides.
Java
In order to run ONNX models on GPU, you need to add pdfocr-onnx-abstract and onnxruntime_gpu dependencies as mentioned in Installing iText pdfOCR for Java developers and PdfOCR Tutorials:
<dependencies>
    <dependency>
        <groupId>com.itextpdf</groupId>
        <artifactId>pdfocr-onnx-abstract</artifactId>
        <version>${itext.pdfocr.version}</version>
    </dependency>
    <!-- OnnxRuntime GPU to use with pdfocr-onnx-abstract. Source: https://mvnrepository.com/artifact/com.microsoft.onnxruntime/onnxruntime_gpu -->
    <dependency>
        <groupId>com.microsoft.onnxruntime</groupId>
        <artifactId>onnxruntime_gpu</artifactId>
        <version>${onnxruntime.version}</version>
    </dependency>
</dependencies>
.NET
In order to run ONNX models on GPU, you need to add pdfocr.onnx.abstract and Microsoft.ML.OnnxRuntime.Gpu dependencies as mentioned in Installing iText pdfOCR for .NET developers and PdfOCR Tutorials:
<ItemGroup>
<PackageReference Include="itext.pdfocr.onnx.abstract" Version="5.0.0" />
<PackageReference Include="Microsoft.ML.OnnxRuntime.Gpu" Version="1.24.4" />
</ItemGroup>
It’s possible to use other Execution Providers instead of Microsoft.ML.OnnxRuntime.Gpu, e.g. Microsoft.ML.OnnxRuntime.DirectML.
Configuration
No additional configuration is required to run models on GPU: com.itextpdf.pdfocr.onnx.DefaultOrtSessionOptionsCreator supports GPU mode by default.
However, you can supply custom SessionOptions, which are used to construct the OrtSession that wraps an ONNX model and allows inference calls. With them you can specify whether to run OCR on GPU or CPU (by choosing providers), the execution mode, the optimization level, and other options.
For example, in CPU mode the models run through ONNX Runtime, which is configured by default to use all available CPU cores for text detection and recognition:
setIntraOpNumThreads(-1) - uses all available cores for parallel computation within individual operations
setInterOpNumThreads(-1) - uses all available cores for executing multiple requests concurrently, if executing on a CPU
You can override these SessionOptions properties.
For an example of customizing the ONNX Runtime session configuration used by the pdfOCR ONNX-based engine when creating searchable PDFs from images, see pdfOCR: Custom session options for an ONNX model.
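When overriding the thread settings, you need explicit values rather than -1. The sketch below shows one possible way to derive such a value from the number of cores the JVM reports, leaving a few cores free for the rest of the application. ThreadBudget and threadsFor are hypothetical names, not part of pdfOCR or ONNX Runtime, and the choice of two reserved cores is only an assumption for illustration.

```java
/** Hypothetical helper (not part of pdfOCR or ONNX Runtime) for choosing explicit thread counts. */
final class ThreadBudget {

    /**
     * Returns how many threads to use when reservedCores cores should stay
     * free for other work. The result is never lower than 1.
     */
    static int threadsFor(int availableCores, int reservedCores) {
        if (availableCores < 1 || reservedCores < 0) {
            throw new IllegalArgumentException(
                    "availableCores must be >= 1 and reservedCores must be >= 0");
        }
        return Math.max(1, availableCores - reservedCores);
    }

    public static void main(String[] args) {
        // Number of logical cores visible to the JVM on this machine.
        int cores = Runtime.getRuntime().availableProcessors();
        // Assumption: keep two cores free for the rest of the application.
        int intraOp = threadsFor(cores, 2);
        System.out.println("cores=" + cores + ", intra-op threads=" + intraOp);
    }
}
```

The resulting number could then be passed to setIntraOpNumThreads in your custom session options in place of -1.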
Compatible PaddleOCR/EasyOCR models already converted to ONNX format are available from our Hugging Face repository.
Performance: CPU vs GPU
To compare CPU and GPU performance, we ran OCR on a 50-page scanned document consisting entirely of text. The results are in the tables below, with times given as minutes:seconds.
Machines used for testing:
Windows 11 25H2: Intel Core Ultra 7 265KF, NVIDIA GeForce RTX 5060 Ti
Windows 10 22H2: AMD Ryzen 7 9800X3D, NVIDIA GeForce RTX 5080
.NET
| .NET 8 | | | | .NET Framework 4.6.1 | | | | .NET Core App 2.0 | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
Models: | PaddleOCR | | docTR | | PaddleOCR | | docTR | | PaddleOCR | | docTR | |
OS + Processor | CPU | GPU | CPU | GPU | CPU | GPU | CPU | GPU | CPU | GPU | CPU | GPU |
Windows + Intel 7 265KF | 2:02 | 0:53 | 1:39 | 0:39 | 2:27 | 1:42 | 1:38 | 0:44 | 2:14 | 0:59 | 1:29 | 0:42 |
Windows + AMD 7 9800X3D | 1:47 | 0:47 | 1:34 | 0:30 | 1:48 | 0:56 | 1:35 | 0:58 | 1:42 | 0:43 | 1:23 | 0:34 |
Java
| Java 21 | | | |
|---|---|---|---|---|
Models: | PaddleOCR | | docTR | |
OS + Processor | CPU | GPU | CPU | GPU |
Windows + Intel 7 265KF | 1:48 | 0:41 | 1:06 | 0:18 |
Windows + AMD 7 9800X3D | 1:10 | 0:25 | 0:59 | 0:13 |
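To turn the timings above into a speedup factor, divide the CPU time by the GPU time. The following is a minimal sketch of that arithmetic, assuming the table cells are minutes:seconds. OcrSpeedup, toSeconds, and speedup are illustrative names only.

```java
import java.util.Locale;

/** Converts "m:ss" timings into seconds and a CPU/GPU speedup factor (illustrative only). */
final class OcrSpeedup {

    /** Parses a "minutes:seconds" string such as "1:48" into whole seconds. */
    static int toSeconds(String mmss) {
        String[] parts = mmss.trim().split(":");
        if (parts.length != 2) {
            throw new IllegalArgumentException("Expected m:ss but got: " + mmss);
        }
        return Integer.parseInt(parts[0]) * 60 + Integer.parseInt(parts[1]);
    }

    /** Speedup factor = CPU time / GPU time. */
    static double speedup(String cpuTime, String gpuTime) {
        return (double) toSeconds(cpuTime) / toSeconds(gpuTime);
    }

    public static void main(String[] args) {
        // Java 21, PaddleOCR, Intel row: 1:48 on CPU vs 0:41 on GPU.
        System.out.printf(Locale.ROOT, "PaddleOCR speedup: %.2fx%n", speedup("1:48", "0:41"));
        // Java 21, docTR, AMD row: 0:59 on CPU vs 0:13 on GPU.
        System.out.printf(Locale.ROOT, "docTR speedup: %.2fx%n", speedup("0:59", "0:13"));
    }
}
```

For these two rows the factors come out at roughly 2.63x and 4.54x. Treat them as indicative for this document and hardware only.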
Models
For PaddleOCR we used PP-OCRv5_mobile_det + PP-OCRv5_mobile_rec models.
For docTR we used rep_fast_tiny-28867779.onnx + crnn_vgg16_bn-662979cc.onnx models.
Test code
Java
File inPdfFile = new File(TEST_PDFS_DIRECTORY + "sample-ocr-test.pdf");
File outPdfFile = new File(DESTINATION_FOLDER + "ocrPdfCreatorMakeSearchable.pdf");
// Or OcrEngineType.PADDLE.get()
new OcrPdfCreator(OcrEngineType.DOCTR.get()).makePdfSearchable(inPdfFile, outPdfFile);
For OcrEngineType see OcrEngineType.java
.NET
FileInfo inPdfFile = new FileInfo(TEST_PDFS_DIRECTORY + "sample-ocr-test.pdf");
FileInfo outPdfFile = new FileInfo(DESTINATION_FOLDER + "ocrPdfCreatorMakeSearchable.pdf");
// Or OcrEngineType.PADDLE.Get()
new OcrPdfCreator(OcrEngineType.DOCTR.Get()).MakePdfSearchable(inPdfFile, outPdfFile);
For OcrEngineType see OcrEngineType.cs
More internal testing results
The results in this section come from earlier internal testing, carried out before major performance improvements were made for .NET, so don’t rely on the absolute execution times.
Hardware used:
OS: Windows 10 22H2
CPU: AMD Ryzen 7 9800X3D
GPU: NVIDIA GeForce RTX 5080
OCR was performed on a multipage TIFF image.
Java
Provider: | CPU | CUDA |
|---|---|---|
100 runs time (avg) | 303 s | 243 s |
1 run (avg) | 3 s | 2.4 s |
CPU load (avg) | 40% | 10% |
RAM (avg) | 5.5 GB | 1.5 GB |
GPU (avg) | almost idle | 7% |
GPU memory (avg) | 1.5 GB | 4.6 GB |
.NET Framework 4.6.1
Provider: | CPU | CUDA |
|---|---|---|
100 runs time (avg) | 36 m 51 s | 36 m 34 s |
1 run (avg) | 22.1 s | 21.9 s |
CPU load (avg) | 33% | 13% |
RAM (avg) | 2 GB | 1.5 GB |
GPU (avg) | almost idle | 8% |
GPU memory (avg) | 1.3 GB | 3.5 GB |
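The per-run averages above are the total time divided by the number of runs. The sketch below shows one way such a repeat-and-average measurement could be written. It is not the harness used to produce these tables; RepeatTimer, Result, and measure are hypothetical names, and the workload in main is only a placeholder for the OCR call.

```java
import java.time.Duration;

/** Minimal repeat-and-average timing harness; a sketch, not the harness behind the tables. */
final class RepeatTimer {

    /** Holds the total wall-clock time and the number of runs. */
    static final class Result {
        final Duration total;
        final int runs;

        Result(Duration total, int runs) {
            this.total = total;
            this.runs = runs;
        }

        /** Average time per run in seconds. */
        double averageSeconds() {
            return total.toNanos() / 1_000_000_000.0 / runs;
        }
    }

    /** Runs the task the given number of times and measures the total elapsed time. */
    static Result measure(Runnable task, int runs) {
        if (runs <= 0) {
            throw new IllegalArgumentException("runs must be positive");
        }
        long start = System.nanoTime();
        for (int i = 0; i < runs; i++) {
            task.run();
        }
        return new Result(Duration.ofNanos(System.nanoTime() - start), runs);
    }

    public static void main(String[] args) {
        // Placeholder workload; replace with the OCR call under test.
        Result measured = measure(() -> Math.sqrt(12345.678), 100);
        System.out.println("total=" + measured.total + ", avg=" + measured.averageSeconds() + " s");

        // Cross-check of the .NET Framework CPU column: 36 m 51 s over 100 runs.
        Result fromTable = new Result(Duration.ofMinutes(36).plusSeconds(51), 100);
        System.out.println("table avg=" + fromTable.averageSeconds() + " s");
    }
}
```

The cross-check works out to about 22.11 s per run, consistent with the 22.1 s reported in the table.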
Hardware used:
OS: Windows 11 25H2
CPU: Intel Core Ultra 7 265KF
GPU: NVIDIA GeForce RTX 5060 Ti
Java
Java 100 iterations:
CPU 8m 57s
CUDA 3m 55s
| CPU | CPU + CUDA (almost CPU) | CUDA |
|---|---|---|---|
pdfOcr-onnx tests | ~3m | ~3m | ~50s |
RAM (max) | 4GB | 3GB | 1GB |
CPU load | 100% | Almost 100% | Almost idle |
GPU memory (max) | 0GB | 1GB | 7GB |
GPU | Idle | Almost idle | Up to 20% |
.NET
.NET (NetCoreApp2.0) 30 iterations
CPU 5m 51s
CUDA 4m 21s
.NET (.NET Framework 4.6.1) 30 iterations
CPU 6m 42s
CUDA 5m 46s
| | CUDA |
|---|---|---|
pdfOcr-onnx tests | ~3m | ~2m 30s |
RAM (max) | 5GB | 2GB |
CPU load | avg 50% | Almost idle |
GPU memory (max) | 1GB | 8GB |
GPU | Almost idle | Up to 30% |