
pdfOCR: How to run ONNX models on GPU

pdfOCR 5.0.0 introduced optional GPU acceleration for the pdfOCR ONNX engine, which not only frees the CPU for other tasks but can also deliver major performance gains.

ONNX Runtime supports multiple execution providers for hardware acceleration, although not all are ready for production. At present, we have only tested pdfOCR with NVIDIA CUDA-enabled GPUs; for other hardware, refer to ONNX Runtime’s official documentation on execution providers.

Requirements

Windows builds require the latest Visual C++ runtime. For Linux and macOS, a compatible OpenCvSharp4.runtime.* dependency is required.

The ONNX Runtime GPU package requires CUDA and cuDNN to be installed. See Install ONNX Runtime for the complete list of requirements and installation guides.

Java

To run ONNX models on GPU, add the pdfocr-onnx-abstract and onnxruntime_gpu dependencies, as described in Installing iText pdfOCR for Java developers and PdfOCR Tutorials:

CODE
<dependencies>
  <dependency>
    <groupId>com.itextpdf</groupId>
    <artifactId>pdfocr-onnx-abstract</artifactId>
    <version>${itext.pdfocr.version}</version>
  </dependency>
  <!-- OnnxRuntime GPU to use with pdfocr-onnx-abstract. Source: https://mvnrepository.com/artifact/com.microsoft.onnxruntime/onnxruntime_gpu -->
  <dependency>
    <groupId>com.microsoft.onnxruntime</groupId>
    <artifactId>onnxruntime_gpu</artifactId>
    <version>${onnxruntime.version}</version>
  </dependency>
</dependencies>

.NET

To run ONNX models on GPU, add the pdfocr.onnx.abstract and Microsoft.ML.OnnxRuntime.Gpu dependencies, as described in Installing iText pdfOCR for .NET developers and PdfOCR Tutorials:

CODE
<ItemGroup>
  <PackageReference Include="itext.pdfocr.onnx.abstract" Version="5.0.0" />
  <PackageReference Include="Microsoft.ML.OnnxRuntime.Gpu" Version="1.24.4" />
</ItemGroup>

You can also use a different execution provider package instead of Microsoft.ML.OnnxRuntime.Gpu, for example Microsoft.ML.OnnxRuntime.DirectML.

Configuration

No additional configuration is required to run models on GPU: com.itextpdf.pdfocr.onnx.DefaultOrtSessionOptionsCreator supports GPU mode by default.

If needed, you can supply custom SessionOptions. These are used to construct the OrtSession, which wraps an ONNX model and handles inference calls. With custom options you can specify whether OCR runs on GPU or CPU (by choosing providers), as well as the execution mode, optimization level, and other settings.

For example, in CPU mode ONNX Runtime is configured to use all available CPU cores for both text detection and recognition, through the following settings:

setIntraOpNumThreads(-1) - uses all available cores for parallel computation within individual operations
setInterOpNumThreads(-1) - uses all available cores for executing multiple requests concurrently, if executing on a CPU

You can override these SessionOptions properties.

The code sample pdfOCR: Custom session options for an ONNX model shows how to customize the ONNX Runtime session configuration used by the pdfOCR ONNX-based engine when creating searchable PDFs from images.

Compatible PaddleOCR/EasyOCR models already converted to ONNX format are available from our Hugging Face repository.

Performance: CPU vs GPU

To compare CPU and GPU performance, we ran OCR on a 50-page scanned document consisting entirely of text. The results are shown in the tables below.

Machines used for testing:

  • Windows 11 25H2: Intel Core Ultra 7 265KF, NVIDIA GeForce RTX 5060 Ti

  • Windows 10 22H2: AMD Ryzen 7 9800X3D, NVIDIA GeForce RTX 5080

.NET

Times are in m:ss.

Runtime               OS + Processor           PaddleOCR CPU  PaddleOCR GPU  docTR CPU  docTR GPU
.NET 8                Windows + Intel 7 265KF  2:02           0:53           1:39       0:39
.NET 8                Windows + AMD 7 9800X3D  1:47           0:47           1:34       0:30
.NET Framework 4.6.1  Windows + Intel 7 265KF  2:27           1:42           1:38       0:44
.NET Framework 4.6.1  Windows + AMD 7 9800X3D  1:48           0:56           1:35       0:58
.NET Core App 2.0     Windows + Intel 7 265KF  2:14           0:59           1:29       0:42
.NET Core App 2.0     Windows + AMD 7 9800X3D  1:42           0:43           1:23       0:34

Java

Java 21 (times in m:ss)

OS + Processor           PaddleOCR CPU  PaddleOCR GPU  docTR CPU  docTR GPU
Windows + Intel 7 265KF  1:48           0:41           1:06       0:18
Windows + AMD 7 9800X3D  1:10           0:25           0:59       0:13
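
To make the CPU/GPU gap easier to compare across models, the timings can be converted into speedup factors (CPU time divided by GPU time). The following is a minimal sketch, assuming the values above are minutes:seconds; it uses the Java 21 results, and the names SpeedupTable, toSeconds, and speedup are illustrative only and not part of pdfOCR.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

final class SpeedupTable {

    /** Parses a "m:ss" duration (assumed format of the tables above) into seconds. */
    static int toSeconds(String mmss) {
        String[] parts = mmss.split(":");
        return Integer.parseInt(parts[0]) * 60 + Integer.parseInt(parts[1]);
    }

    /** Returns how many times faster the GPU run was compared to the CPU run. */
    static double speedup(String cpuTime, String gpuTime) {
        return (double) toSeconds(cpuTime) / toSeconds(gpuTime);
    }

    public static void main(String[] args) {
        // Java 21 timings copied from the table above: {CPU, GPU}.
        Map<String, String[]> runs = new LinkedHashMap<>();
        runs.put("Intel 7 265KF, PaddleOCR", new String[] {"1:48", "0:41"});
        runs.put("Intel 7 265KF, docTR", new String[] {"1:06", "0:18"});
        runs.put("AMD 7 9800X3D, PaddleOCR", new String[] {"1:10", "0:25"});
        runs.put("AMD 7 9800X3D, docTR", new String[] {"0:59", "0:13"});

        for (Map.Entry<String, String[]> run : runs.entrySet()) {
            String[] t = run.getValue();
            System.out.printf(Locale.ROOT, "%-26s %.1fx%n", run.getKey(), speedup(t[0], t[1]));
        }
    }
}
```

The same helper can be pointed at the .NET rows to compare runtimes, since it only depends on the m:ss strings.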

Models

For PaddleOCR we used PP-OCRv5_mobile_det + PP-OCRv5_mobile_rec models.

For docTR we used rep_fast_tiny-28867779.onnx + crnn_vgg16_bn-662979cc.onnx models.

Test code

Java

CODE
File inPdfFile = new File(TEST_PDFS_DIRECTORY + "sample-ocr-test.pdf");
File outPdfFile = new File(DESTINATION_FOLDER + "ocrPdfCreatorMakeSearchable.pdf");

// Or OcrEngineType.PADDLE.get()
new OcrPdfCreator(OcrEngineType.DOCTR.get()).makePdfSearchable(inPdfFile, outPdfFile);

For OcrEngineType see OcrEngineType.java
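
The article does not state how the timings were collected. As one possible approach, the sketch below wraps a call in a simple wall-clock timer with warm-up runs, so that one-time costs such as class loading and JIT compilation are not counted in the average. OcrBenchmark and averageMillis are hypothetical names, and the sleeping Runnable is only a stand-in for the makePdfSearchable call shown above.

```java
import java.util.Locale;

final class OcrBenchmark {

    /**
     * Runs the task `warmup` times without timing, then `iterations` times with timing,
     * and returns the average wall-clock time per timed run in milliseconds.
     */
    static double averageMillis(Runnable task, int warmup, int iterations) {
        if (iterations <= 0) {
            throw new IllegalArgumentException("iterations must be positive");
        }
        for (int i = 0; i < warmup; i++) {
            task.run();
        }
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            task.run();
        }
        long elapsedNanos = System.nanoTime() - start;
        return elapsedNanos / 1_000_000.0 / iterations;
    }

    public static void main(String[] args) {
        // Placeholder workload (assumption): replace with the OCR call under test.
        Runnable task = () -> {
            try {
                Thread.sleep(5);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        double avg = averageMillis(task, 2, 10);
        System.out.printf(Locale.ROOT, "average: %.1f ms per run%n", avg);
    }
}
```

Running the same harness once with a CPU-only setup and once with the GPU dependency in place gives directly comparable averages.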

.NET

CODE
FileInfo inPdfFile = new FileInfo(TEST_PDFS_DIRECTORY + "sample-ocr-test.pdf");
FileInfo outPdfFile = new FileInfo(DESTINATION_FOLDER + "ocrPdfCreatorMakeSearchable.pdf");

// Or OcrEngineType.PADDLE.Get()
new OcrPdfCreator(OcrEngineType.DOCTR.Get()).MakePdfSearchable(inPdfFile, outPdfFile);

For OcrEngineType see OcrEngineType.cs

More internal testing results

This section lists additional internal test results. They were recorded before major performance improvements were made for .NET, so do not rely on the execution times.

  1. Hardware used:

OS: Windows 10 22H2
CPU: AMD Ryzen 7 9800X3D
GPU: NVIDIA GeForce RTX 5080

OCR was performed on a multipage TIFF image.

Java

Provider:            CPU          CUDA
100 runs time (avg)  303 s        243 s
1 run (avg)          3 s          2.4 s
CPU load (avg)       40%          10%
RAM (avg)            5.5 GB       1.5 GB
GPU (avg)            almost idle  7%
GPU memory (avg)     1.5 GB       4.6 GB

.NET Framework 4.6.1

Provider:            CPU          CUDA
100 runs time (avg)  36 m 51 s    36 m 34 s
1 run (avg)          22.1 s       21.9 s
CPU load (avg)       33%          13%
RAM (avg)            2 GB         1.5 GB
GPU (avg)            almost idle  8%
GPU memory (avg)     1.3 GB       3.5 GB

  2. Hardware used:

OS: Windows 11 25H2
CPU: Intel Core Ultra 7 265KF
GPU: NVIDIA GeForce RTX 5060 Ti

Java

  • Java 100 iterations:

    • CPU 8m 57s

    • CUDA 3m 55s

                   CPU    CPU + CUDA (almost CPU)  CUDA
pdfOcr-onnx tests  ~3m    ~3m                      ~50s
RAM (max)          4GB    3GB                      1GB
CPU load           100%   Almost 100%              Almost idle
GPU memory (max)   0GB    1GB                      7GB
GPU                Idle   Almost idle              Up to 20%

.NET

  • .NET (NetCoreApp2.0) 30 iterations

    • CPU 5m 51s

    • CUDA 4m 21s

  • .NET (.NET Framework 4.6.1) 30 iterations

    • CPU 6m 42s

    • CUDA 5m 46s

                   CPU          CUDA
pdfOcr-onnx tests  ~3m          ~2m 30s
RAM (max)          5GB          2GB
CPU load           avg 50%      Almost idle
GPU memory (max)   1GB          8GB
GPU                Almost idle  Up to 30%
