How do I create a separate OCR layer?

By default, pdfOCR merges the recognized text with the image it has just processed, but you may want to keep the two separate. Everything you need for this is in the OcrPdfCreatorProperties (Java/.NET) class.

With it, you can define:

  • Whether you want a separate text layer (either of the two options below triggers the creation of a text layer)
    • by defining its name (Java/.NET)
    • by defining its color (Java/.NET); bear in mind that if you do not define this parameter, the text will be transparent
  • Whether you want a separate image layer

Here's a quick example with all the bells and whistles turned on (every option listed above is used):

Don't forget to specify the path to your Tesseract data in the TESS_DATA_FOLDER constant in your code. You can always find trained models here.
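The following is a minimal Java sketch of such a setup, not a definitive implementation. It assumes the pdfOCR Tesseract 4 add-on and the iText core kernel module are on the classpath. The input image name, output path, layer names, and the value of TESS_DATA_FOLDER are placeholders chosen for illustration. The class and method names reflect the pdfOCR API as commonly documented, so check them against the version you use.

```java
import com.itextpdf.kernel.colors.ColorConstants;
import com.itextpdf.kernel.pdf.PdfWriter;
import com.itextpdf.pdfocr.OcrPdfCreator;
import com.itextpdf.pdfocr.OcrPdfCreatorProperties;
import com.itextpdf.pdfocr.tesseract4.Tesseract4LibOcrEngine;
import com.itextpdf.pdfocr.tesseract4.Tesseract4OcrEngineProperties;

import java.io.File;
import java.io.IOException;
import java.util.Collections;
import java.util.List;

public class SeparateOcrLayers {
    // Placeholder: point this at the folder holding your Tesseract trained data.
    private static final String TESS_DATA_FOLDER = "path/to/tessdata";

    // Hypothetical input and output locations, used only for this sketch.
    private static final List<File> IMAGES_TO_OCR =
            Collections.singletonList(new File("invoice_front.jpg"));
    private static final String OUTPUT_PDF = "ocr_layers.pdf";

    public static void main(String[] args) throws IOException {
        // Configure the Tesseract engine with the location of the trained models.
        Tesseract4OcrEngineProperties engineProperties = new Tesseract4OcrEngineProperties();
        engineProperties.setPathToTessData(new File(TESS_DATA_FOLDER));
        Tesseract4LibOcrEngine ocrEngine = new Tesseract4LibOcrEngine(engineProperties);

        OcrPdfCreatorProperties creatorProperties = new OcrPdfCreatorProperties();
        // Naming the text layer requests a separate layer for the recognized text.
        creatorProperties.setTextLayerName("Text Layer");
        // Setting a color makes the text visible; without it the text is transparent.
        creatorProperties.setTextColor(ColorConstants.RED);
        // Naming the image layer requests a separate layer for the source image.
        creatorProperties.setImageLayerName("Image Layer");

        OcrPdfCreator ocrPdfCreator = new OcrPdfCreator(ocrEngine, creatorProperties);

        // Run OCR on the images and write the layered PDF to disk.
        try (PdfWriter writer = new PdfWriter(OUTPUT_PDF)) {
            ocrPdfCreator.createPdf(IMAGES_TO_OCR, writer).close();
        }
    }
}
```

If everything is configured as assumed, opening the resulting file in a viewer that supports optional content should let you toggle the text and image layers independently.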
