
pdf2Data 4.0 Command Line Interface


The pdf2Data Command Line Interface (CLI) lets you extract data from PDF files from the command line. The extracted data is written as XML or JSON.


To start PDF data capturing, you need to download the CLI application from the iText Artifactory.

No special environment configuration is needed: as long as you have Java 8 installed, you can run the pdf2Data CLI from the command line.
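To confirm which Java version is on your PATH before running the CLI, a small helper along these lines may be useful. It is only a sketch: `java_major_version` is a hypothetical helper name, not part of pdf2Data, and it assumes the usual `version "..."` line printed by `java -version`. This page names Java 8 as the requirement and does not say whether newer versions are supported.

```shell
# Hypothetical helper: extract the major Java version from `java -version` output.
# Handles the legacy "1.8.0_292" scheme (major = 8) and the newer "11.0.2" scheme.
java_major_version() {
  # $1: text containing a line such as: openjdk version "1.8.0_292"
  ver=$(printf '%s\n' "$1" | sed -n 's/.*version "\([^"]*\)".*/\1/p')
  case "$ver" in
    "")  return 1 ;;                                  # no version string found
    1.*) ver=${ver#1.}; printf '%s\n' "${ver%%.*}" ;; # 1.8.0_292 -> 8
    *)   printf '%s\n' "${ver%%[.-]*}" ;;             # 11.0.2 -> 11
  esac
}
```

Since `java -version` writes to standard error, a typical call would be `java_major_version "$(java -version 2>&1)"`.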

The steps are similar to the ones you would typically do in code. 


Process pdf2Data 4.0 template

java -jar cli.jar preprocess -s template.p2dta -d template.p2d

PDF to XML parsing

java -jar cli.jar parse -t template.p2d -s file_for_parsing.pdf -p recognized.pdf -x recognized.xml -l license.json

PDF to JSON parsing

java -jar cli.jar parse -t template.p2d -s file_for_parsing.pdf -p recognized.pdf -j recognized.json -l license.json
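The preprocess and parse commands above can be chained in a small wrapper script. The following is a minimal sketch, not an official tool: the function names, the `CLI_JAR` and `DRY_RUN` variables, and the output file naming are all assumptions made for illustration. Only the subcommands and flags already shown on this page (`preprocess -s -d` and `parse -t -s -p -j -l`) are used.

```shell
#!/bin/sh
# Sketch: preprocess a template, then parse one PDF to JSON.
# CLI_JAR and DRY_RUN are hypothetical variables of this wrapper, not pdf2Data options.
CLI_JAR="${CLI_JAR:-cli.jar}"

# Run a command, or only print it when DRY_RUN is set to a non-empty value.
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

pdf2data_to_json() {
  template_src=$1   # e.g. template.p2dta
  pdf=$2            # e.g. file_for_parsing.pdf
  license=$3        # e.g. license.json

  template="${template_src%.p2dta}.p2d"   # template.p2dta -> template.p2d
  base="${pdf%.pdf}"                      # invoice.pdf    -> invoice

  # Stop after the first step if preprocessing fails.
  run java -jar "$CLI_JAR" preprocess -s "$template_src" -d "$template" &&
  run java -jar "$CLI_JAR" parse -t "$template" -s "$pdf" \
      -p "${base}_recognized.pdf" -j "${base}.json" -l "$license"
}
```

Calling `DRY_RUN=1; pdf2data_to_json template.p2dta invoice.pdf license.json` prints the two commands without running them, which is a convenient way to check paths before a real run. For XML output, replace `-j "${base}.json"` with `-x "${base}.xml"`, as in the XML example above.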

Help information

java -jar cli.jar -h
java -jar cli.jar --help
java -jar cli.jar preprocess -h
java -jar cli.jar preprocess --help
java -jar cli.jar parse -h
java -jar cli.jar parse --help

