pdf2Data SDK

The pdf2Data SDK is a native Java (or .NET) application. Its primary function is to extract data from PDF files using predefined extraction rules.

The extracted data is output in either JSON or XML format.

Installation

Java

The preferred way to set up iText pdf2Data in Java is to use a build system such as Maven or Gradle and download the pdf2Data artifacts from the iText Artifactory (https://repo.itextsupport.com/pdf2data/).

The groupId is com.itextpdf.pdf2data, and the artifactId is pdf2data.

In Maven, the configuration would look similar to the example below:

Maven

XML

<repositories>
	<repository>
		<id>pdf2Data</id>
		<name>pdf2Data Maven Repository</name>
		<url>https://repo.itextsupport.com/pdf2data</url>
	</repository>
</repositories>

<dependencies>
	<dependency>
		<groupId>com.itextpdf.pdf2data</groupId>
		<artifactId>pdf2data</artifactId>
		<version>4.0.0</version>
	</dependency>
</dependencies>

.NET

For .NET, iText pdf2Data is distributed as a NuGet package, available from NuGet.org or from the iText Artifactory.

You can browse for the NuGet package manually or install it with the NuGet Package Manager command Install-Package itext7.pdf2data.

Integrating pdf2Data into your code

As of iText pdf2Data 4.0, the format of extraction templates has changed compared to iText pdf2Data 3.*. See the Migration guide for details.

With the pdf2Data Manager in iText pdf2Data 4.0, you can download templates optimized for use in the pdf2Data SDK, so you can extract data in two lines of code.

Extraction (Java)

Make sure to load the license file before invoking any other pdf2Data code.

JAVA

LicenseKey.loadLicenseFile(pathToLicenseFile);

The initialization of the Pdf2DataExtractor instance from a processed template should now be done with one function call:

JAVA

Pdf2DataExtractor extractor = Pdf2DataExtractor.create(new File(P2D_TEMPLATE_PATH));

Parse PDF using the extraction template

Perform extraction

JAVA

// PDF_PATH is a placeholder for the PDF to parse (not the template file).
ParsingResult result = extractor.recognize(new File(PDF_PATH));
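The calls above depend on three input files: the license, the extraction template, and the PDF to parse. As a minimal sketch, using only the JDK, the paths can be checked before they are handed to the SDK so that a missing file is reported with a clear label. The class InputPreflight and the method requireReadableFile are hypothetical helpers for illustration and are not part of the pdf2Data API.

```java
import java.io.IOException;
import java.nio.file.AccessDeniedException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of pdf2Data: validates input paths up front.
class InputPreflight {

    // Returns the path unchanged if it is a readable regular file; throws otherwise.
    static Path requireReadableFile(String label, Path path) throws IOException {
        if (!Files.isRegularFile(path)) {
            throw new NoSuchFileException(path.toString(), null, label + " is not a regular file");
        }
        if (!Files.isReadable(path)) {
            throw new AccessDeniedException(path.toString(), null, label + " is not readable");
        }
        return path;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 3) {
            System.err.println("Usage: InputPreflight <license> <template> <pdf>");
            return;
        }
        // Check each input in the order the SDK calls would use them.
        requireReadableFile("License file", Paths.get(args[0]));
        requireReadableFile("Extraction template", Paths.get(args[1]));
        requireReadableFile("PDF to parse", Paths.get(args[2]));
        System.out.println("All input files are present and readable.");
    }
}
```

The validated paths can then be converted with toFile() and passed to the license, template, and extraction calls.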

You can use the extracted values directly from the result or save them in one of two structured formats.

Save to XML

JAVA

result.saveToXml(new File(RESULT_XML_PATH));

Save to JSON

JAVA

result.saveToJson(new File(RESULT_JSON_PATH));
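Once the result has been saved with saveToXml, the file can be read back with the JDK's built-in XML parser. The sketch below is a minimal illustration that assumes nothing about the pdf2Data output schema: it walks every element and lists the path and text of each leaf element. The class ResultXmlWalker and its methods are hypothetical helpers, not part of the pdf2Data API.

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of pdf2Data: lists leaf values of any XML file.
class ResultXmlWalker {

    // Returns "path = text" for every element that has no child elements and non-blank text.
    static List<String> leafValues(InputStream xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // Enable the JAXP secure-processing feature before parsing.
        factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        Document doc = factory.newDocumentBuilder().parse(xml);
        List<String> out = new ArrayList<>();
        walk(doc.getDocumentElement(), "", out);
        return out;
    }

    // Depth-first traversal in document order.
    private static void walk(Element element, String parentPath, List<String> out) {
        String path = parentPath + "/" + element.getTagName();
        boolean hasChildElement = false;
        NodeList children = element.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child instanceof Element) {
                hasChildElement = true;
                walk((Element) child, path, out);
            }
        }
        if (!hasChildElement) {
            String text = element.getTextContent().trim();
            if (!text.isEmpty()) {
                out.add(path + " = " + text);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("Usage: ResultXmlWalker <result.xml>");
            return;
        }
        // args[0] is the file previously written by result.saveToXml(...).
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            leafValues(in).forEach(System.out::println);
        }
    }
}
```

Because the traversal is schema-agnostic, it is mainly useful for a first look at what a template actually extracted; production code would normally target the specific element names found in the output.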