Tutorial: Build a template for extracting data from PDF invoices
Invoices from different contractors are often provided electronically in PDF format. These files contain data that needs to be used and repurposed in internal business processes, but accessing that data can be challenging. As a result, someone usually has to copy the necessary data from each invoice by hand and enter it into a database or other system so that the invoice is recorded and paid on time.
However, having employees process large volumes of documents manually takes time and can introduce errors and typos that are difficult to track down. iText pdf2Data offers a better way to extract data from invoices and other documents in a simple and structured way. It takes advantage of the fact that documents such as invoices from a recurring supplier follow a common layout: each subsequent invoice is largely the same document as the first, just with different data.
This tutorial shows how iText pdf2Data can help you take the first steps towards automating this process, freeing your employees from routine paperwork and reducing the number of paid hours spent on these tasks.
To get the most out of this tutorial, you should have:
- An understanding of the pdf2Data Editor UI and the basic concepts of data extraction.
- A deployed pdf2Data Editor, or access to our trial instance.
Add a new document type for processing
To mass-process all subsequent documents matching this template, you need to integrate the pdf2Data SDK into your Java or .NET application.
For details, see the pdf2Data SDK installation guide.
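As a rough illustration of what mass processing could look like in a Java application, the sketch below walks a folder of PDF invoices and hands each file to an extraction step. The class name `InvoiceBatch`, its methods, and the `extractor` function are hypothetical placeholders and are not part of the pdf2Data SDK; the actual SDK call described in the installation guide would be substituted for the placeholder.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class InvoiceBatch {

    // Collects every regular file in the folder whose name ends in ".pdf"
    // (case-insensitive), sorted by path for a predictable processing order.
    static List<Path> listPdfs(Path dir) throws IOException {
        try (Stream<Path> files = Files.list(dir)) {
            return files
                    .filter(Files::isRegularFile)
                    .filter(p -> p.getFileName().toString()
                            .toLowerCase(Locale.ROOT).endsWith(".pdf"))
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    // Applies the extraction step to each PDF and keeps the results per file.
    // "extractor" is a hypothetical stand-in for the real SDK invocation:
    // it takes a PDF path and returns field name -> extracted value.
    static Map<Path, Map<String, String>> processAll(
            Path dir, Function<Path, Map<String, String>> extractor) throws IOException {
        Map<Path, Map<String, String>> results = new LinkedHashMap<>();
        for (Path pdf : listPdfs(dir)) {
            results.put(pdf, extractor.apply(pdf));
        }
        return results;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");

        // Placeholder extractor (assumption): only echoes the file name.
        // Replace it with a call into the pdf2Data SDK using your template.
        Function<Path, Map<String, String>> placeholder =
                pdf -> Collections.singletonMap("file", pdf.getFileName().toString());

        processAll(dir, placeholder)
                .forEach((pdf, fields) -> System.out.println(pdf + " -> " + fields));
    }
}
```

Keeping the extraction step behind a plain function means the folder-walking logic can be tried out before the SDK is installed, and the placeholder can be swapped for the real call later without changing the loop.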