
iText 8.0.2 Delivers PDF/A-4 Support

Background

The PDF/A standards, established by the International Organization for Standardization (ISO), are essential for ensuring the lasting accessibility and reliability of electronic documents. Designed specifically for industries with strict archival needs, such as legal, finance, and government, PDF/A guidelines guarantee that documents meet specific criteria for consistent rendering and interpretation over time.

In these industries, where document integrity and longevity are paramount, PDF/A has become a common and widely adopted format. Its importance lies in preserving the content and structure of electronic documents for the long term. For instance, in legal practices, PDF/A ensures that legal documents retain their original format and are accessible for reference or legal proceedings even after many years. In the financial sector, where regulatory compliance is crucial, PDF/A helps maintain the integrity of financial records for auditing purposes.

The widespread use of PDF/A standards signifies their significance in addressing the archival challenges faced by various sectors. These standards play a pivotal role in safeguarding electronic documents, making them a trusted choice for industries where accuracy, consistency, and longevity are non-negotiable.

Support for generating PDF/A-compliant documents is an important part of the iText and Apryse feature set.

Pre PDF/A-4

PDF/A has evolved significantly since its inception, with each version introducing new features and enhancements to improve its capabilities for long-term preservation. Here's a chronological overview of the major achievements and milestones of PDF/A prior to PDF/A-4:

PDF/A-1 (2005)

Based on a subset of the PDF 1.4 specification (Adobe's PDF Reference, version 1.4), it established the initial foundation for PDF/A as a standardized format for long-term preservation of electronic documents by requiring documents to be completely self-contained.

  • Prohibited features that may hinder long-term preservation, such as encryption, audio, and video.

  • Required all fonts to be embedded.

  • Introduced support for color management.

  • Encouraged the use of metadata for document information.

  • PDF/A-1 also introduced the concept of conformance levels:

    • PDF/A-1b (for “basic”): Ensures the visual appearance of the document is preservable in the long term.

    • PDF/A-1a (for “accessible”): Based on level B, but as with the Tagged PDF standard, it required structure information and Unicode to preserve document structure and reading order. This means that not only will PDF/A-1a documents look the same in the future, but their contents can be reliably interpreted and be processed by accessibility software such as screen readers for the visually impaired.

PDF/A-2 (2011)

Based on PDF 1.7 (ISO 32000-1), PDF/A-2 included many additions to the standard:

  • Introduced support for embedding OpenType fonts, ensuring consistent rendering of text across different systems.

  • Allowed JPEG 2000 image compression, enabling smaller document file sizes.

  • Introduced support for file attachments, provided the attached files also conform to PDF/A-1 or PDF/A-2.

  • Digital signatures in accordance with the PDF Advanced Electronic Signatures (PAdES) standard were allowed.

  • Supported transparency and enhanced the color management support, providing more accurate color representation.

  • Since the dedicated PDF/UA-1 standard was being introduced for Universal Accessibility, PDF/A-2 kept the a and b conformance levels of PDF/A-1 and added a third level, u, to draw a clearer distinction between levels of PDF/A compliance:

    • PDF/A-2a: Accessible conformance (level b plus structure tagging and Unicode mapping)

    • PDF/A-2b: Basic conformance (reliable visual reproduction)

    • PDF/A-2u: Level b plus Unicode mapping (for text extraction and search)

PDF/A-3 (2012)

PDF/A-3 is quite similar to PDF/A-2 and also supports the a, b, and u conformance levels. The main difference from PDF/A-2 concerns file attachments:

  • PDF/A-3 allows for the inclusion of arbitrary file types (not just PDFs) as attachments in PDF/A documents, thus making portable collections/packages (AKA PDF Portfolios) much more useful for archiving.

  • File attachments are associated with the whole document, a page, or some other part of the document. The nature of the relationship between an attached file and its corresponding part in the document needs to be explicitly defined, as source, alternative, or supplemental data, by using the AFRelationship key.
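
To make the AFRelationship idea concrete, here is a minimal sketch that models the relationship values as a Java enum. The type and method names (AttachmentRelationship, toDictionaryEntry, fromPdfName) are hypothetical and not part of iText. The Data and Unspecified values come from the PDF specification rather than from the text above.

```java
import java.util.Arrays;
import java.util.Optional;

// Hypothetical helper, not part of iText: models the AFRelationship values
// that describe how an attached file relates to the document part it belongs to.
enum AttachmentRelationship {
    SOURCE("Source"),           // the attachment is the source the PDF content was produced from
    DATA("Data"),               // data behind a visual representation (from the PDF spec, not named above)
    ALTERNATIVE("Alternative"), // an alternative representation of the content
    SUPPLEMENT("Supplement"),   // supplemental material
    UNSPECIFIED("Unspecified"); // relationship not known (from the PDF spec, not named above)

    private final String pdfName;

    AttachmentRelationship(String pdfName) {
        this.pdfName = pdfName;
    }

    // Renders the key/value pair as it would appear in a file specification dictionary.
    String toDictionaryEntry() {
        return "/AFRelationship /" + pdfName;
    }

    // PDF names are case-sensitive, so "source" does not match "Source".
    static Optional<AttachmentRelationship> fromPdfName(String name) {
        return Arrays.stream(values())
                .filter(r -> r.pdfName.equals(name))
                .findFirst();
    }
}
```

Calling AttachmentRelationship.SOURCE.toDictionaryEntry() yields the entry for an attachment that is the source of the document, while fromPdfName returns an empty Optional for any unrecognized value.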

Why PDF/A-4

PDF/A-4, published in November 2020, aims to address the evolving needs of long-term preservation, accessibility, functionality, security, and compatibility while maintaining the core principles of the PDF/A standard.

In 2017, specifications for the PDF 2.0 standard were published. This was a major milestone in PDF technology and the new standard introduced significant advancements which expanded the functionality and versatility of the format. These advancements include, but are not limited to, enhanced security features, accessibility improvements, embedded fonts, transparency controls and metadata handling.

Significantly, PDF/A-4 is the first PDF/A format built on top of PDF 2.0, which lets it draw on the latest advancements in PDF technology. Other developments in PDF/A-4 are:

  • Enhanced Long-term Preservation: PDF/A-4 strengthens the long-term preservation capabilities of PDF/A documents by addressing potential obsolescence and ensuring consistent rendering over time. This reduces the risk of data loss or format incompatibility in the future, making PDF/A-4 a more reliable choice for archiving and preserving critical documents.

  • Improved Accessibility: PDF/A-4 enhances the accessibility of PDF/A documents for individuals with disabilities, making it easier for them to access and interact with PDF/A content. This includes improved tagging for screen readers, support for alternative text descriptions, and compliance with WCAG 2.1 accessibility guidelines.

  • Expanded Functionality: PDF/A-4 adds new features to expand the functionality of PDF/A, making it more versatile and adaptable to various document processing needs. This includes support for Rich Media content like audio and video, enhanced transparency effects for more complex designs, and improved embedded file handling for efficient document management.

  • Strengthened Security: PDF/A-4 focuses on enhancing the security of PDF/A documents by prohibiting actions that could compromise viewer security or privacy. This helps protect sensitive information and prevent malicious code from executing within PDF/A documents.

  • Maintained Backward Compatibility: PDF/A-4 maintains backward compatibility with previous versions of PDF/A, ensuring that existing PDF/A documents remain compliant with the updated standard. This compatibility allows users to transition seamlessly to PDF/A-4 without compromising the integrity of existing archives.

  • Simplified Compliance: PDF/A-4 provides clearer guidelines and improved validation tools to simplify compliance checking and ensure that PDF/A documents meet the standard's requirements. This reduces the risk of non-compliance and ensures that documents are consistently preserved for long-term access.

  • Enhanced Integration: PDF/A-4 is designed for better integration with document processing workflows, allowing for easier conversion, validation, and management of PDF/A documents within existing document processing systems. This streamlines document processing tasks and reduces the overhead associated with long-term preservation.
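
Because PDF/A-4 is built on PDF 2.0, a workflow can run a cheap pre-check on the version declared in the file header before handing a document to a full validator. The sketch below is only an illustration of that idea: the class and method names are hypothetical, and it does not validate PDF/A-4 conformance.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical pre-check, not an iText API: inspects only the first 8 bytes of a file.
final class PdfHeaderCheck {

    private PdfHeaderCheck() {
    }

    // Returns true when the data starts with "%PDF-2." followed by a single digit.
    static boolean declaresPdf2(byte[] fileStart) {
        if (fileStart == null || fileStart.length < 8) {
            return false;
        }
        // ISO-8859-1 maps every byte to exactly one char, so no decoding errors can occur.
        String header = new String(fileStart, 0, 8, StandardCharsets.ISO_8859_1);
        char minor = header.charAt(7);
        return header.startsWith("%PDF-2.") && minor >= '0' && minor <= '9';
    }
}
```

A file that fails this check cannot be PDF/A-4, but a file that passes still has to go through real validation.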

What we’ve done in iText 8.0.2

iText now allows users to create and validate PDF/A-4 compliant documents. In order to do this, additional development has been necessary to implement checks to support:

  • Signature Algorithms: PDF/A-4 supports a wider range of signatures including RSA, DSA, and ECDSA. It also supports the use of timestamps, which can be used to prove that a signature was created at a specific time. Signatures that rely on hash functions with known vulnerabilities, deprecated key algorithms, external certificates, or proprietary methods are unsupported.

  • Actions: Actions are embedded instructions that trigger specific behaviors within documents; they are commonly used to add interaction and functionality to PDFs. Actions pose accessibility and security challenges, and PDF/A-4 prohibits several of them (such as ‘Launch’, ‘ImportData’, and JavaScript execution).

  • Graphics: Embedding of fonts in documents remains mandatory. This preserves the visual integrity of documents opened in different viewing environments and eliminates font substitution problems.

  • Metadata: Information relating to document identification (i.e. the archiving standard being used, the tool used to generate the document, time of creation and modification) is attached to all documents meeting the PDF/A-4 standard.

  • Output Intent: Key components such as color spaces, ICC profiles, and black point compensation are defined to control how documents are rendered on different devices.
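
As an illustration of what an action check involves, the following sketch flags prohibited action types in a list of action type names. It is not iText's implementation: the class and method names are hypothetical, and the prohibited set contains only the three action types the text above gives as examples, not the full list from the standard.

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical illustration, not iText's internal checker.
final class ActionPolicySketch {

    // Assumption: only the three action types named in the text; the standard lists more.
    private static final Set<String> PROHIBITED = Set.of("Launch", "ImportData", "JavaScript");

    private ActionPolicySketch() {
    }

    // Action type names are PDF names and therefore case-sensitive.
    static boolean isProhibited(String actionType) {
        return PROHIBITED.contains(actionType);
    }

    // Returns each prohibited type once, in the order it was first encountered.
    static List<String> findViolations(List<String> actionTypes) {
        return actionTypes.stream()
                .filter(ActionPolicySketch::isProhibited)
                .distinct()
                .collect(Collectors.toList());
    }
}
```

A real checker would walk every action dictionary in the document, including chained actions, and report where each violation occurs. This sketch shows only the decision rule.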

Example Usage

Core

Java
package org.example;

import com.itextpdf.io.font.PdfEncodings;
import com.itextpdf.io.image.ImageDataFactory;
import com.itextpdf.kernel.font.PdfFont;
import com.itextpdf.kernel.font.PdfFontFactory;
import com.itextpdf.kernel.pdf.*;
import com.itextpdf.layout.Document;
import com.itextpdf.layout.element.Image;
import com.itextpdf.layout.element.Paragraph;
import com.itextpdf.pdfa.PdfADocument;

import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

public class PdfA4Example {
    public static final String DEST = "results/result.pdf";
    private static final String FONT = "src/main/resources/NotoSans-Regular.ttf";

    static {
        new File(DEST).getParentFile().mkdirs();
    }

    public static void main(String[] args) throws IOException {
        //PDF/A-4 requires a PDF 2.0 document
        PdfWriter writer = new PdfWriter(DEST, new WriterProperties().setPdfVersion(PdfVersion.PDF_2_0));
        //Load the ICC color profile used for the output intent
        InputStream inputStream = new FileInputStream("src/main/resources/sRGB_CS_profile.icm");
        //Create the PDF/A-4 document by instantiating a PdfADocument object and passing the PDF/A-4 conformance level
        PdfADocument pdfDocument = new PdfADocument(writer, PdfAConformanceLevel.PDF_A_4, new PdfOutputIntent("Custom", "",
                null, "sRGB IEC61966-2.1", inputStream));
        //Taking care of the additional PDF/A requirements
        pdfDocument.getCatalog().setLang(new PdfString("nl-nl"));
        pdfDocument.setTagged();
        PdfDocumentInfo info = pdfDocument.getDocumentInfo();
        info
                .setTitle("title")
                .setAuthor("Author")
                .setSubject("Subject")
                .setKeywords("Metadata, iText, PDF")
                .setCreator("My program using iText")
                .addCreationDate();

        Document document = new Document(pdfDocument);
        //PDF/A requires fonts to be embedded
        PdfFont font = PdfFontFactory.createFont(FONT, PdfEncodings.IDENTITY_H);

        Paragraph element = new Paragraph("Hello World").setFont(font).setFontSize(10);
        document.add(element);

        Image logoImage = new Image(ImageDataFactory.create("src/main/resources/logo.png"));
        //Images in a tagged document should carry alternative text
        logoImage.getAccessibilityProperties().setAlternateDescription("Logo");
        document.add(logoImage);

        //Closing the Document also closes the underlying PdfADocument
        document.close();
        inputStream.close();
    }
}
C#
using System.IO;
using iText.IO.Font;
using iText.IO.Image;
using iText.Kernel.Font;
using iText.Kernel.Pdf;
using iText.Layout;
using iText.Layout.Element;
using iText.Pdfa;

namespace PdfA_Examples
{
    internal class Program
    {
        public static string DEST = "result.pdf";
        private static string FONT = "NotoSans-Regular.ttf";

        public static void Main(string[] args)
        {
            PdfA4Example();
        }

        private static void PdfA4Example()
        {
            //PDF/A-4 requires a PDF 2.0 document
            PdfWriter writer = new PdfWriter(DEST, new WriterProperties().SetPdfVersion(PdfVersion.PDF_2_0));
            //Load the ICC color profile used for the output intent
            FileStream inputStream = new FileStream("sRGB_CS_profile.icm", FileMode.Open);
            //Create the PDF/A-4 document by instantiating a PdfADocument object and passing the PDF/A-4 conformance level
            PdfADocument pdfDocument = new PdfADocument(writer, PdfAConformanceLevel.PDF_A_4, new PdfOutputIntent("Custom", "",
                null, "sRGB IEC61966-2.1", inputStream));
            //Taking care of the additional PDF/A requirements
            pdfDocument.GetCatalog().SetLang(new PdfString("nl-nl"));
            pdfDocument.SetTagged();
            PdfDocumentInfo info = pdfDocument.GetDocumentInfo();
            info
                .SetTitle("title")
                .SetAuthor("Author")
                .SetSubject("Subject")
                .SetKeywords("Metadata, iText, PDF")
                .SetCreator("My program using iText")
                .AddCreationDate();

            Document document = new Document(pdfDocument);
            //PDF/A requires fonts to be embedded
            PdfFont font = PdfFontFactory.CreateFont(FONT, PdfEncodings.IDENTITY_H);

            Paragraph element = new Paragraph("Hello World").SetFont(font).SetFontSize(10);
            document.Add(element);

            Image logoImage = new Image(ImageDataFactory.Create("logo.png"));
            //Images in a tagged document should carry alternative text
            logoImage.GetAccessibilityProperties().SetAlternateDescription("Logo");
            document.Add(logoImage);

            //Closing the Document also closes the underlying PdfADocument
            document.Close();
            inputStream.Close();
        }
    }
}

pdfHTML

Java
package org.example;

import com.itextpdf.html2pdf.ConverterProperties;
import com.itextpdf.html2pdf.HtmlConverter;
import com.itextpdf.html2pdf.resolver.font.DefaultFontProvider;
import com.itextpdf.io.font.FontProgram;
import com.itextpdf.io.font.FontProgramFactory;
import com.itextpdf.kernel.pdf.*;
import com.itextpdf.layout.font.FontProvider;
import com.itextpdf.pdfa.PdfADocument;

import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

public class PdfA4HtmlExample {
    public static final String DEST = "results/result.pdf";
    private static final String HTML = "src/main/resources/example.html";
    private static final String FONT = "src/main/resources/NotoSans-Regular.ttf";

    static {
        new File(DEST).getParentFile().mkdirs();
    }

    public static void main(String[] args) throws IOException {
        //PDF/A-4 requires a PDF 2.0 document
        PdfWriter writer = new PdfWriter(DEST, new WriterProperties().setPdfVersion(PdfVersion.PDF_2_0));
        //Load the ICC color profile used for the output intent
        InputStream inputStream = new FileInputStream("src/main/resources/sRGB_CS_profile.icm");
        //Create the PDF/A-4 document by instantiating a PdfADocument object and passing the PDF/A-4 conformance level
        PdfADocument pdfDocument = new PdfADocument(writer, PdfAConformanceLevel.PDF_A_4, new PdfOutputIntent("Custom", "",
                null, "sRGB IEC61966-2.1", inputStream));
        //Taking care of the additional PDF/A requirements
        pdfDocument.getCatalog().setLang(new PdfString("nl-nl"));
        pdfDocument.setTagged();
        PdfDocumentInfo info = pdfDocument.getDocumentInfo();
        info
                .setTitle("title")
                .setAuthor("Author")
                .setSubject("Subject")
                .setKeywords("Metadata, iText, PDF")
                .setCreator("My program using iText")
                .addCreationDate();

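        //Register an embeddable font so that the converted text satisfies the font embedding requirement
        ConverterProperties properties = new ConverterProperties();
        FontProvider fontProvider = new DefaultFontProvider();
        FontProgram fontProgram = FontProgramFactory.createFont(FONT);
        fontProvider.addFont(fontProgram);
        properties.setFontProvider(fontProvider);
        //Closing the returned Document also closes the underlying PdfADocument
        try (InputStream htmlStream = new FileInputStream(HTML)) {
            HtmlConverter.convertToDocument(htmlStream, pdfDocument, properties).close();
        }
        inputStream.close();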
    }
}
C#
using System.IO;
using iText.Html2pdf;
using iText.Html2pdf.Resolver.Font;
using iText.IO.Font;
using iText.Kernel.Pdf;
using iText.Layout;
using iText.Layout.Font;
using iText.Pdfa;

namespace PdfA_Examples
{
    internal class Program
    {
        public static string DEST = "result.pdf";
        private static string FONT = "NotoSans-Regular.ttf";
        private static string HTML = "example.html";


        public static void Main(string[] args)
        {
            PdfA4HtmlExample();
        }

        private static void PdfA4HtmlExample()
        {
            //PDF/A-4 requires a PDF 2.0 document
            PdfWriter writer = new PdfWriter(DEST, new WriterProperties().SetPdfVersion(PdfVersion.PDF_2_0));
            //Load the ICC color profile used for the output intent
            FileStream inputStream = new FileStream("sRGB_CS_profile.icm", FileMode.Open);
            //Create the PDF/A-4 document by instantiating a PdfADocument object and passing the PDF/A-4 conformance level
            PdfADocument pdfDocument = new PdfADocument(writer, PdfAConformanceLevel.PDF_A_4, new PdfOutputIntent("Custom", "",
                null, "sRGB IEC61966-2.1", inputStream));
            //Taking care of the additional PDF/A requirements
            pdfDocument.GetCatalog().SetLang(new PdfString("nl-nl"));
            pdfDocument.SetTagged();
            PdfDocumentInfo info = pdfDocument.GetDocumentInfo();
            info
                .SetTitle("title")
                .SetAuthor("Author")
                .SetSubject("Subject")
                .SetKeywords("Metadata, iText, PDF")
                .SetCreator("My program using iText")
                .AddCreationDate();

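            //Register an embeddable font so that the converted text satisfies the font embedding requirement
            ConverterProperties properties = new ConverterProperties();
            FontProvider fontProvider = new DefaultFontProvider();
            FontProgram fontProgram = FontProgramFactory.CreateFont(FONT);
            fontProvider.AddFont(fontProgram);
            properties.SetFontProvider(fontProvider);
            //Closing the returned Document also closes the underlying PdfADocument
            using (FileStream htmlStream = new FileStream(HTML, FileMode.Open))
            {
                HtmlConverter.ConvertToDocument(htmlStream, pdfDocument, properties).Close();
            }
            inputStream.Close();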
        }
    }
}