How to convert HTML containing Arabic/Hebrew characters to PDF?

This question duplicates "Which languages are supported in pdfHTML?". The answer can be found in chapter 6, but the question is asked so frequently that an extra entry in the FAQ section is justified. It's also an opportunity to provide an extra example.

In the C07E14_SayPeace (Java/.NET) example, we convert the say_peace.html HTML file to PDF.

Say Peace in HTML

We see English, Arabic, and Hebrew in this text. We'll use a different font file for each of these languages.

JAVA
public static final String[] FONTS = {
    "src/main/resources/fonts/noto/NotoSans-Regular.ttf",
    "src/main/resources/fonts/noto/NotoNaskhArabic-Regular.ttf",
    "src/main/resources/fonts/noto/NotoSansHebrew-Regular.ttf"
};
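
To check which scripts a text actually contains, and so which scripts a font list like this one needs to cover, the JDK's `Character.UnicodeScript` can be used on its own. This is only a minimal sketch: the class name `ScriptInventory` and the method `scriptsIn` are hypothetical helpers and not part of iText or pdfHTML, and the sample string is an illustrative stand-in for the contents of say_peace.html.

```java
import java.lang.Character.UnicodeScript;
import java.util.EnumSet;
import java.util.Set;

// Hypothetical helper (not part of iText): lists the Unicode scripts used in a string.
final class ScriptInventory {

    // Returns the scripts of all code points in the text, ignoring code points
    // that belong to no specific script (spaces, digits, punctuation, marks).
    static Set<UnicodeScript> scriptsIn(String text) {
        Set<UnicodeScript> scripts = EnumSet.noneOf(UnicodeScript.class);
        text.codePoints().forEach(codePoint -> {
            UnicodeScript script = UnicodeScript.of(codePoint);
            if (script != UnicodeScript.COMMON
                    && script != UnicodeScript.INHERITED
                    && script != UnicodeScript.UNKNOWN) {
                scripts.add(script);
            }
        });
        return scripts;
    }

    public static void main(String[] args) {
        // Illustrative sample: an English word, an Arabic word, and a Hebrew word.
        String sample = "Peace \u0633\u0644\u0627\u0645 \u05E9\u05DC\u05D5\u05DD";
        System.out.println(scriptsIn(sample));
    }
}
```

Each script reported this way should be covered by at least one of the registered font files; a script without a matching font is a likely cause of missing glyphs in the PDF.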

We'll create a FontProvider instance that only uses these font files, and we'll use this FontProvider as a converter property.

JAVA
public void createPdf(String src, String[] fonts, String dest) throws IOException {
    ConverterProperties properties = new ConverterProperties();
    // Register no standard PDF fonts, no shipped fonts, and no system fonts,
    // so that only the font files added below are available to the converter.
    FontProvider fontProvider = new DefaultFontProvider(false, false, false);
    for (String font : fonts) {
        FontProgram fontProgram = FontProgramFactory.createFont(font);
        fontProvider.addFont(fontProgram);
    }
    properties.setFontProvider(fontProvider);
    HtmlConverter.convertToPdf(new File(src), new File(dest), properties);
}

The result is a PDF file in which the text is rendered correctly:

Say Peace in PDF

If you used the appropriate fonts but get a different result, with the Hebrew and Arabic text rendered from left to right instead of from right to left, you have forgotten to add the pdfCalligraph add-on to your CLASSPATH.
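
One way to rule out a CLASSPATH problem before converting is to probe for a class from the add-on at startup. The sketch below uses only the JDK. `ClasspathProbe` and `isOnClasspath` are hypothetical names, and the class name to probe for is deliberately left as a parameter: look up a class that really exists in the pdfCalligraph JAR for your version, because none is assumed here.

```java
// Hypothetical helper (not part of iText): checks whether a class can be loaded.
final class ClasspathProbe {

    // Returns true if the named class is visible to this class's class loader.
    // The class is located but not initialized, so no static initializers run.
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Pass the fully qualified name of a class from the add-on as the first
        // argument; java.lang.String is only a default that is always present.
        String className = args.length > 0 ? args[0] : "java.lang.String";
        System.out.println(className + " on classpath: " + isOnClasspath(className));
    }
}
```

A `false` result for a class that should come from the add-on points to a missing or misnamed dependency rather than to a font problem.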
