pdfHTML: Using emojis in iText
Introduction
Since their humble beginnings in 1999, emojis have become a staple of digital communication, and most document and communication formats support them in one form or another. The PDF format is no exception.
Emojis may look like small images, but they are more closely related to character or symbol glyphs: you can select, copy, and paste them, adjust their size, and more. This also means that each emoji is identified by a Unicode code point. For example, the "grinning face with big eyes" emoji is the code point U+1F603.
This is very convenient, because it means we can use escape sequences in our programming language of choice to add emojis to a document even when we cannot type them directly into the source. The only consideration is the same as when adding non-Latin text (such as Chinese, Greek, or Hindi): we need a font program that can draw these characters, because they are not included in the standard PDF fonts.
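To make the code point idea concrete, here is a minimal sketch in plain Java (no iText involved). The class name EmojiCodePointDemo and the helper toUPlus are hypothetical names chosen for this illustration. It shows that one emoji is a single code point even though Java stores it as two char values.

```java
final class EmojiCodePointDemo {

    /** Formats a code point in the conventional U+XXXX notation. */
    static String toUPlus(int codePoint) {
        return String.format("U+%04X", codePoint);
    }

    public static void main(String[] args) {
        // Build the string from the code point instead of typing the emoji itself.
        String emoji = new String(Character.toChars(0x1F603));

        System.out.println(toUPlus(emoji.codePointAt(0)));           // U+1F603
        System.out.println(emoji.length());                          // 2 (UTF-16 code units)
        System.out.println(emoji.codePointCount(0, emoji.length())); // 1 (code point)
    }
}
```

The difference between length() and codePointCount() is the reason a single escape sequence is not enough for characters like this one.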
How it works
When converting HTML files to PDF documents, including emojis is straightforward: we add the emoji font to a FontProvider and pass it to the HtmlConverter through ConverterProperties. When creating PDF documents directly, however, we run into a limitation: a Java \uXXXX escape holds only one 16-bit value, so code points above U+FFFF, such as the emojis used here, cannot be written as a single escape. In Java we therefore use a helper method that converts the code point into its two UTF-16 code units (a surrogate pair).
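As an illustration of what such a helper does internally, the following sketch computes a surrogate pair by hand and compares it with the JDK's own Character.toChars. The class and method names (SurrogatePairDemo, toSurrogates) are hypothetical; in real code, calling Character.toChars directly is usually enough.

```java
final class SurrogatePairDemo {

    /** Splits a supplementary code point (above U+FFFF) into its UTF-16 surrogate pair. */
    static char[] toSurrogates(int codePoint) {
        if (!Character.isSupplementaryCodePoint(codePoint)) {
            throw new IllegalArgumentException("Not a supplementary code point: " + codePoint);
        }
        int offset = codePoint - 0x10000;               // 20 significant bits remain
        char high = (char) (0xD800 + (offset >> 10));   // upper 10 bits
        char low = (char) (0xDC00 + (offset & 0x3FF));  // lower 10 bits
        return new char[] {high, low};
    }

    public static void main(String[] args) {
        char[] pair = toSurrogates(0x1F600);
        System.out.printf("%04X %04X%n", (int) pair[0], (int) pair[1]); // D83D DE00

        // The manual result matches both the JDK helper and the escaped string literal.
        System.out.println(java.util.Arrays.equals(pair, Character.toChars(0x1F600))); // true
        System.out.println(new String(pair).equals("\uD83D\uDE00"));                   // true
    }
}
```

Note that the helper returns the actual characters. Formatting them as the text of an escape sequence at runtime would only produce a literal backslash followed by letters and digits, because escapes are resolved by the compiler and not when the program runs.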
Below you can find a sample that shows both approaches: the #fromHtml() method converts an HTML file, while the #createEmojiDocument() method (#createDoc() in the C# version) creates a PDF document directly and uses one possible implementation of a helper method to add emojis.
JAVA
package com.itextpdf.samples.sandbox.pdfhtml.it7kb;

import com.itextpdf.html2pdf.ConverterProperties;
import com.itextpdf.html2pdf.HtmlConverter;
import com.itextpdf.io.font.PdfEncodings;
import com.itextpdf.kernel.font.PdfFont;
import com.itextpdf.kernel.font.PdfFontFactory;
import com.itextpdf.kernel.pdf.PdfDocument;
import com.itextpdf.kernel.pdf.PdfWriter;
import com.itextpdf.layout.Document;
import com.itextpdf.layout.element.Paragraph;
import com.itextpdf.layout.element.Text;
import com.itextpdf.layout.font.FontProvider;

import java.io.FileInputStream;
import java.io.IOException;

public class PdfHtml_AddEmoji {
    /* IMPORTANT NOTE:
     * If the Typography package is present in the dependencies, it will be called in this class
     * and will require a license file to run.
     */
    public static final String base_uri = "src/main/resources/";
    public static final String font_path = base_uri + "font/NotoEmoji-Regular.ttf";
    public static final String html_path = base_uri + "pdfhtml/it7kb/emoji.html";
    public static final String dest_path = "cmpfiles/sandbox/pdfhtml/cmp_emoji.pdf";

    public static void main(String[] args) throws IOException {
        new PdfHtml_AddEmoji().fromHtml();
    }

    void fromHtml() throws IOException {
        PdfDocument pdf = new PdfDocument(new PdfWriter(dest_path));
        // ConverterProperties carries the FontProvider into the converter.
        ConverterProperties cprop = new ConverterProperties();
        // The FontProvider holds our emoji font.
        FontProvider provider = new FontProvider();
        provider.addFont(font_path, PdfEncodings.IDENTITY_H);
        cprop.setFontProvider(provider);
        // Pass the ConverterProperties as an argument to #convertToPdf().
        try (FileInputStream html = new FileInputStream(html_path)) {
            HtmlConverter.convertToPdf(html, pdf, cprop);
        }
    }

    public void createEmojiDocument() throws IOException {
        PdfDocument pdf = new PdfDocument(new PdfWriter(dest_path));
        Document doc = new Document(pdf);
        // Create a PdfFont with emoji glyphs; IDENTITY_H is needed to address glyphs outside a single-byte encoding.
        PdfFont emoji_font = PdfFontFactory.createFont(font_path, PdfEncodings.IDENTITY_H);
        Paragraph p = new Paragraph();
        p.setFont(emoji_font); // set the font on the Paragraph
        p.setFontSize(15);
        // Convert the Unicode code points to strings and add them to the text.
        p.add(new Text(String.format("Here are some emojis: %s %s",
                encodeCodepoint(0x1F600), encodeCodepoint(0x1F604))));
        doc.add(p); // add the Paragraph to the document
        doc.close();
    }

    private String encodeCodepoint(int codePoint) {
        // Character.toChars returns the UTF-16 code units for the code point:
        // one char for BMP characters, a surrogate pair for supplementary ones such as emojis.
        return new String(Character.toChars(codePoint));
    }
}
C#
using System.IO;
using iText.Html2pdf;
using iText.IO.Font;
using iText.Kernel.Font;
using iText.Kernel.Pdf;
using iText.Layout;
using iText.Layout.Element;
using iText.Layout.Font;

namespace it7ns.samples.sandbox.pdfhtml.it7kb
{
    public class PdfHtml_AddEmoji
    {
        static string base_uri = "../../pdfhtml/it7kb/",
            font_path = base_uri + "OpenSansEmoji.ttf",
            html_path = base_uri + "emoji.html",
            dest = base_uri + "../../cmpfiles/sandbox/pdfhtml/cmp_emoji.pdf";

        public static void Main(string[] args)
        {
            new PdfHtml_AddEmoji().fromHtml();
        }

        public void fromHtml()
        {
            // Write the converted HTML into a PdfDocument rather than directly into a FileStream.
            PdfDocument pdf = new PdfDocument(new PdfWriter(dest));
            ConverterProperties cprop = new ConverterProperties();
            // The FontProvider holds our emoji font.
            FontProvider provider = new FontProvider();
            provider.AddFont(font_path, PdfEncodings.IDENTITY_H);
            cprop.SetFontProvider(provider);
            using (FileStream html = new FileStream(html_path, FileMode.Open))
            {
                HtmlConverter.ConvertToPdf(html, pdf, cprop);
            }
        }

        public void createDoc()
        {
            PdfDocument pdf = new PdfDocument(new PdfWriter(dest));
            Document doc = new Document(pdf);
            // Create an embedded font with emoji glyphs and use it for the whole document.
            PdfFont font = PdfFontFactory.CreateFont(font_path, PdfEncodings.IDENTITY_H, true);
            font.SetSubset(false);
            doc.SetFont(font);
            // The first emoji is written as an escaped surrogate pair, the second comes from the helper.
            Paragraph p = new Paragraph($"\uD83D\uDE06 This text has an emoji, look! -> {encodeCodePoint(0x1F600)} ");
            doc.Add(p);
            doc.Close();
        }

        private string encodeCodePoint(int codepoint)
        {
            // Returns the UTF-16 string (a surrogate pair for emojis) for the code point.
            return char.ConvertFromUtf32(codepoint);
        }
    }
}
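The contents of emoji.html are not shown in this article. As a rough sketch, assuming the page refers to emojis through HTML numeric character references (typing the emojis directly into a UTF-8 file is another option), a small input file could be generated as follows. The class EmojiHtmlSketch, its methods, and the markup it produces are hypothetical and only illustrate the idea.

```java
final class EmojiHtmlSketch {

    /** Returns an HTML numeric character reference, for example &#x1F600; for U+1F600. */
    static String toCharRef(int codePoint) {
        return String.format("&#x%X;", codePoint);
    }

    /** Builds a small HTML page whose paragraph lists the given code points. */
    static String buildHtml(int... codePoints) {
        StringBuilder emojis = new StringBuilder();
        for (int codePoint : codePoints) {
            emojis.append(' ').append(toCharRef(codePoint));
        }
        return "<html><body><p>Here are some emojis:" + emojis + "</p></body></html>";
    }

    public static void main(String[] args) {
        // Prints the page; saving it as emoji.html would give the converter something to read.
        System.out.println(buildHtml(0x1F600, 0x1F604));
    }
}
```

Numeric references keep the HTML source pure ASCII, which avoids surprises when a file is saved or read with the wrong character encoding.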
In the link included below you’ll find more information related to what Unicode values are and how they are used:
https://unicode.org/faq/utf_bom.html