pdfHTML: Support for overflow-wrap, word-break CSS Properties
Background:
iText Core 7.1.14 and pdfHTML 3.0.3 add support for the overflow-wrap and word-break CSS properties, alongside many other feature additions and bug fixes. Both properties determine whether and how line breaks are inserted within text elements in your HTML.
overflow-wrap
Background:
Previously referred to as "word-wrap," overflow-wrap is a CSS property that specifies whether a browser may break a line inside an otherwise unbreakable string, such as a long word, to prevent text from overflowing its container. In the following section, we demonstrate the different settings of overflow-wrap along with two related properties, then provide an HTML code snippet, a Java/C# code sample, and an output PDF document for testing purposes.
Feature demonstration:
The overflow-wrap: normal; setting allows line breaks only at natural break opportunities (e.g. at the space between two words). As a result, long words like the one in the example below overflow their container.
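The overflow-wrap: break-word; setting allows an overflowing word to be broken at an arbitrary point when the line contains no other acceptable break point.
The overflow-wrap: anywhere; setting likewise allows line breaks within overflowing words if there are no otherwise acceptable break points within a line. In the CSS specification it differs from break-word only in that these extra break opportunities are also taken into account when an element's min-content size is calculated.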
The word-break: break-all; setting (a value of the word-break property, not of overflow-wrap) goes further than break-word and inserts a line break before any character that would overflow the edge of its text area.
Finally, the hyphens: auto; setting (a separate CSS property) breaks long words at hyphenation points and inserts a hyphen at each such break, as in a printed novel.
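To make the difference between these settings more concrete, here is a rough Java sketch that mimics three of them in a simplified model where every character is exactly one unit wide. It is only an illustration, not how pdfHTML or a browser actually lays out text; the names WrapSketch, Mode, and wrap are hypothetical and not part of any iText API.

```java
import java.util.ArrayList;
import java.util.List;

public class WrapSketch {
    enum Mode { NORMAL, BREAK_WORD, BREAK_ALL }

    /** Greedy line wrapping in a monospace model: width is a number of characters. */
    static List<String> wrap(String text, int width, Mode mode) {
        if (width < 1) {
            throw new IllegalArgumentException("width must be at least 1");
        }
        List<String> lines = new ArrayList<>();
        if (mode == Mode.BREAK_ALL) {
            // break-all: cut before any character that would overflow the line.
            // Simplification: spaces are treated like any other character.
            for (int i = 0; i < text.length(); i += width) {
                lines.add(text.substring(i, Math.min(text.length(), i + width)));
            }
            return lines;
        }
        StringBuilder line = new StringBuilder();
        for (String word : text.trim().split("\\s+")) {
            // Start a new line when the word does not fit after the current content.
            if (line.length() > 0 && line.length() + 1 + word.length() > width) {
                lines.add(line.toString());
                line.setLength(0);
            }
            // break-word: a word that is too long even for an empty line gets split.
            while (mode == Mode.BREAK_WORD && word.length() > width) {
                lines.add(word.substring(0, width));
                word = word.substring(width);
            }
            if (line.length() > 0) {
                line.append(' ');
            }
            // normal: an overlong word is kept whole and simply overflows.
            line.append(word);
        }
        if (line.length() > 0) {
            lines.add(line.toString());
        }
        return lines;
    }

    public static void main(String[] args) {
        String text = "I Supercalifragilistic think";
        for (Mode mode : Mode.values()) {
            System.out.println(mode + " -> " + wrap(text, 10, mode));
        }
    }
}
```

With a width of 10 characters, NORMAL leaves the long word on a single overflowing line, BREAK_WORD splits only that word, and BREAK_ALL fills every line to the limit regardless of word boundaries.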
Reproducible Example:
Below is the source HTML for the above examples:
overflow-wrap HTML + CSS Sample
<!DOCTYPE html>
<html lang="en">
<style>
body {
font-size: 12px;
}
p {
width: 400px;
margin: 2px;
border: 1px solid red;
}
.ow-anywhere {
overflow-wrap: anywhere;
}
.ow-break-word {
overflow-wrap: break-word;
}
.word-break {
word-break: break-all;
}
.hyphens {
hyphens: auto;
}
.wrap-anywhere {
width: 30px;
}
</style>
<head>
<meta charset="UTF-8">
<title>Title</title>
</head>
<body>
<p class="label">1. <i>overflow-wrap: normal;</i></p>
<p>I frolick happily down by the sea shore.
I <i class="normal">Supercalifragilisticexpialidociouslydociouslydociouslydociouslydociously</i>,
think my crabs are over-cooked.</p>
<br>
<p class="label">2. <i>overflow-wrap: break-word;</i></p>
<p>I frolick happily down by the sea shore.
I <i class="ow-break-word">Supercalifragilisticexpialidociouslydociouslydociouslydociouslydociously</i>,
think my crabs are over-cooked.</p>
<br>
<p class="label">3. <i>overflow-wrap: anywhere;</i></p>
<p class="wrap-anywhere ow-anywhere">I frolick happily down by the sea shore.
I
think my crabs are overcooked.</p>
<br>
<p class="label">4. <i>word-break: break-all;</i></p>
<p>I frolick happily down by the sea shore.
I <i class="word-break">Supercalifragilisticexpialidociouslydociouslydociouslydociouslydociously</i>,
think my crabs are over-cooked.</p>
<br>
<p class="label">5. <i>hyphens: auto;</i></p>
<p>I frolick happily down by the sea shore.
I <i class="hyphens">Supercalifragilisticexpialidociouslydociouslydociouslydociouslydociously</i>,
think my crabs are over-cooked.</p>
<br>
</body>
</html>
The following is the code snippet for converting the above HTML snippet to PDF:
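A minimal Java sketch of such a conversion, assuming pdfHTML 3.0.3 or later is on the classpath. The class name OverflowWrapToPdf and the file names overflow-wrap.html and overflow-wrap.pdf are placeholders chosen for illustration:

```java
import com.itextpdf.html2pdf.HtmlConverter;

import java.io.File;
import java.io.IOException;

public class OverflowWrapToPdf {
    // Placeholder paths (assumptions): point these at your own files.
    static final String SRC = "overflow-wrap.html";
    static final String DEST = "overflow-wrap.pdf";

    static void convert(String src, String dest) throws IOException {
        // pdfHTML parses the HTML and its CSS, then writes the laid-out result to the PDF file.
        HtmlConverter.convertToPdf(new File(src), new File(dest));
    }

    public static void main(String[] args) throws IOException {
        convert(SRC, DEST);
    }
}
```

No extra configuration should be needed for this sample, since the overflow-wrap and word-break rules are read from the stylesheet embedded in the HTML.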
Below is the resulting PDF document:
word-break
Background:
The word-break property specifies how line breaks are inserted within text elements. In the following section, we demonstrate the different settings of word-break, then provide an HTML snippet, a Java/C# code sample, and an output PDF document generated with pdfHTML.
Feature demonstration:
The word-break: normal; setting uses the standard line-breaking rules.
The word-break: break-all; setting inserts a line break before any character that would overflow its area.
The word-break: keep-all; setting applies the word-based breaking rules of normal to all text, including CJK (Chinese, Japanese, Korean) characters, so that runs of CJK characters are not broken between arbitrary characters.
The word-break: break-word; setting has the same effect as overflow-wrap: anywhere;.
Reproducible Example:
Below is the source HTML for the above examples:
word-break HTML + CSS Sample
<!DOCTYPE html>
<html lang="en">
<style>
p {
font-family: Arial, Noto Sans JP Medium;
}
.narrow {
padding: 15px;
border: 1px solid red;
width: 400px;
margin: 0 0;
font-size: 20px;
line-height: 1.5;
letter-spacing: 2px;
}
.label {
font-size: 30px;
}
.normal {
word-break: normal;
}
.breakAll {
word-break: break-all;
}
.keepAll {
word-break: keep-all;
}
.breakWord {
word-break: break-word;
}
</style>
<head>
<meta charset="UTF-8">
</head>
<body>
<p class="label">1. <i>word-break: normal</i> </p>
<p class="normal narrow">
Hippopotomonstrosesquippedaliophobia Pneumonoultramicroscopicsilicovolcanoconiosis Supercalifragilisticexpialidociousdociousdociousdociousdocious ?????????????????????????????
</p>
<br>
<p class="label">2. <i>word-break: break-all</i></p>
<p class="breakAll narrow">
Hippopotomonstrosesquippedaliophobia Pneumonoultramicroscopicsilicovolcanoconiosis Supercalifragilisticexpialidociousdociousdociousdociousdocious ?????????????????????????????
</p>
<br>
<p class="label">3. <i>word-break: keep-all</i></p>
<p class="keepAll narrow">
Hippopotomonstrosesquippedaliophobia Pneumonoultramicroscopicsilicovolcanoconiosis Supercalifragilisticexpialidociousdociousdociousdociousdocious ?????????????????????????????
</p>
<br>
<p class="label">4. <i>word-break: break-word</i></p>
<p class="breakWord narrow">
Hippopotomonstrosesquippedaliophobia Pneumonoultramicroscopicsilicovolcanoconiosis Supercalifragilisticexpialidociousdociousdociousdociousdocious ?????????????????????????????
</p>
<br>
</body>
</html>
Below is the code snippet for converting the above HTML snippet to PDF. In order to correctly render the Asian characters in our HTML, we embed the NotoSansJP-medium font (download link).
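A minimal Java sketch of this conversion, assuming pdfHTML 3.0.3 or later is on the classpath and the font file has been downloaded locally. The class name WordBreakToPdf, the file names, and the choice of which built-in font sets to register are assumptions made for illustration:

```java
import com.itextpdf.html2pdf.ConverterProperties;
import com.itextpdf.html2pdf.HtmlConverter;
import com.itextpdf.html2pdf.resolver.font.DefaultFontProvider;
import com.itextpdf.layout.font.FontProvider;

import java.io.File;
import java.io.IOException;

public class WordBreakToPdf {
    static void convert(String src, String dest, String fontPath) throws IOException {
        // Register the standard PDF fonts and the fonts shipped with pdfHTML,
        // but no system fonts (an assumption; adjust the flags as needed).
        FontProvider fontProvider = new DefaultFontProvider(true, true, false);
        // Make the CJK-capable font available to the converter.
        fontProvider.addFont(fontPath);

        ConverterProperties properties = new ConverterProperties();
        properties.setFontProvider(fontProvider);

        HtmlConverter.convertToPdf(new File(src), new File(dest), properties);
    }

    public static void main(String[] args) throws IOException {
        // Placeholder paths (assumptions): point these at your own files.
        convert("word-break.html", "word-break.pdf", "NotoSansJP-Medium.otf");
    }
}
```

The font-family name used in the stylesheet has to match the name under which the registered font is known to the font provider; otherwise the CJK characters may fall back to a font that cannot display them.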
Below is the output PDF: