Does my HTML have to be valid XML?

If you are still using iText 5 and XML Worker, you have to provide XHTML. For instance, a single <br> isn't allowed in your HTML; you need to write <br /> instead. All tags need to be closed, and tags need to be nested correctly. To work around incomplete HTML syntax, we advised using jsoup to tidy up the HTML before converting it to PDF with XML Worker.
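To illustrate what "valid XML" means in practice, here is a minimal sketch that checks whether a fragment is well-formed XML. It uses the XML parser that ships with the JDK as a stand-in, not XML Worker's own parser, and the class and method names (XmlWellFormedCheck, isWellFormedXml) are hypothetical, chosen only for this illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: uses the JDK's XML parser, not XML Worker itself.
class XmlWellFormedCheck {

    /** Returns true if the markup parses as well-formed XML, false otherwise. */
    static boolean isWellFormedXml(String markup) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(markup)));
            return true;
        } catch (SAXException e) {
            // Unclosed or wrongly nested tags end up here.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // HTML-style line break: the <br> element is never closed.
        System.out.println(isWellFormedXml("<p>Hello<br></p>"));   // prints false
        // XHTML-style line break: the element closes itself.
        System.out.println(isWellFormedXml("<p>Hello<br /></p>")); // prints true
    }
}
```

A strict XML parser rejects the first fragment outright, which is why HTML of that kind had to be tidied up before it could be handed to an XML-based converter.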

This is no longer necessary with pdfHTML. We have integrated jsoup into the pdfHTML add-on, so you don't need to call it separately. Every HTML file is cleaned up before it is converted to PDF. Take, for example, the HTML file incomplete.html:

<head><title>Test incomplete HTML</title></head>
<p>Hello World
<p>Hello Universe
<img src="img/logo.png" alt="iText logo">

It doesn't have any <body> tags, and the <h1>, <p>, <br>, and <img> tags are never closed. This is a mighty incomplete HTML file, but a browser renders it anyway, and so does pdfHTML.

Incomplete HTML rendered in a browser and as PDF

You can try this for yourself by running the C07E07_IncompleteHTML (Java/.NET) example.
