Release iText 5.2.0
Release Notes:
IMPORTANT
All 5.2.x versions have been removed from our servers because of a serious flaw that was introduced when dealing with large PDFs!
We announced a new iText release for March, but as 2012 is a leap year, we decided to release on February 29th. Otherwise we'd have to wait for another four years before we have the chance to release on such a special day.
The previous release was iText 5.1.3 (dated 11-11-11, another special day) and we've been working with iText 5.1.4-SNAPSHOT for a long time now, but eventually we decided not to release a version 5.1.4, but to skip to version 5.2.0.
The philosophy behind the version numbers is that you don't have to change any of your existing code when you upgrade from version x.y.z to version x.y.z+1. When we move from version x.y.z to version x.y+1.z, you may need to adapt some of your code.
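The numbering rule above can be expressed as a small check. This is only a sketch: UpgradeCheck and its methods are hypothetical names and not part of iText, and treating a change in x the same way as a change in y is an assumption, since the rule above only mentions y and z.

```java
// Hypothetical helper, not part of iText: classifies an upgrade between two
// x.y.z version strings according to the numbering philosophy described above.
final class UpgradeCheck {

    enum Impact { NONE, MAY_NEED_CODE_CHANGES }

    // Parses "5.1.3" or "5.1.4-SNAPSHOT" into three numbers.
    static int[] parse(String version) {
        String core = version.split("-", 2)[0]; // drop a qualifier such as -SNAPSHOT
        String[] parts = core.split("\\.");
        if (parts.length != 3) {
            throw new IllegalArgumentException("Expected x.y.z but got: " + version);
        }
        int[] numbers = new int[3];
        for (int i = 0; i < 3; i++) {
            numbers[i] = Integer.parseInt(parts[i]);
        }
        return numbers;
    }

    // Only z changed: existing code should keep working.
    // x or y changed: some code may have to be adapted
    // (the x case is an assumption, see the text above).
    static Impact impact(String from, String to) {
        int[] a = parse(from);
        int[] b = parse(to);
        boolean sameMajorMinor = a[0] == b[0] && a[1] == b[1];
        return sameMajorMinor ? Impact.NONE : Impact.MAY_NEED_CODE_CHANGES;
    }

    public static void main(String[] args) {
        System.out.println(impact("5.1.2", "5.1.3")); // NONE
        System.out.println(impact("5.1.3", "5.2.0")); // MAY_NEED_CODE_CHANGES
    }
}
```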
In this case, you will need to replace your old itext-asian.jar with a new one, otherwise your code using CJKFont won't work. You'll find it in extrajars-2.2.zip. You'll also need to apply small changes (nothing more than changing package names) if your application depends on java.awt classes such as PdfGraphics2D. We have been experimenting with iText on Google Android and Google App Engine, and we reduced the dependency of iText on java.awt classes to a minimum.
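Since the change amounts to new package names, the migration can be pictured as a rewrite of import statements. The sketch below is a hypothetical helper, not something shipped with iText. The new package names are the ones given in these release notes; the old com.itextpdf.text.pdf locations are an assumption about the 5.1.x layout and should be checked against your own code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical migration helper, not shipped with iText.
final class ImportMigration {

    // Old name -> new name. The new packages come from the release notes;
    // the old com.itextpdf.text.pdf locations are an ASSUMPTION about 5.1.x.
    static final Map<String, String> RENAMES = new LinkedHashMap<>();
    static {
        RENAMES.put("com.itextpdf.text.pdf.PdfGraphics2D", "com.itextpdf.awt.PdfGraphics2D");
        RENAMES.put("com.itextpdf.text.pdf.FontMapper", "com.itextpdf.awt.FontMapper");
        RENAMES.put("java.awt.geom.AffineTransform", "com.itextpdf.awt.geom.AffineTransform");
    }

    // Rewrites only matching import statements; everything else is left untouched.
    static String migrate(String source) {
        String result = source;
        for (Map.Entry<String, String> e : RENAMES.entrySet()) {
            result = result.replace("import " + e.getKey() + ";", "import " + e.getValue() + ";");
        }
        return result;
    }

    public static void main(String[] args) {
        String before = "import com.itextpdf.text.pdf.PdfGraphics2D;\n"
                + "import java.awt.geom.AffineTransform;\n";
        System.out.print(migrate(before));
    }
}
```

Code that hands an AffineTransform to java.awt APIs such as Graphics2D would still need the java.awt.geom class, so a blind rewrite like this one is a starting point rather than a complete migration.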
What else is new?
We focused on two major fields:
iText 5.2.0: better PDF parsing
We received plenty of feedback regarding PDF parsing, and we've taken into account almost all the issues that were reported. This means that PDF to text conversion with iText has now improved dramatically. Soon the Belgian IRS will start using iText to parse thousands of documents looking for a national number on the first page.
We're using different strategies to do this: we parse the text at a specific position if we know it, or we parse the whole page looking for a pattern if the number can be anywhere on the page. We've also improved the parsing of PDF documents in languages such as Chinese, Korean, and Japanese.
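The second strategy, searching the whole page for a pattern, could look roughly like the sketch below. It works on a plain string that is assumed to hold the text already extracted from the first page. NationalNumberFinder is a hypothetical name, and the number format (11 digits, optionally written as YY.MM.DD-XXX.XX) is an assumption that is not taken from these release notes.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: looks for a national number anywhere in the page text.
final class NationalNumberFinder {

    // ASSUMPTION: 11 digits, optionally written as YY.MM.DD-XXX.XX.
    // The lookarounds prevent matching inside a longer run of digits.
    private static final Pattern NUMBER = Pattern.compile(
            "(?<!\\d)(\\d{2})\\.?(\\d{2})\\.?(\\d{2})-?(\\d{3})\\.?(\\d{2})(?!\\d)");

    // pageText stands for the text extracted from the first page of the PDF.
    static Optional<String> find(String pageText) {
        Matcher m = NUMBER.matcher(pageText);
        if (!m.find()) {
            return Optional.empty();
        }
        // Return the digits only, without separators.
        return Optional.of(m.group(1) + m.group(2) + m.group(3) + m.group(4) + m.group(5));
    }

    public static void main(String[] args) {
        System.out.println(find("Rijksregisternummer: 85.07.30-033.28").orElse("not found"));
    }
}
```

When the position of the number is known, the same matching could be applied to the text of that region only, which keeps unrelated digits elsewhere on the page out of the search.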
XML Worker 1.1.2: better HTML rendering
We received plenty of feedback regarding HTML parsing, and we've taken into account a lot of the issues that were reported. This means that HTML to PDF conversion with iText has improved dramatically. We still don't offer URL2PDF conversion: for instance, float still isn't supported, but version 1.1.2 of XML Worker does a much better job at converting flowing HTML to PDF.
Besides the two major areas of interest of this release, we also introduced experimental SVG parsing, we filled some gaps regarding PAdES, we now support PDF files of over 2 GB (up to 10 GB for traditional PDFs and up to 1 TB for PDFs with a cross-reference stream), and we fixed some bugs.
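The size limits mentioned above follow from how byte offsets are stored. A signed 32-bit offset stops just under 2 GB, an entry in a classic cross-reference table holds a 10-digit decimal offset, and a cross-reference stream stores offsets in binary fields. The sketch below only illustrates the arithmetic: XrefLimits is a hypothetical class, and the 5-byte field width is an assumption chosen because 2^40 bytes equals 1 TB, not something stated in these release notes.

```java
import java.util.Locale;

// Hypothetical illustration of the offset arithmetic; not iText code.
final class XrefLimits {

    // A classic cross-reference entry is 20 bytes: a 10-digit byte offset,
    // a 5-digit generation number, the keyword n (in use), and a 2-byte line end.
    static String classicEntry(long offset, int generation) {
        if (offset < 0 || offset > 9_999_999_999L) {
            throw new IllegalArgumentException("Offset does not fit in 10 digits: " + offset);
        }
        return String.format(Locale.ROOT, "%010d %05d n \n", offset, generation);
    }

    // Largest offset that fits in a binary field of the given width in bytes,
    // as used for the fields of a cross-reference stream.
    static long maxOffsetForWidth(int bytes) {
        return (1L << (8 * bytes)) - 1;
    }

    public static void main(String[] args) {
        System.out.println(Integer.MAX_VALUE);              // 2147483647, just under 2 GB
        System.out.print(classicEntry(5_000_000_000L, 0));  // 5000000000 00000 n
        System.out.println(maxOffsetForWidth(5));           // 1099511627775, just under 1 TB
    }
}
```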
Changelog:
IMPORTANT: READ THIS BEFORE YOU UPGRADE!
- If you use CJK fonts in your existing code, you will need to update the itext-asian.jar. You'll find this jar in extrajars-2.2.zip.
- If you use AWT classes such as AffineTransform, you should switch to using the classes in package com.itextpdf.awt.geom.
- If you use AWT-related classes such as PdfGraphics2D in your existing code, you'll have to make a minor change to your code. This class has moved to another package: com.itextpdf.awt.
iText 5.2.0
- Changes made by Paulo Soares
- Digital signatures: Encapsulation of the basic OCSP response and correction for the CRL inclusion.
- Support for PAdES-LVT timestamp verification.
- Support digests in timestamps other than SHA-1.
- Unification of cmap handling. CJK fonts support all the encodings.
- Support for big PDFs over 2GB; you can now create 10GB PDFs with a classic cross-reference table and PDFs as big as 1TB with a cross-reference stream. (Suggestions by Welman Jordan)
- Added classes to Map in LtvTimestamp (generics).
- Replaced escape-method in SimpleNamedDestination
- PDF Parsing:
- Made the getFont() method in PdfContentStreamProcessor private
- Text extraction with CJK encodings such as GBK-EUC-H is now possible.
- Several fixes when reading documents with fonts using the /ToUnicode entry.
- Fix for strange numbers such as --234
- Resource dictionaries may have direct fonts.
- Changes made by Kevin Day
- PdfReader and related classes:
- Better error messages and better handling of zero-sized files and of attempts to read past the end of the file.
- Removed restriction that using memory mapping requires the file be smaller than ~2GB.
- Avoid NullPointerException in RandomAccessFileOrArray
- PDF parsing:
- Made a utility method in PdfContentStreamProcessor private and clarified the stateful nature of the class.
- LocationTextExtractionStrategy: bounds checking on string lengths and refactoring to make code easier to read.
- Better handling of color space dictionaries in images.
- Improved handling of quasi improper inline image content.
- Don't decode inline image streams until we absolutely need them.
- Avoid NullPointerException if resource dictionary isn't provided.
- Changes made by Eugene Markovskyi
- FontWeight is added to font descriptor of DocumentFont.
- Bugfix PRAcroForm: avoid NullPointerException
- Bugfix ColumnText: Image position should be shifted on descent of previous line.
- Bugfix BidiLine: Taking into account percentage width of LineSeparator.
- Changes made by Alexander Chingarev
- PdfName: Added FontFamily tag.
- XfaForm: Fixed bug in XFA forms filling.
- Changes made by Bruno
- Making iText more fool-proof: it's forbidden to construct, stroke or fill paths inside a text object.
- AWT-related changes to simplify creating the Android/GAE port of iText:
- Bugfix by Ivan Farkas: Avoiding a NullPointerException in PdfStamperImp
- Moved AWT related methods to the bottom of the source code of several class files (PdfContentByte, Barcode, Image, PdfImageObject).
- Introduced Apache Harmony classes in a package com.itextpdf.awt.geom.
- Removed several dependencies on AWT classes such as java.awt.Rectangle and java.awt.geom.AffineTransform.
- Moved PdfGraphics2D, FontMapper, and related classes to package com.itextpdf.awt.
- PDF Parsing: The RegionTextRenderFilter now works with com.itextpdf.text.Rectangle.
. - PDF Parsing: It doesn't make sense to take zero length text into account; change made after Adam Read reported a StringIndexOutOfBoundsException on the mailing list (December 5, 2011).
- PdfConcatenate: removing a System.out.println() (originally added for debugging).
- Suggestion by Martin Pallmann to move the IllegalArgumentException out of the try/catch in ICC_Profile.
XML Worker 1.1.2
- Changes made by Balder Van Camp
- Fix indentation of ordered lists: lists are set to autoindent if they are ordered; otherwise the numbering would overwrite the list item's text (bug reported by Stephen Bell on the mailing list for the iText C# version; a fix proposed by Keith O was adapted and added).
- Some javadoc fixes.
- Create abstraction for CssAppliers allowing developers to write their own CssAppliers class and in turn write their own CssApplier. The CssAppliers.getInstance() method has been removed in favor of injection into tag processors through the CssAppliersAware interface. The injection is done in the HtmlPipeline, and a custom CssAppliers implementation can be set in the HtmlPipelineContext. If it's not set, the default CssAppliersImpl is used. This code change should not affect users, but it can affect classes that override current implementations.
- Remove quotes from font family names (based on http://itext-general.2136553.n4.nabble.com/XMLWorker-HTML-to-PDF-problem-with-external-css-td4373089.html)
- Changes made by Jeroen Nouws
- Removed bug causing XMLWorker to crash when trying to parse Headers inside TableData.
- Changes made by Eugene Markovskyi
- Using UNDEFINED value as default for font and color properties. Default leading is NaN.
- Applying font properties to paragraph
- The logic of max leading of paragraph is disabled (iText based logic should use multiplier leading of Paragraph).
- Clean up default margin properties for correct merging of paragraph css styles and para element attributes.
- Separate method for applying font-dependent CSS styles.
- Fixed uppercase/lowercase problems (using equalsIgnoreCase() and introducing the method CssUtils.stripDoubleSpacesTrimAndToLowerCase())
- Fixed RuntimeException: better handling of invalid nested tags
- Improved parsing of processing instructions.
- FontFactory versus FontProvider: introduced XMLWorkerFontProvider
- Introduction of the class LineHeightCalculator.
- Introduction of "valign" and "align" in the HTML class.
- Fixed several alignment, row height and line size issues in table cells.
- Fixed issues with font styles that weren't applied correctly.
- Fixed page break issues.
- ColumnText does not support resizing of image height. If a cell has a fixed height, an image with larger height disappeared from this cell.
- Fixed white space issues.
- Several code optimizations.
- Changes made by Bruno Lowagie
- Introduced experimental code to parse SVG to a PdfTemplate. This works for tiger.svg, but it certainly doesn't work for all SVG files yet. This is a code contribution by VVB who wants to remain anonymous. The code was slightly adapted by Bruno.
- Removed dependencies on java.awt.
- When a tag like this is encountered:
: XML Worker tries to make an absolute value of the width. This has now been fixed.
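
The uppercase/lowercase fix listed under the XML Worker changes relies on case-insensitive comparison and on a normalizing method in CssUtils. The sketch below shows what such a normalization might do, judging only from the method's name. CssNormalizeSketch is a hypothetical class and is not the actual XML Worker implementation; in particular, collapsing every run of whitespace (not just double spaces) is an assumption.

```java
import java.util.Locale;

// Hypothetical sketch; NOT the actual CssUtils code from XML Worker.
final class CssNormalizeSketch {

    // Trims the value, collapses runs of whitespace to a single space,
    // and lowercases the result independently of the default locale.
    static String normalize(String value) {
        return value.trim().replaceAll("\\s+", " ").toLowerCase(Locale.ROOT);
    }

    // Case-insensitive comparison, so that "TD" and "td" name the same tag.
    static boolean sameName(String a, String b) {
        return a.equalsIgnoreCase(b);
    }

    public static void main(String[] args) {
        System.out.println(normalize("  Times  New   ROMAN ")); // times new roman
        System.out.println(sameName("TD", "td"));               // true
    }
}
```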