What is the difference between getPageLabels and getPageLabelFormats?
I have a program that calls PdfPageLabels.getPageLabels() and PdfPageLabels.getPageLabelFormats() on the same PdfReader object on successive lines of my code:
PdfPageLabels.PdfPageLabelFormat[] pplf = PdfPageLabels.getPageLabelFormats(reader);
String[] labs = PdfPageLabels.getPageLabels(reader);
I have an example: a 150 MB PDF file that appears to have 4670 labels according to getPageLabels(), but only 1 according to getPageLabelFormats(). So my question is: under what circumstances could the two calls return arrays of different lengths?
Posted on StackOverflow on Dec 3, 2015 by paulb
The difference between both methods is simple:
- getPageLabels() returns the label of every page in an array. If your PDF has 4670 pages, you will get an array with 4670 String values.
- getPageLabelFormats() returns an array with the formats that are used in the document. It doesn't return String values, but PdfPageLabelFormat instances. In many cases, there is only one page label format used throughout the document.
For example:
You have a document with an intro of five pages, numbered i, ii, iii, iv and v. Then you have a hundred pages, numbered 1 to 100.
In this case, getPageLabels() should return an array with 105 String values. The getPageLabelFormats() method, however, will only return two PdfPageLabelFormat values because we are only using two page label formats:
- one saying that the first physical page starts a range of lowercase roman numbers, starting with i.
- one saying that the sixth physical page starts a range of arabic numbers, starting with 1.
Only the start of each format is needed: physical pages 2 to 5 have the same format as physical page 1, and physical pages 7 to 105 have the same format as physical page 6.
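To make the relationship between the two arrays concrete, here is a minimal sketch in plain Java that expands a short list of range formats into one label per page. It does not use iText: the Format class, the expand() method and the toRoman() helper are hypothetical names made up for this illustration, and the sketch only handles lowercase roman and arabic numbering without prefixes.

```java
public class PageLabelSketch {

    /** Hypothetical stand-in for a page label format: where a range starts and how it is numbered. */
    static final class Format {
        final int physicalPage;  // 1-based physical page on which this format starts
        final boolean roman;     // true = lowercase roman numbers, false = arabic numbers
        final int logicalStart;  // first number used in this range

        Format(int physicalPage, boolean roman, int logicalStart) {
            this.physicalPage = physicalPage;
            this.roman = roman;
            this.logicalStart = logicalStart;
        }
    }

    /** Converts a number between 1 and 3999 to lowercase roman numerals. */
    static String toRoman(int n) {
        int[] values = {1000, 900, 500, 400, 100, 90, 50, 40, 10, 9, 5, 4, 1};
        String[] symbols = {"m", "cm", "d", "cd", "c", "xc", "l", "xl", "x", "ix", "v", "iv", "i"};
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < values.length; i++) {
            while (n >= values[i]) {
                sb.append(symbols[i]);
                n -= values[i];
            }
        }
        return sb.toString();
    }

    /**
     * Expands the formats into one label per physical page.
     * Assumes the formats are sorted by physicalPage and the first one starts at page 1.
     */
    static String[] expand(int pageCount, Format[] formats) {
        String[] labels = new String[pageCount];
        int current = 0; // index of the format that applies to the current page
        for (int page = 1; page <= pageCount; page++) {
            // Move to the next format once its starting page has been reached.
            while (current + 1 < formats.length && formats[current + 1].physicalPage <= page) {
                current++;
            }
            Format f = formats[current];
            int number = f.logicalStart + (page - f.physicalPage);
            labels[page - 1] = f.roman ? toRoman(number) : Integer.toString(number);
        }
        return labels;
    }

    public static void main(String[] args) {
        // The example above: pages 1-5 are i..v, pages 6-105 are 1..100.
        Format[] formats = {new Format(1, true, 1), new Format(6, false, 1)};
        String[] labels = expand(105, formats);
        System.out.println(formats.length + " formats, " + labels.length + " labels");
        System.out.println(labels[0] + " " + labels[4] + " " + labels[5] + " " + labels[104]);
    }
}
```

With the two formats from the example, the sketch produces 105 labels from an array of only 2 formats. A document with 4670 pages and a single numbering scheme would likewise give 4670 labels and 1 format.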