How to remove text from a PDF?
I want the text to be removed, not merely covered.
Posted on StackOverflow on Jan 12, 2015 by amar
Please take a look at the RemoveContentInRectangle example.
Let's say we have the following page:
Original PDF
Now we want to remove all the text in the rectangle defined by the coordinates llx = 97, lly = 405, urx = 480, ury = 445 (where ll stands for lower-left and ur stands for upper-right).
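To make the coordinate convention concrete, here is a minimal sketch in plain Java with no iText dependency. The class `CleanUpArea` and its methods are hypothetical names introduced only for illustration. It assumes the usual PDF user space, where coordinates are measured in points (1/72 inch) and y increases upward from the bottom of the page.

```java
// Hypothetical helper, not part of iText: models a rectangle given by its
// lower-left (llx, lly) and upper-right (urx, ury) corners in PDF points.
final class CleanUpArea {
    private final float llx, lly, urx, ury;

    public CleanUpArea(float llx, float lly, float urx, float ury) {
        this.llx = llx;
        this.lly = lly;
        this.urx = urx;
        this.ury = ury;
    }

    // Horizontal extent of the area, in points.
    public float width() {
        return urx - llx;
    }

    // Vertical extent of the area, in points.
    public float height() {
        return ury - lly;
    }

    // True if the point (x, y) lies inside the rectangle or on its border.
    public boolean contains(float x, float y) {
        return x >= llx && x <= urx && y >= lly && y <= ury;
    }

    public static void main(String[] args) {
        CleanUpArea area = new CleanUpArea(97f, 405f, 480f, 445f);
        System.out.println(area.width() + " x " + area.height()); // prints "383.0 x 40.0"
    }
}
```

The rectangle in this example is therefore a band 383 points wide and 40 points high. A point such as (100, 410) falls inside it, while (100, 500) lies above it.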
We can now use the following code:
public void manipulatePdf(String src, String dest)
        throws IOException, DocumentException {
    PdfReader reader = new PdfReader(src);
    PdfStamper stamper = new PdfStamper(reader, new FileOutputStream(dest));
    // Each location names a page, the area to clean, and the fill color
    // that marks where content was removed.
    List<PdfCleanUpLocation> cleanUpLocations =
            new ArrayList<PdfCleanUpLocation>();
    cleanUpLocations.add(new PdfCleanUpLocation(
            1, new Rectangle(97f, 405f, 480f, 445f), BaseColor.GRAY));
    PdfCleanUpProcessor cleaner =
            new PdfCleanUpProcessor(cleanUpLocations, stamper);
    cleaner.cleanUp();
    // Closing the stamper writes the cleaned document to dest.
    stamper.close();
    reader.close();
}
As you can see, we define a list of PdfCleanUpLocation objects. To this list, we add a PdfCleanUpLocation, passing the page number, a Rectangle defining the area we want to clean up, and a color that will show the area where content has been removed.
We then pass this list of PdfCleanUpLocation objects to the PdfCleanUpProcessor along with the PdfStamper instance. We invoke the cleanUp() method, and when we close the PdfStamper instance, we get the following result:
Text in gray area has been completely removed
You can inspect this file: you will no longer be able to select any text in the gray area, because all the text inside that rectangle has been removed. Note that this code sample only works if you add itext-xtra.jar to your CLASSPATH (itext-xtra is shipped with iText core), and it requires iText 5.5.4 or later.