iText 5

How to use a text extraction strategy after applying a location extraction strategy?

I used the following code to extract text from a particular region of a PDF page.

Rectangle rect = new Rectangle(0, 0, 250, 250);
RenderFilter filter = new RegionTextRenderFilter(rect);
FontBasedTextExtractionStrategy strategy = new FontBasedTextExtractionStrategy();
strategy = new FilteredTextRenderListener(new LocationTextExtractionStrategy(), filter); // Throws an error

I want to get the bold text present in that location. Would creating a new method or class called FontBasedTextExtractionStrategy instead of a simple TextExtractionStrategy help?

Posted on StackOverflow on Jul 1, 2014 by Raka


Please take a look at the ParseCustom example. In this example, we create a custom RenderFilter (not a TextExtractionStrategy):

ParseCustom.java: https://github.com/itext/i5js-sandbox/blob/master/src/main/java/sandbox/parse/ParseCustom.java

This filter lets through only text whose PostScript font name ends with Bold or Oblique.
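The name test at the heart of such a filter can be sketched on its own. The class below is a hypothetical helper, not part of iText and not the code from the ParseCustom example; it only assumes that the decision is a plain suffix check on the PostScript font name. In iText 5, a RenderFilter subclass would typically make this check inside allowText(TextRenderInfo), reading the name with renderInfo.getFont().getPostscriptFontName().

Java
```java
// Hypothetical helper (not an iText class): the suffix test a font-based
// RenderFilter could apply to the PostScript font name of each text chunk.
final class FontNameRule {
    private FontNameRule() {}

    // Returns true when the name ends with "Bold" or "Oblique".
    // A subset prefix such as "ABCDEF+" does not matter, because only the
    // end of the name is inspected.
    static boolean isBoldOrOblique(String postscriptFontName) {
        if (postscriptFontName == null) {
            return false;
        }
        return postscriptFontName.endsWith("Bold")
                || postscriptFontName.endsWith("Oblique");
    }

    public static void main(String[] args) {
        System.out.println(isBoldOrOblique("Helvetica-Bold"));      // true
        System.out.println(isBoldOrOblique("ABCDEF+Arial-BoldMT")); // false
    }
}
```

Note that a pure suffix test misses bold fonts whose names carry a trailing tag, such as Arial-BoldMT, so the rule may need adjusting for the fonts actually used in your documents.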

This is how you use this filter:

Java
public void parse(String filename) throws IOException {
    PdfReader reader = new PdfReader(filename);
    Rectangle rect = new Rectangle(36, 750, 559, 806);
    RenderFilter regionFilter = new RegionTextRenderFilter(rect);
    FontRenderFilter fontFilter = new FontRenderFilter();
    TextExtractionStrategy strategy = new FilteredTextRenderListener(
            new LocationTextExtractionStrategy(), regionFilter, fontFilter);
    System.out.println(PdfTextExtractor.getTextFromPage(reader, 1, strategy));
    reader.close();
}

As you can see, we create a FilteredTextRenderListener that takes two filters, a RegionTextRenderFilter and our self-made filter based on the font.
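When several filters are passed, a text chunk reaches the extraction strategy only if every filter allows it. The following is a simplified model of that behavior in plain Java, not iText code: Chunk and FilterChainSketch are hypothetical names, and the region test assumes the chunk is judged by whether its baseline intersects the rectangle.

Java
```java
import java.awt.geom.Rectangle2D;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical stand-in for a text chunk: font name plus baseline endpoints.
final class Chunk {
    final String fontName;
    final float x1, y1, x2, y2;

    Chunk(String fontName, float x1, float y1, float x2, float y2) {
        this.fontName = fontName;
        this.x1 = x1; this.y1 = y1; this.x2 = x2; this.y2 = y2;
    }
}

// Simplified model of a filtered listener: all filters must accept a chunk.
final class FilterChainSketch {
    private FilterChainSketch() {}

    static boolean allowedByAll(Chunk chunk, List<Predicate<Chunk>> filters) {
        for (Predicate<Chunk> filter : filters) {
            if (!filter.test(chunk)) {
                return false; // one rejection is enough to drop the chunk
            }
        }
        return true;
    }

    // Region filter given as lower-left and upper-right corners (assumption:
    // a chunk counts as inside when its baseline intersects the rectangle).
    static Predicate<Chunk> region(float llx, float lly, float urx, float ury) {
        Rectangle2D rect = new Rectangle2D.Float(llx, lly, urx - llx, ury - lly);
        return c -> rect.intersectsLine(c.x1, c.y1, c.x2, c.y2);
    }

    // Font filter based on the end of the PostScript font name.
    static Predicate<Chunk> boldOrOblique() {
        return c -> c.fontName.endsWith("Bold") || c.fontName.endsWith("Oblique");
    }

    public static void main(String[] args) {
        List<Predicate<Chunk>> filters =
                Arrays.asList(region(36, 750, 559, 806), boldOrOblique());
        Chunk boldInside = new Chunk("Helvetica-Bold", 40, 780, 120, 780);
        Chunk regularInside = new Chunk("Helvetica", 40, 780, 120, 780);
        Chunk boldOutside = new Chunk("Helvetica-Bold", 40, 100, 120, 100);
        System.out.println(allowedByAll(boldInside, filters));    // true
        System.out.println(allowedByAll(regularInside, filters)); // false
        System.out.println(allowedByAll(boldOutside, filters));   // false
    }
}
```

In this model the order of the filters does not change the result, only which filter rejects a chunk first.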
