The pdf2Data Editor lets you use expert mode for selectors. It is so called because it gives you extra flexibility, but it also requires deeper knowledge of how to build an extraction pipeline.
This article assumes you already know how to edit a data field in expert mode.
Expert mode selectors
A few pdf2Data selectors are available only in expert mode.
Table frequency selector
tableFreq: selectCell=1;2, selectRow=1:2, selectColumn=2:2
Uses text frequency analysis to detect table cells, and might work better than the default Table selector for borderless tables.
The selectCell, selectRow and selectColumn parameters are optional. They specify the row and column numbers (or ranges, using start:end syntax) when only a part of the table needs to be extracted.
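As an illustration of the start:end range syntax, here is a minimal Python sketch of how selectRow and selectColumn values could narrow a detected table. It is not pdf2Data code: the functions parse_range and select_sub_table are hypothetical, and the sketch assumes that numbering is 1-based and that both ends of a range are inclusive. selectCell is left out on purpose.

```python
def parse_range(spec):
    """Parse 'N' or 'start:end' into an inclusive (start, end) pair."""
    start, _, end = spec.partition(":")
    start = int(start)
    end = int(end) if end else start
    if start < 1 or end < start:
        raise ValueError(f"invalid range: {spec!r}")
    return start, end


def select_sub_table(table, select_row=None, select_column=None):
    """Return the part of `table` (a list of rows) picked by the two ranges.

    Assumption: numbering is 1-based and both range ends are inclusive.
    A range that is None keeps every row or every column.
    """
    if select_row is None:
        rows = table
    else:
        first_row, last_row = parse_range(select_row)
        rows = table[first_row - 1:last_row]
    if select_column is None:
        return [list(row) for row in rows]
    first_col, last_col = parse_range(select_column)
    return [list(row[first_col - 1:last_col]) for row in rows]


# Made-up sample table, used only to exercise the sketch.
table = [
    ["Item", "Qty", "Price"],
    ["Bolt", "10", "0.25"],
    ["Nut", "12", "0.10"],
]
print(select_sub_table(table, select_row="1:2", select_column="2:2"))
```

With selectRow=1:2 and selectColumn=2:2, the sketch keeps only the second column of the first two rows.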
Grouping is used to structure the XML output by combining the detected data fields into groups.
FIELD_NAME is the name of any other field in the template.
This selector places every instance of the current data field inside the preceding data field, where "preceding" means earlier in vertical, top-to-bottom order.
See the article on grouping for more details.
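The "preceding data field" rule can be pictured with a small Python sketch. It is only an illustration under stated assumptions, not the pdf2Data implementation: the function group_under_preceding is hypothetical, each field instance is reduced to a (top, value) pair where top is an assumed distance from the top of the page, and the parent instances are assumed to be the instances of the field named by FIELD_NAME.

```python
from collections import defaultdict


def group_under_preceding(parents, children):
    """Attach each child to the closest parent located above it.

    Both arguments are lists of (top, value) pairs; a smaller `top`
    means the instance sits higher on the page (an assumption of this
    sketch). Children with no parent above them go under the key None.
    """
    ordered_parents = sorted(parents)
    groups = defaultdict(list)
    for top, value in sorted(children):
        owner = None
        for parent_top, parent_value in ordered_parents:
            if parent_top <= top:
                owner = parent_value  # latest parent seen above the child
            else:
                break
        groups[owner].append(value)
    return dict(groups)


# Made-up positions and values, used only to exercise the sketch.
parents = [(100, "Invoice A"), (400, "Invoice B")]
children = [(150, "line 1"), (200, "line 2"), (450, "line 3")]
print(group_under_preceding(parents, children))
```

Here the first two child instances fall under the first parent and the third under the second, which mirrors how grouped fields end up nested in the structured output.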
Font size selector (expert)
fontSize: minSize=X, maxSize=Y
Unlike the standard Font size selector, it selects all characters whose font size lies between minSize and maxSize, using whichever of these parameters are present. The font size of the text inside the field region is ignored.
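The filtering behaviour described above can be sketched in Python as follows. This is an illustration rather than pdf2Data code: the function select_by_font_size is hypothetical, characters are modelled as (character, font_size) pairs, and the sketch assumes that the bounds are inclusive and that a missing bound is simply not applied.

```python
def select_by_font_size(chars, min_size=None, max_size=None):
    """Keep the characters whose font size lies within the given bounds.

    `chars` is a list of (character, font_size) pairs. A bound left as
    None is not applied. Assumption: both bounds are inclusive.
    """
    selected = []
    for char, size in chars:
        if min_size is not None and size < min_size:
            continue
        if max_size is not None and size > max_size:
            continue
        selected.append(char)
    return "".join(selected)


# Made-up characters and sizes, used only to exercise the sketch.
chars = [("T", 18.0), ("o", 18.0), ("t", 9.0), ("a", 9.0), ("l", 12.0)]
print(select_by_font_size(chars, min_size=10, max_size=20))
```

Note that the bounds come only from the parameters; nothing is derived from the text in the field region, which matches the description above.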
All pdf2Data selectors can be used in expert mode with special keywords, and some of them also accept additional parameters that affect accuracy. See the page for a particular selector to learn how to use it in expert mode.