How to find internal links in a PDF file?

I am using iTextSharp to search for internal links in a PDF file. I have already done this for external links:


//Get the current page
PdfDictionary PageDictionary = R.GetPageN(page);
//Get all of the annotations for the current page
PdfArray Annots = PageDictionary.GetAsArray(PdfName.ANNOTS);
//Make sure we have something
if ((Annots == null) || (Annots.Size == 0)) {
    Console.WriteLine("nothing");
}
//Loop through each annotation
if (Annots != null) {
    foreach (PdfObject A in Annots.ArrayList) {
        //Resolve the (possibly indirect) reference to the annotation dictionary
        PdfDictionary AnnotationDictionary =
            (PdfDictionary)PdfReader.GetPdfObject(A);
        //Make sure this annotation has a link
        if (!AnnotationDictionary.Get(PdfName.SUBTYPE).Equals(PdfName.LINK))
            continue;
        //Make sure this annotation has an ACTION
        if (AnnotationDictionary.Get(PdfName.A) == null)
            continue;
        //Get the ACTION for the current annotation
        PdfDictionary AnnotationAction =
            AnnotationDictionary.GetAsDict(PdfName.A);
        // Test if it is a URI action (there are many other types of actions,
        // some of which might mimic URI, such as JavaScript,
        // but those need to be handled separately)
        if (AnnotationAction.Get(PdfName.S).Equals(PdfName.URI)) {
            PdfString Destination = AnnotationAction.GetAsString(PdfName.URI);
            string url1 = Destination.ToString();
        }
    }
}

Posted on StackOverflow on Feb 22, 2014 by Ashwani

You've already done most of the work. Please take a look at the following screenshot:

Internal view of the PDF

The screenshot shows the /Annots array of a page. You are already parsing that array in your code, and you skip all annotations that don't have /Subtype /Link or that don't have an /A key, which is excellent.

Currently, you only look for actions whose /S value is /URI. You say you're already done with external links, but that's not entirely true: you should also look for entries where /S is /GoToR (a remote go-to action, which points into another PDF file). If you want internal links, you need to look for /S values equal to /GoTo, /GoToE, and (in the future) /GoToDp. You may also want to remove the /JavaScript actions, because they can also be used to jump to a specific page.
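
The distinction between these /S values can be sketched as a small lookup. This is only an illustration, not part of iTextSharp: LinkKind, LinkActions, and Classify are hypothetical names, and the helper works on the plain text of the name (for example "/GoTo") so that it does not depend on any library type.

```csharp
using System;

// Hypothetical categories for the /S value of a link action.
public enum LinkKind { External, Internal, JavaScript, Other }

public static class LinkActions
{
    // Hypothetical helper: maps the /S value of an action dictionary,
    // written as a PDF name with its leading slash, to a coarse category.
    public static LinkKind Classify(string actionType)
    {
        switch (actionType)
        {
            case "/URI":        // link to a URI
            case "/GoToR":      // remote go-to: destination in another PDF
                return LinkKind.External;
            case "/GoTo":       // destination in the current document
            case "/GoToE":      // listed as internal in the answer above
            case "/GoToDp":     // listed as a future internal type above
                return LinkKind.Internal;
            case "/JavaScript": // may jump to a page; handle separately
                return LinkKind.JavaScript;
            default:
                return LinkKind.Other;
        }
    }
}
```

Inside the loop from the question, the helper could be called as LinkActions.Classify(AnnotationAction.Get(PdfName.S).ToString()), assuming that ToString() on the name returns it with its leading slash (for example "/GoTo"). Check that assumption against your iTextSharp version before relying on it.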
