How do I get a section's target page number in a PDF file using iTextSharp?


Question

I have a PDF file with an index page whose entries link to sections on target pages. I can get the section name (Section 1.1, Section 5.2), but I cannot get the target page number.

For example: http://www.mikesdotnetting.com/Article/84/iTextSharp-Links-and-Bookmarks

Here is my code:

string FileName = AppDomain.CurrentDomain.BaseDirectory + "TestPDF.pdf";
PdfReader pdfreader = new PdfReader(FileName);
PdfDictionary PageDictionary = pdfreader.GetPageN(9);
PdfArray Annots = PageDictionary.GetAsArray(PdfName.ANNOTS);       
if ((Annots == null) || (Annots.Length == 0))
    return;

foreach (PdfObject oAnnot in Annots.ArrayList)
{
    PdfDictionary AnnotationDictionary = (PdfDictionary)PdfReader.GetPdfObject(oAnnot);          

    if (AnnotationDictionary.Keys.Contains(PdfName.A))
    {
        PdfDictionary oALink = AnnotationDictionary.GetAsDict(PdfName.A);

        if (oALink.Get(PdfName.S).Equals(PdfName.GOTO))
        {
            if (oALink.Keys.Contains(PdfName.D))
            {
                PdfObject objs = oALink.Get(PdfName.D);
                if (objs.IsString())
                {
                    string SectionName = objs.ToString(); // here i could see the section name...
                }
            }
        }
    }
}

How do I get the target page number?

Also, for some PDFs I cannot access the section name at all, for example: http://wwwimages.adobe.com/www.adobe.com/content/dam/Adobe/en/devnet/pdf/pdfs/adobe_supplement_iso32000.pdf

In this PDF, the 9th page contains a section that I could not get. Please suggest a solution.

Recommended answer

There are two possible types of link annotation: one uses an A entry and the other uses a Dest entry. A is the more powerful type but is often overkill. The Dest type just specifies an indirect reference to a page along with some fitting and zooming options.

The Dest value can be a couple of different things, but it is usually (as far as I've ever seen) a named string destination. You can look up named destinations in the document's named destination dictionary. So, before your main loop, add this so that it can be referenced later:

//Get all existing named destinations
Dictionary<string, PdfObject> dests = pdfreader.GetNamedDestinationFromStrings();

Once you've got the Dest as a string, you can look it up as a key in the above dictionary.

PdfArray thisDest = (PdfArray)dests[AnnotationDictionary.GetAsString(PdfName.DEST).ToString()];

The first item in the returned array is the indirect reference that you're used to. (Actually, the first item could be an integer representing a page number in a remote document, so you might have to check for that.)

PdfIndirectReference a = (PdfIndirectReference)thisDest[0];
PdfObject thisPage = PdfReader.GetPdfObject(a);
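
The snippet above stops at the page object, while the question asks for a page number. One way to bridge that gap, offered as a minimal sketch rather than the answerer's method, is to compare the destination's indirect reference with each page's original reference. FindPageNumber and PageNumberFinder are hypothetical names introduced here; the sketch assumes iTextSharp 5.x and a destination that points into the same document.

```csharp
using iTextSharp.text.pdf;

public static class PageNumberFinder
{
    // Hypothetical helper, not part of iTextSharp.
    // Returns the 1-based number of the page that pageRef points to,
    // or -1 if no page in the document matches.
    public static int FindPageNumber(PdfReader reader, PdfIndirectReference pageRef)
    {
        for (int i = 1; i <= reader.NumberOfPages; i++)
        {
            // GetPageOrigRef returns the indirect reference of page i
            PdfIndirectReference candidate = reader.GetPageOrigRef(i);
            if (candidate.Number == pageRef.Number &&
                candidate.Generation == pageRef.Generation)
            {
                return i;
            }
        }
        return -1;
    }
}
```

With the variables from the snippet above, the call would be `int pageNumber = PageNumberFinder.FindPageNumber(pdfreader, a);`. The scan is linear in the page count, so for a document with many links it may be worth building a lookup from object number to page number once and reusing it.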

Below is code that puts most of the above together, omitting some of the code that you already have. A and Dest are mutually exclusive per the spec, so no annotation should ever have both specified.

//Get all existing named destinations
Dictionary<string, PdfObject> dests = pdfreader.GetNamedDestinationFromStrings();

foreach (PdfObject oAnnot in Annots.ArrayList) {
    PdfDictionary AnnotationDictionary = (PdfDictionary)PdfReader.GetPdfObject(oAnnot);

    if (AnnotationDictionary.Get(PdfName.SUBTYPE).Equals(PdfName.LINK)) {
        if (AnnotationDictionary.Contains(PdfName.A)) {
            //...Do normal A stuff here
        } else if (AnnotationDictionary.Contains(PdfName.DEST)) {
            if (AnnotationDictionary.Get(PdfName.DEST).IsString()) {//Named-based destination
                if (dests.ContainsKey(AnnotationDictionary.GetAsString(PdfName.DEST).ToString())) {//See if it exists in the global name dictionary
                    PdfArray thisDest = (PdfArray)dests[AnnotationDictionary.GetAsString(PdfName.DEST).ToString()];//Get the destination
                    PdfIndirectReference a = (PdfIndirectReference)thisDest[0];//TODO, this could actually be an integer for the case of Remote Destinations
                    PdfObject thisPage = PdfReader.GetPdfObject(a);//Get the actual PDF object
                }
            } else if(AnnotationDictionary.Get(PdfName.DEST).IsArray()) {
                //Technically possible, I think the array matches the code directly above but I don't have a sample PDF
            }
        }
    }
}
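
The loop above leaves two cases open: the TODO about a first array element that is an integer rather than a reference, and the branch where Dest is already an array. The sketch below shows one possible way to cover both in a single place. GetPageReference and DestinationHelper are hypothetical names, the sketch assumes iTextSharp 5.x, and it has not been checked against a PDF that uses explicit array destinations. Name-type destinations (a PdfName instead of a string) are not handled.

```csharp
using System.Collections.Generic;
using iTextSharp.text.pdf;

public static class DestinationHelper
{
    // Hypothetical helper, not part of iTextSharp.
    // Accepts a destination that is either a named string or an explicit
    // array, and returns the page reference it starts with. Returns null
    // when the name is unknown or the first element is not an indirect
    // reference (for example, an integer page index of a remote document).
    public static PdfIndirectReference GetPageReference(
        PdfObject dest, Dictionary<string, PdfObject> namedDests)
    {
        if (dest == null)
            return null;

        PdfArray destArray = null;
        if (dest.IsString())
        {
            // Named destination: look the string up in the name dictionary
            PdfObject found;
            if (namedDests.TryGetValue(dest.ToString(), out found) &&
                found != null && found.IsArray())
            {
                destArray = (PdfArray)found;
            }
        }
        else if (dest.IsArray())
        {
            // Explicit destination: the array is used directly
            destArray = (PdfArray)dest;
        }

        if (destArray == null || destArray.Size == 0)
            return null;

        PdfObject first = destArray[0];
        if (first != null && first.IsIndirect())
            return (PdfIndirectReference)first;

        return null;
    }
}
```

Inside the loop this could be called as `DestinationHelper.GetPageReference(AnnotationDictionary.GetDirectObject(PdfName.DEST), dests)`. Because the D entry of a GoTo action is also a destination under the PDF specification, the same call should work for `oALink.GetDirectObject(PdfName.D)` in the A branch.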
