How to obtain PDF table of contents (outline) data in iOS (iPad)?


Question

I am building an iPad application that displays PDFs, and I'd like to be able to display the table of contents and let the user navigate to the relevant pages.

I have invested several hours in research at this point, and it appears that since PDFKit is [not supported in iOS], my only option is to parse the PDF metadata manually.

I have looked at several solutions, but all of them are silent on one point - how to associate a page in the "outline" metadata with the real page number of the item. I have examined my PDF document with [the Voyeur tool] and I can see the outline in the tree.

[This solution] helped me figure out how to navigate down the Outline/A/S/D tree to find the "Dest" object, but it performs some kind of object comparison using [self.pages indexOfObjectIdenticalTo:destPageDic] that I don't understand.

I have read the [official PDF spec from adobe], and section "12.3.2.3 Named Destinations" describes the way that an outline entry can point to a page:

Instead of being defined directly with the explicit syntax shown in Table 151, a destination may be referred to indirectly by means of a name object (PDF 1.1) or a byte string (PDF 1.2).

And continues with this line which is utterly incomprehensible to me:

The value of this entry shall be a dictionary in which each key is a destination name and the corresponding value is either an array defining the destination, using the syntax shown in Table 151, or a dictionary with a D entry whose value is such an array.

This refers to page 366, "12.3.2.2 Explicit Destinations" where a table describes a page: "In each case, page is an indirect reference to a page object"

So is the result of CGPDFDocumentGetPage or CGPDFPageGetDictionary an "indirect reference to a page object"?

I found a [thread on lists.apple.com] that discusses this. [This comment] implies that you can take the address (in memory?) of the object returned by CGPDFPageGetDictionary for a given page and compare it to the pages in the "Outline" tree of the PDF metadata.

However, when I look at the addresses of the page objects in the Outline tree and compare them to the addresses printed for each page, they are never the same. The line used in that thread, TTDPRINT(@"%d => %p", k+1, dict);, prints "dict" as a pointer in memory. There's no reason to believe that an object returned there would be the same as one returned somewhere else; they'd be in different places in memory!

My last hope was to look at the source code of Apple's command line "outline" tool [mentioned in this book] (as [suggested by this thread]), but I can't find it anywhere.

Bottom line: does anyone have some insight into how PDF outlines work, or know of some open source code (preferably Objective-C) that reads PDF outlines?

ARGG: I had all kinds of links posted here, but apparently a new user can only post one link at a time

Answer

The page dictionary you get by calling CGPDFPageGetDictionary on the result of CGPDFDocumentGetPage is the same object as the indirect page reference you get when resolving a destination in an outline item. Both are CGPDFDictionaryRef values, so you can compare them using ==. When you have a CGPDFDictionaryRef that you want to know the page number of, you can do something like this:

CGPDFDocumentRef doc = ...;
CGPDFDictionaryRef outlinePageRef = ...;
size_t pageCount = CGPDFDocumentGetNumberOfPages(doc);
// Page numbers in Core Graphics are 1-based.
for (size_t p = 1; p <= pageCount; p++) {
  CGPDFPageRef page = CGPDFDocumentGetPage(doc, p);
  // Compare the page's dictionary, not the CGPDFPageRef itself.
  if (CGPDFPageGetDictionary(page) == outlinePageRef) {
    printf("found the page number: %zu\n", p);
    break;
  }
}

An explicit destination, however, is not a page but an array whose first element is the page. The other elements give the fit type and the position on the page (scroll offsets, zoom, etc.).
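
To make these two points concrete, here is a minimal, portable C sketch. It is a model, not Core Graphics code: PageDict, ExplicitDest and page_number_for_dest are hypothetical names invented for illustration. It assumes (as the pointer comparison relies on) that the parser hands back one shared in-memory object per page dictionary, whether that dictionary is reached through the page list or through an outline destination.

```c
#include <stddef.h>

/* Hypothetical stand-in for a parsed page dictionary (NOT a CGPDF type). */
typedef struct {
    int object_number; /* PDF object number, unrelated to the page number */
} PageDict;

/* Hypothetical stand-in for an explicit destination such as
   [page /XYZ left top zoom]: element 0 is the page, the rest say how to show it. */
typedef struct {
    const PageDict *page;   /* element 0: resolved indirect reference to the page */
    const char *fit;        /* element 1: fit type name, e.g. "XYZ" or "Fit" */
    double left, top, zoom; /* remaining elements: position and zoom on the page */
} ExplicitDest;

/* Returns the 1-based page number whose dictionary is the very same object
   as dest->page (pointer identity), or 0 if no page matches. */
size_t page_number_for_dest(const PageDict *const *pages, size_t count,
                            const ExplicitDest *dest) {
    for (size_t i = 0; i < count; i++) {
        if (pages[i] == dest->page) {
            return i + 1;
        }
    }
    return 0;
}
```

In this model, a document whose pages are the objects numbered 7, 3 and 12 (in that order) and a destination whose first element points at object 12 resolves to page 3. A different PageDict that merely carries the same object number does not match, which is why the comparison only works when both sides come from the same parsed document.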
