Reading PDF Bookmarks in VB.NET using iTextSharp

Question

I am making a tool that scans PDF files and searches for text in PDF bookmarks and body text. I am using Visual Studio 2008 with VB.NET with iTextSharp.

How do I load the list of bookmarks from an existing PDF file?

Recommended answer

It depends on what you mean by "bookmarks".

You want the outlines (the entries that are visible in the bookmarks panel):

The CreateOutlineTree example shows how to use the SimpleBookmark class to create an XML file containing the complete outline tree (in PDF jargon, bookmarks are called outlines).

Java:

PdfReader reader = new PdfReader(src);
List<HashMap<String, Object>> list = SimpleBookmark.getBookmark(reader);
SimpleBookmark.exportToXML(list,
        new FileOutputStream(dest), "ISO8859-1", true);
reader.close();

C#:

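PdfReader reader = new PdfReader(pdfIn);
var list = SimpleBookmark.GetBookmark(reader);
using (MemoryStream ms = new MemoryStream()) {
    // Write the outline tree as XML; the last argument restricts the output to ASCII.
    SimpleBookmark.ExportToXML(list, ms, "ISO8859-1", true);
    // Rewind the stream so the XML can be read back as a string.
    ms.Position = 0;
    using (StreamReader sr = new StreamReader(ms)) {
        return sr.ReadToEnd();
    }
}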

The list object can also be used to examine the different bookmark elements one by one programmatically (this is all explained in the official documentation).
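Since the goal in the question is to search bookmark text, walking the list could look roughly like the sketch below. It assumes the shape returned by SimpleBookmark.getBookmark in iText 5: each entry is a map with a "Title" key, and nested entries sit under a "Kids" key as another list of the same kind. The class name BookmarkSearch and the helpers findTitles and entry are hypothetical, and the sample data is built by hand so that the code runs without iText.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;

// Hypothetical helper class, not part of iText.
final class BookmarkSearch {

    /** Returns the titles of all outline entries, at any depth, that contain needle (case-insensitive). */
    static List<String> findTitles(List<HashMap<String, Object>> bookmarks, String needle) {
        List<String> hits = new ArrayList<>();
        collect(bookmarks, needle.toLowerCase(Locale.ROOT), hits);
        return hits;
    }

    @SuppressWarnings("unchecked")
    private static void collect(List<HashMap<String, Object>> bookmarks,
                                String needle, List<String> hits) {
        if (bookmarks == null) {
            return; // no outlines at this level (assumption: a missing tree or branch is null)
        }
        for (HashMap<String, Object> entry : bookmarks) {
            Object title = entry.get("Title");
            if (title != null && title.toString().toLowerCase(Locale.ROOT).contains(needle)) {
                hits.add(title.toString());
            }
            // Assumption: children are stored under "Kids" in the same list-of-maps shape.
            collect((List<HashMap<String, Object>>) entry.get("Kids"), needle, hits);
        }
    }

    /** Builds one outline entry by hand; stands in for data that would come from the PDF. */
    static HashMap<String, Object> entry(String title, List<HashMap<String, Object>> kids) {
        HashMap<String, Object> map = new HashMap<>();
        map.put("Title", title);
        if (kids != null) {
            map.put("Kids", kids);
        }
        return map;
    }

    public static void main(String[] args) {
        List<HashMap<String, Object>> kids = new ArrayList<>();
        kids.add(entry("1.1 Reading outlines", null));
        List<HashMap<String, Object>> top = new ArrayList<>();
        top.add(entry("Chapter 1", kids));
        top.add(entry("Outline basics", null));
        // Prints the matching titles in document order.
        System.out.println(findTitles(top, "outline"));
    }
}
```

With a real document, the hand-built list would be replaced by the result of SimpleBookmark.getBookmark(reader). The same traversal carries over to the IList of dictionaries that iTextSharp returns in C# or VB.NET.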

You want the named destinations (specific places in the document that you can link to by name):

If you meant named destinations instead, you need the SimpleNamedDestination class, as shown in the LinkActions example:

Java:

PdfReader reader = new PdfReader(src);
HashMap<String,String> map = SimpleNamedDestination.getNamedDestination(reader, false);
SimpleNamedDestination.exportToXML(map, new FileOutputStream(dest),
        "ISO8859-1", true);
reader.close();

C#:

PdfReader reader = new PdfReader(src);
Dictionary<string,string> map = SimpleNamedDestination
    .GetNamedDestination(reader, false);
using (MemoryStream ms = new MemoryStream()) {
    SimpleNamedDestination.ExportToXML(map, ms, "ISO8859-1", true);
    ms.Position = 0;
    using (StreamReader sr = new StreamReader(ms)) {
        return sr.ReadToEnd();
    }
}

The map object can also be used to examine the different named destinations one by one programmatically. Note the Boolean parameter that is used when retrieving the named destinations. Named destinations can be stored using a PDF name object as name, or using a PDF string object. The Boolean parameter indicates whether you want the former (true = stored as PDF name objects) or the latter (false = stored as PDF string objects) type of named destinations.
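Examining the map programmatically might look like the following sketch. It assumes that each value is a destination string that starts with the page number, followed by the destination type and its parameters (for example "1 XYZ 36 806 0"); check this against the XML exported from your own file before relying on it. The class NamedDestinationLookup and its methods are hypothetical, and the map in main is filled by hand so that the code runs without iText.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper class, not part of iText.
final class NamedDestinationLookup {

    /** Returns the leading page number of a destination string, or -1 if there is none. */
    static int pageOf(String destination) {
        if (destination == null) {
            return -1;
        }
        String trimmed = destination.trim();
        int space = trimmed.indexOf(' ');
        String first = space < 0 ? trimmed : trimmed.substring(0, space);
        try {
            return Integer.parseInt(first);
        } catch (NumberFormatException e) {
            return -1; // the value does not start with a page number
        }
    }

    /** Maps every destination name containing needle (case-insensitive) to its page, sorted by name. */
    static Map<String, Integer> findByName(Map<String, String> destinations, String needle) {
        Map<String, Integer> hits = new TreeMap<>();
        String lowered = needle.toLowerCase(Locale.ROOT);
        for (Map.Entry<String, String> e : destinations.entrySet()) {
            if (e.getKey().toLowerCase(Locale.ROOT).contains(lowered)) {
                hits.put(e.getKey(), pageOf(e.getValue()));
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Hand-built stand-in for the map returned by SimpleNamedDestination.getNamedDestination.
        Map<String, String> map = new HashMap<>();
        map.put("chapter1", "1 XYZ 36 806 0");
        map.put("chapter2", "5 Fit");
        map.put("appendix", "9 FitH 700");
        System.out.println(findByName(map, "chapter"));
    }
}
```

If a document might use both kinds of named destinations, one option is to call getNamedDestination once with true and once with false and run the lookup over each result.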

Named destinations are predefined targets in a PDF file that can be found through their name. Although the official name is named destinations, some people refer to them as bookmarks too (but when we say bookmarks in the context of PDF, we usually want to refer to outlines).
