How to associate a search catalog file (.pdx) with a PDF document
Question
Using a .NET application, I am trying to create a PDF "table of contents" that references other files, like one would distribute on a DVD etc.
For this purpose, I need a search index and catalog, so full-text search will work across documents. I have been able to automate the construction of the index by copying an "old" .pdx file (the directory structure is always the same) and then calling JavaScript from C#:
var js = $@"catalog.getIndex(""{pdxFilePath}"").build('alert(""Hello"")', true)";
formFields.ExecuteThisJavascript(js);
But how can I associate the .pdx file with my .pdf document, so it gets loaded automatically?
In Acrobat, this is set in the "advanced" document properties:
However, this is not accessible via the info or metadata properties of the document.
Apparently this is stored somewhere else, but I don't know enough about the PDF format to figure out how to access this data:
Any help would be highly appreciated. I could use either the Adobe SDK/JavaScript API or some other library (for instance, I know we already have an Aspose license).
Recommended answer
Answering my own question here... I was able to solve this using PdfSharp.
The following code is compatible with PdfSharp 1.50.4845-RC2a.
pdxFile should be the name of the .pdx file, including the file extension (e.g. "catalog.pdx"). I have only tested this with .pdx files located in the same folder as the PDF document, but I would assume that relative paths should work in general.
No guarantees that this is a perfect solution as I lack a deeper understanding of the PDF format, but this seems to work at least.
// Requires: using PdfSharp.Pdf;
private void SetSearchCatalog(PdfDocument doc, string pdxFile)
{
    // File specification that points at the .pdx index file.
    var indexDict = new PdfDictionary(doc);
    indexDict.Elements["/F"] = new PdfString(pdxFile, PdfStringEncoding.RawEncoding);
    indexDict.Elements["/Type"] = new PdfName("/Filespec");

    // One entry of the index list: the file specification plus the index format name.
    var indexArrayItemDict = new PdfDictionary(doc);
    indexArrayItemDict.Elements["/Index"] = indexDict;
    indexArrayItemDict.Elements["/Name"] = new PdfName("/PDX");

    // Attach the list to the document catalog under /Search.
    var indexArray = new PdfArray(doc, indexArrayItemDict);
    var searchDict = new PdfDictionary(doc);
    searchDict.Elements["/Indexes"] = indexArray;
    doc.Internals.Catalog.Elements["/Search"] = searchDict;
}
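For context, here is a minimal sketch of how the method above might be called and then checked. It is not part of the original answer and has not been tested against every PdfSharp build. The class name SearchCatalogDemo and the default file names are hypothetical, and the helper is repeated only so the sketch compiles on its own. It opens an existing PDF, attaches the index, saves the file, and then reopens it to read the /Search entry back.

```csharp
using System;
using PdfSharp.Pdf;
using PdfSharp.Pdf.IO;

// Hypothetical console wrapper; the names here are illustrative only.
internal static class SearchCatalogDemo
{
    // Same logic as SetSearchCatalog above, repeated so this sketch is self-contained.
    private static void SetSearchCatalog(PdfDocument doc, string pdxFile)
    {
        var fileSpec = new PdfDictionary(doc);
        fileSpec.Elements["/Type"] = new PdfName("/Filespec");
        fileSpec.Elements["/F"] = new PdfString(pdxFile, PdfStringEncoding.RawEncoding);

        var entry = new PdfDictionary(doc);
        entry.Elements["/Name"] = new PdfName("/PDX");
        entry.Elements["/Index"] = fileSpec;

        var search = new PdfDictionary(doc);
        search.Elements["/Indexes"] = new PdfArray(doc, entry);
        doc.Internals.Catalog.Elements["/Search"] = search;
    }

    private static void Main(string[] args)
    {
        // Assumed defaults; pass real paths on the command line.
        string pdfPath = args.Length > 0 ? args[0] : "contents.pdf";
        string pdxFile = args.Length > 1 ? args[1] : "catalog.pdx";

        // Open for modification, attach the index, and save in place.
        PdfDocument doc = PdfReader.Open(pdfPath, PdfDocumentOpenMode.Modify);
        SetSearchCatalog(doc, pdxFile);
        doc.Save(pdfPath);

        // Reopen the saved file and read the entry back as a sanity check.
        PdfDocument saved = PdfReader.Open(pdfPath, PdfDocumentOpenMode.Import);
        PdfDictionary searchDict = saved.Internals.Catalog.Elements.GetDictionary("/Search");
        PdfArray indexes = searchDict?.Elements.GetArray("/Indexes");
        if (indexes == null || indexes.Elements.Count == 0)
        {
            Console.WriteLine("No search index entry found.");
            return;
        }

        PdfDictionary first = indexes.Elements.GetDictionary(0);
        PdfDictionary spec = first?.Elements.GetDictionary("/Index");
        Console.WriteLine("Associated index: " + spec?.Elements.GetString("/F"));
    }
}
```

The read-back step only confirms that the dictionary structure survived saving. Whether Acrobat actually loads the index still has to be checked in Acrobat itself.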