Insert PieceInfo in merged document with ITextSharp
Problem description
I have a process that merges several PDFs into a single PDF. This is working great.
At the time of the merge, I want to add a PieceInfo at page level to track which documents were included in the merged file.
Let's say I have 3 documents in this order: Fester.pdf (2 pages), Gomez.pdf (2 pages) and Lurch.pdf (1 page). After the merge I will have 5 pages, and each page should have a PieceInfo holding the name of the file it originated from. This way, if I go to page 4, I will know the page came from Gomez.pdf.
During my search, I found this post: Insert hidden digest in pdf using iText library, and I tried to implement the same approach in my process. The suggestion works great, but I could not figure out how to store the information per page.
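To make the expected result concrete, here is a minimal plain-Java sketch (no iText involved) of the lookup the merged file should support. The class name PageOrigin and the method sourceOf are hypothetical names used only for illustration; the page counts are the ones from the example above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class PageOrigin {
    // Returns the source file of a 1-based page number in the merged document,
    // or null when the page number is out of range.
    // pageCounts must iterate in merge order (hence LinkedHashMap below).
    static String sourceOf(Map<String, Integer> pageCounts, int mergedPage) {
        if (mergedPage < 1) {
            return null;
        }
        int lastPage = 0; // last merged page number covered so far
        for (Map.Entry<String, Integer> entry : pageCounts.entrySet()) {
            lastPage += entry.getValue();
            if (mergedPage <= lastPage) {
                return entry.getKey();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Map<String, Integer> pageCounts = new LinkedHashMap<>();
        pageCounts.put("Fester.pdf", 2);
        pageCounts.put("Gomez.pdf", 2);
        pageCounts.put("Lurch.pdf", 1);
        for (int page = 1; page <= 5; page++) {
            System.out.println("Page " + page + " <- " + sourceOf(pageCounts, page));
        }
    }
}
```

With these counts, page 4 maps to Gomez.pdf. The goal of the question is to store that mapping inside the PDF itself, so it survives without keeping a separate table like this one.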
Here is my code:
public static byte[] MergeDocuments(DocumentCollection myCollection)
{
    PdfImportedPage importedPage = null;
    // Merged the document streams
    using (MemoryStream stream = new MemoryStream())
    {
        // Create the iTextSharp document
        iTextSharp.text.Document pdfDoc = new iTextSharp.text.Document();
        // Create the PDF writer that listened to the document
        PdfCopy pdfCopy = new PdfCopy(pdfDoc, stream);
        if (pdfDoc != null && pdfCopy != null)
        {
            // Open the document and load content
            pdfDoc.Open();
            //Dictionary Entries
            PdfName appName = new PdfName("MyKey");
            PdfName dataName = new PdfName("Hash");
            //Class to add and retrieve the PieceInfo data
            DocumentPieceInfo dpi = new DocumentPieceInfo();
            //Loop through my collection. The document class has the BinaryFile and FileName
            foreach (Document doc in myCollection)
            {
                PdfReader reader = new PdfReader(doc.FileBinary);
                if (reader != null)
                {
                    int nPage = reader.NumberOfPages;
                    for (int n = 0; n < nPage; n++)
                    {
                        //Trying to add the PieceInfo
                        dpi.addPieceInfo(pdfCopy, appName, dataName, new PdfString(string.Format("Info Doc: {0}", doc.FileName)));
                        importedPage = pdfCopy.GetImportedPage(reader, n + 1);
                        pdfCopy.AddPage(importedPage);
                    }
                    // Close the reader
                    reader.Close();
                }
            }
            if (pdfCopy != null)
                pdfCopy.Close();
            if (pdfDoc != null)
                pdfDoc.Close();
            byte[] arrOutput = stream.ToArray();
            return arrOutput;
        }
    }
    return null;
}
And a small change to mkl's solution, changing the input to a PdfCopy:
public void addPieceInfo(PdfCopy reader, PdfName app, PdfName name, PdfObject value)
{
    //PdfDictionary catalog = reader.getCatalog();
    PdfDictionary pieceInfo = reader.ExtraCatalog.GetAsDict(PIECE_INFO);
    if (pieceInfo == null)
    {
        pieceInfo = new PdfDictionary();
        reader.ExtraCatalog.Put(PIECE_INFO, pieceInfo);
    }
    PdfDictionary appData = pieceInfo.GetAsDict(app);
    if (appData == null)
    {
        appData = new PdfDictionary();
        pieceInfo.Put(app, appData);
    }
    PdfDictionary privateData = appData.GetAsDict(PRIVATE);
    if (privateData == null)
    {
        privateData = new PdfDictionary();
        appData.Put(PRIVATE, privateData);
    }
    appData.Put(LAST_MODIFIED, new PdfDate());
    privateData.Put(name, value);
}
The code above ends up storing the PieceInfo for the last page only :(
Does the page PdfImportedPage object have a way to get the catalog?
How can I include this information per page level during my merge process? After that, how can I get the pieceInfo from the pages? Just looping through the pages?
Please be aware that /PieceInfo will be deprecated in ISO-32000-2 (aka PDF 2.0). As an alternative, you can create your own key to add your own custom data. This is explained in my answer to the question itext how to check if giant string is present on the pdf page.
You are asking: "Does the page PdfImportedPage object have a way to get the catalog?"
This is not the right question to ask. If you study my answer well, you'll discover that you need access to the page dictionary, not the catalog. You can add a /PieceInfo entry (or your custom entry) to this page dictionary and then retrieve it later.
Take a look at the CustomPageDictKeyMerge example:
public void createPdf(String filename) throws IOException, DocumentException {
    PdfName marker = new PdfName("ITXT_PageMarker");
    List<PdfReader> readers = new ArrayList<PdfReader>();
    readers.add(new PdfReader(SRC1));
    readers.add(new PdfReader(SRC2));
    readers.add(new PdfReader(SRC3));
    Document document = new Document();
    PdfCopy copy = new PdfCopy(document, new FileOutputStream(filename));
    document.open();
    int counter = 0;
    int n;
    PdfImportedPage importedPage;
    PdfDictionary pageDict;
    for (PdfReader reader : readers) {
        counter++;
        n = reader.getNumberOfPages();
        for (int p = 1; p <= n; p++) {
            pageDict = reader.getPageN(p);
            pageDict.put(marker, new PdfString(String.format("Page %s of document %s", p, counter)));
            importedPage = copy.getImportedPage(reader, p);
            copy.addPage(importedPage);
        }
    }
    // close the document
    document.close();
    for (PdfReader reader : readers) {
        reader.close();
    }
}
In this example, we add a special marker to the page dictionary before we import the page. As a result, this marker is carried over into the merged document.
Take a look at the CustomPageDictKeyCreate example to find out how to retrieve these custom markers:
public void check(String filename) throws IOException {
    PdfReader reader = new PdfReader(filename);
    PdfDictionary pagedict;
    // Page numbers are 1-based, so the loop must include the last page (<=).
    for (int i = 1; i <= reader.getNumberOfPages(); i++) {
        pagedict = reader.getPageN(i);
        System.out.println(pagedict.get(new PdfName("ITXT_PageMarker")));
    }
    reader.close();
}
Please make sure that you use a second-class name for your custom key. iText has registered the prefix ITXT with ISO for its custom second-class keys. This prefix makes sure that different companies don't use the same key for different purposes. All keys starting with ITXT can easily be identified as keys created by iText Group. ISO keeps track of all these prefixes to avoid duplicates. Registration of a prefix with ISO is free of charge.
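As a small illustration of the naming convention, the following plain-Java sketch builds and recognizes prefixed key names that follow the same pattern as the ITXT_PageMarker key used above. The class CustomKeyNames and its methods are hypothetical helpers, not part of iText, and restricting both parts to ASCII letters and digits is a conservative assumption made for this sketch rather than a rule taken from the specification.

```java
class CustomKeyNames {
    // Builds "<prefix>_<name>", mirroring the "ITXT_PageMarker" pattern.
    // Assumption: both parts are limited to ASCII letters and digits so the
    // result needs no escaping when used as a PDF name.
    static String prefixedKey(String prefix, String name) {
        if (prefix == null || !prefix.matches("[A-Za-z0-9]+")) {
            throw new IllegalArgumentException("Invalid prefix: " + prefix);
        }
        if (name == null || !name.matches("[A-Za-z0-9]+")) {
            throw new IllegalArgumentException("Invalid name: " + name);
        }
        return prefix + "_" + name;
    }

    // Returns true when the key starts with "<prefix>_" and has a non-empty remainder.
    static boolean hasPrefix(String key, String prefix) {
        String lead = prefix + "_";
        return key != null && key.startsWith(lead) && key.length() > lead.length();
    }

    public static void main(String[] args) {
        String key = prefixedKey("ITXT", "PageMarker");
        System.out.println(key);
        System.out.println(hasPrefix(key, "ITXT"));
    }
}
```

The resulting string would then be passed to new PdfName(...) as in the examples above. Use your own registered prefix rather than ITXT for keys your application defines.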