Insert PieceInfo in merged document with ITextSharp


Question

I have a process that merges several PDFs into a single PDF. This is working great.

At the time of the merge, I want to add a PieceInfo at page level to track the documents that were included in that merged file.

Let's say I have 3 documents in this order: Fester.pdf (2 pages), Gomez.pdf (2 pages) and Lurch.pdf (1 page). After the merge I will have 5 pages, and each page would have a PieceInfo with the name of the file it originated from. This way, if I go to page 4, I will know the page came from Gomez.pdf.

During my search, I found this post: "Insert hidden digest in pdf using iText library", and I tried to implement the same in my process. The suggestion works great, but I could not figure out how to store the information per page.

Here is my code:

public static byte[] MergeDocuments(DocumentCollection myCollection)
{
    PdfImportedPage importedPage = null;

    // Merged the document streams
    using (MemoryStream stream = new MemoryStream())
    {
        // Create the iTextSharp document
        iTextSharp.text.Document pdfDoc = new iTextSharp.text.Document();

        // Create the PDF writer that listened to the document
        PdfCopy pdfCopy = new PdfCopy(pdfDoc, stream);
        if (pdfDoc != null && pdfCopy != null)
        {
            // Open the document and load content
            pdfDoc.Open();

            //Dictionary Entries
            PdfName appName = new PdfName("MyKey");
            PdfName dataName = new PdfName("Hash");

            //Class to add and retrieve the PieceInfo data
            DocumentPieceInfo dpi = new DocumentPieceInfo();

            //Loop through my collection. The document class has the BinaryFile and FileName
            foreach (Document doc in myCollection)
            {
                PdfReader reader = new PdfReader(doc.FileBinary);
                if (reader != null)
                {
                    int nPage = reader.NumberOfPages;
                    for (int n = 0; n < nPage; n++)
                    {
                        //Trying to add the PieceInfo
                        dpi.addPieceInfo(pdfCopy, appName, dataName, new PdfString(string.Format("Info Doc: {0}", doc.FileName)));
                        importedPage = pdfCopy.GetImportedPage(reader, n + 1);
                        pdfCopy.AddPage(importedPage);
                    }
                    // Close the reader
                    reader.Close();
                }
            }

            if (pdfCopy != null)
                pdfCopy.Close();

            if (pdfDoc != null)
                pdfDoc.Close();

            byte[] arrOutput = stream.ToArray();
            return arrOutput;

        }
    }
    return null;
}

And a small change to MKL's solution, changing the input to a PdfCopy:

public void addPieceInfo(PdfCopy reader, PdfName app, PdfName name, PdfObject value)
    {
        //PdfDictionary catalog = reader.getCatalog();
        PdfDictionary pieceInfo = reader.ExtraCatalog.GetAsDict(PIECE_INFO);
        if (pieceInfo == null)
        {
            pieceInfo = new PdfDictionary();
            reader.ExtraCatalog.Put(PIECE_INFO, pieceInfo);
        }

        PdfDictionary appData = pieceInfo.GetAsDict(app);
        if (appData == null)
        {
            appData = new PdfDictionary();
            pieceInfo.Put(app, appData);
        }

        PdfDictionary privateData = appData.GetAsDict(PRIVATE);
        if (privateData == null)
        {
            privateData = new PdfDictionary();
            appData.Put(PRIVATE, privateData);
        }

        appData.Put(LAST_MODIFIED, new PdfDate());
        privateData.Put(name, value);
    }

The code above adds the PieceInfo for the last page only :(

Does the page PdfImportedPage object have a way to get the catalog?

How can I include this information at page level during my merge process? After that, how can I get the PieceInfo from the pages? Just by looping through the pages?

Answer

Please be aware that /PieceInfo will be deprecated in ISO-32000-2 (aka PDF 2.0). As an alternative, you can create your own key to add your own custom data. This is explained in my answer to the question "itext how to check if giant string is present on the pdf page".

You are asking: "Does the page PdfImportedPage object have a way to get the catalog?"

This is not the right question to ask. If you study my answer well, you'll discover that you need access to the page dictionary. You can add a /PieceInfo entry (or your custom entry) to this page dictionary and then later retrieve it.

Take a look at the CustomPageDictKeyMerge example:

public void createPdf(String filename) throws IOException, DocumentException {
    PdfName marker = new PdfName("ITXT_PageMarker");
    List<PdfReader> readers = new ArrayList<PdfReader>();
    readers.add(new PdfReader(SRC1));
    readers.add(new PdfReader(SRC2));
    readers.add(new PdfReader(SRC3));
    Document document = new Document();
    PdfCopy copy = new PdfCopy(document, new FileOutputStream(filename));
    document.open();
    int counter = 0;
    int n;
    PdfImportedPage importedPage;
    PdfDictionary pageDict;
    for (PdfReader reader : readers) {
        counter++;
        n = reader.getNumberOfPages();
        for (int p = 1; p <= n; p++) {
            pageDict = reader.getPageN(p);
            pageDict.put(marker, new PdfString(String.format("Page %s of document %s", p, counter)));
            importedPage = copy.getImportedPage(reader, p);
            copy.addPage(importedPage);
        }
    }
    // close the document
    document.close();
    for (PdfReader reader : readers) {
        reader.close();
    }
}

In this example, we add a special marker to the page dictionary before we import the page. As a result, this marker will be added to the merged document.
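
To make the outcome concrete, here is a minimal sketch that reproduces only the looping logic of the merge, without iText, using the three files from the question. It is an illustration under assumptions, not part of the iText examples: the class `PageMarkerSketch`, the method `markersFor`, and the marker text (which uses the file name rather than the document counter) are all hypothetical. In a real merge, each string would be wrapped in a `PdfString` and passed to `pageDict.put(marker, ...)` as in the code above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PageMarkerSketch {

    // Hypothetical helper: builds the marker text for every page of the merged
    // file, in merge order. Index 0 of the result corresponds to merged page 1.
    // The map must iterate in merge order (for example a LinkedHashMap).
    static List<String> markersFor(Map<String, Integer> pageCounts) {
        List<String> markers = new ArrayList<>();
        for (Map.Entry<String, Integer> source : pageCounts.entrySet()) {
            // PDF page numbers are 1-based, so the loop runs from 1 to n inclusive.
            for (int p = 1; p <= source.getValue(); p++) {
                markers.add(String.format("Page %d of %s", p, source.getKey()));
            }
        }
        return markers;
    }

    public static void main(String[] args) {
        // File names and page counts taken from the question.
        Map<String, Integer> pageCounts = new LinkedHashMap<>();
        pageCounts.put("Fester.pdf", 2);
        pageCounts.put("Gomez.pdf", 2);
        pageCounts.put("Lurch.pdf", 1);

        List<String> markers = markersFor(pageCounts);
        for (int i = 0; i < markers.size(); i++) {
            System.out.println("Merged page " + (i + 1) + ": " + markers.get(i));
        }
    }
}
```

With these inputs, merged page 4 carries the marker "Page 2 of Gomez.pdf", which is the lookup the question asks for.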

Take a look at the CustomPageDictKeyCreate example to find out how to retrieve these custom markers:

public void check(String filename) throws IOException {
    PdfReader reader = new PdfReader(filename);
    PdfDictionary pagedict;
    // Page numbers are 1-based, so use <= to include the last page.
    for (int i = 1; i <= reader.getNumberOfPages(); i++) {
        pagedict = reader.getPageN(i);
        System.out.println(pagedict.get(new PdfName("ITXT_PageMarker")));
    }
    reader.close();
}

Please make sure that you use a second-class name for your custom key. iText has registered the prefix ITXT with ISO for its custom second-class keys. This prefix makes sure that different companies don't use the same key for different purposes. All keys starting with ITXT can easily be identified as keys created by iText Group. ISO keeps track of all these prefixes to avoid duplicates. Registration of a prefix with ISO is free of charge.
