iText 2.1.7 PdfCopy.addPage(page) can't find page reference?


Problem description

I'm maintaining a web application that uses iText 2.1.7 to create PDFs. I want to take the content of an existing PDF and put it into the PDF document that the code is in the middle of creating. I have the following (EDIT: more complete code):

package itexttest;

import com.lowagie.text.Document;
import com.lowagie.text.PageSize;
import com.lowagie.text.Paragraph;
import com.lowagie.text.pdf.PdfCopy;
import com.lowagie.text.pdf.PdfImportedPage;
import com.lowagie.text.pdf.PdfReader;
import com.lowagie.text.pdf.PdfWriter;
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;

public class ITextTest 
{
    public static void main(String[] args) 
    {
        try
        {
            ByteArrayOutputStream os = new ByteArrayOutputStream();
            Document bigDoc = new Document(PageSize.LETTER, 50, 50, 110, 60);
            PdfWriter writer = PdfWriter.getInstance(bigDoc, os);
            bigDoc.open();

            Paragraph par = new Paragraph("one");
            bigDoc.add(par);
            bigDoc.add(new Paragraph("three"));

            addPdfPage(bigDoc, os, "c:/insertable.pdf");

            bigDoc.close();
        }
        catch (Exception e)
        {
            e.printStackTrace();
        }
    }

    private static void addPdfPage(Document document, OutputStream outputStream, String location) {
        try {

            PdfReader pdfReader = new PdfReader(location);
            int pages = pdfReader.getNumberOfPages();

            PdfCopy pdfCopy = new PdfCopy(document, outputStream);
            PdfImportedPage page = pdfCopy.getImportedPage(pdfReader, 1);
            pdfCopy.addPage(page);
        }
        catch (Exception e) {
            System.out.println("Cannot add PDF from PSC: <" + location + ">: " + e.getMessage());
            e.printStackTrace();
        }
    }

}

This throws an error, null from PdfWriter.getPageReference().

How am I using this incorrectly? How can I get a page from the existing document and put it in the current document? Notice that I am not in a place where it is at all convenient to write to files as temp storage or whatever.

Solution

I'm not actively working with the old iText versions anymore, but some things have not changed since then. So here are some issues in your code and pointers to help resolve them:

The main issues in your current code are that you

  • reuse the Document instance (which you already use for your PdfWriter and have already opened) for a PdfCopy. While a Document can support multiple listeners, they all need to be registered before calling open; the use case for this construct is creating the same document in parallel in two different formats.

  • use the same output stream for both your PdfWriter and your PdfCopy. The result is not one valid PDF but byte ranges from two different PDFs wildly mixed together, which definitely won't be a valid PDF.
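
To make the second point concrete, here is a minimal sketch that uses only the Java standard library. `SharedStreamDemo` and `FakeProducer` are hypothetical stand-ins, not iText classes: each producer just appends labelled chunks to whatever stream it is given, the way two independent PDF producers would each append their own header, pages, and trailer.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

public class SharedStreamDemo {

    /** Hypothetical stand-in for a PDF producer; it only appends labelled chunks. */
    static final class FakeProducer {
        private final ByteArrayOutputStream out;
        private final String name;

        FakeProducer(ByteArrayOutputStream out, String name) {
            this.out = out;
            this.name = name;
        }

        void write(String part) {
            byte[] bytes = ("[" + name + ":" + part + "]").getBytes(StandardCharsets.US_ASCII);
            out.write(bytes, 0, bytes.length);
        }
    }

    /** Lets two producers write to ONE stream and returns the resulting bytes as text. */
    public static String interleave() {
        ByteArrayOutputStream shared = new ByteArrayOutputStream();
        FakeProducer writer = new FakeProducer(shared, "writer");
        FakeProducer copy = new FakeProducer(shared, "copy");

        writer.write("header");
        writer.write("page1");
        // The second producer starts its own file in the middle of the first one.
        copy.write("header");
        copy.write("page1");
        copy.write("trailer");
        writer.write("trailer");

        return new String(shared.toByteArray(), StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        System.out.println(interleave());
    }
}
```

The second producer's complete output ends up embedded inside the first producer's output, so neither file is intact. Giving each producer its own stream avoids this.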

Using PdfCopy correctly

You can restructure your code by first creating a new PDF containing your new paragraphs in a ByteArrayOutputStream (closing the Document involved) and then copying this PDF and the other pages you want to add into a new PDF.

E.g. like this:

// Step 1: create the generated content as a complete PDF in memory
ByteArrayOutputStream os = new ByteArrayOutputStream();
Document bigDoc = new Document(PageSize.LETTER, 50, 50, 110, 60);
PdfWriter writer = PdfWriter.getInstance(bigDoc, os);
bigDoc.open();
Paragraph par = new Paragraph("one");
bigDoc.add(par);
bigDoc.add(new Paragraph("three"));
bigDoc.close(); // finishes the intermediate PDF in os

// Step 2: copy its pages and the extra page into a second, separate document
ByteArrayOutputStream os2 = new ByteArrayOutputStream();
Document finalDoc = new Document();
PdfCopy copy = new PdfCopy(finalDoc, os2); // write to os2 so the result stays in memory
finalDoc.open();
PdfReader reader = new PdfReader(os.toByteArray());
for (int i = 1; i <= reader.getNumberOfPages(); i++) { // page numbers are 1-based
    copy.addPage(copy.getImportedPage(reader, i));
}
PdfReader pdfReader = new PdfReader("c:/insertable.pdf");
copy.addPage(copy.getImportedPage(pdfReader, 1));
finalDoc.close();
reader.close();
pdfReader.close();

// result PDF
byte[] result = os2.toByteArray();

Using only PdfWriter

You can alternatively change your code by directly importing the page into your PdfWriter, e.g. like this:

ByteArrayOutputStream os = new ByteArrayOutputStream();
Document bigDoc = new Document(PageSize.LETTER, 50, 50, 110, 60);
PdfWriter writer = PdfWriter.getInstance(bigDoc, os);
bigDoc.open();
Paragraph par = new Paragraph("one");
bigDoc.add(par);
bigDoc.add(new Paragraph("three"));

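// Import page 1 of the existing PDF as a template
// (PdfContentByte needs import com.lowagie.text.pdf.PdfContentByte)
PdfReader pdfReader = new PdfReader("c:/insertable.pdf");
PdfImportedPage page = writer.getImportedPage(pdfReader, 1);
bigDoc.newPage(); // start a fresh page for the imported content
PdfContentByte canvas = writer.getDirectContent();
canvas.addTemplate(page, 1, 0, 0, 1, 0, 0); // identity matrix: no scaling, rotation, or offset

bigDoc.close();
pdfReader.close();

// result PDF
byte[] result = os.toByteArray();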

This approach appears better because no intermediary PDF is required. Unfortunately, this appearance is deceiving: the approach has some disadvantages.

Here the original page is not copied as a whole and added as is to the document. Instead, only its content stream is used as the content of a template, which is then referenced from the actual new document page. In particular, this means:

  • If the imported page has different dimensions than your new target document, some parts of it might be cut off while some parts of the new page remain empty. Because of this, you will often find variants of the code above that scale and rotate the imported page to make it fit the target page.

  • The original page contents are now in a template which is referenced from the new page. If you import this new page into yet another document using the same mechanism, you get a page which references a template which in turn merely references a template which has the original contents. If you import this page into another document, you get another level of indirection, and so on.

    Unfortunately, conforming PDF viewers only need to support this indirection to a limited degree. If you continue this process, your page contents may suddenly no longer be visible. If the original page already brings along its own hierarchy of referenced templates, this may happen sooner rather than later.

  • As only the contents are copied, properties of the original page not in the content stream will be lost. This in particular concerns annotations like form fields or certain types of highlight markings or even certain types of free text.
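
The scaling variants mentioned in the first point usually boil down to computing the six matrix values passed to addTemplate. Below is a minimal sketch of that arithmetic in plain Java. `FitTransform` and `fit` are hypothetical names, and the sketch assumes both page boxes start at the origin (0, 0) and that the source page has no rotation entry; pages that violate either assumption need extra handling.

```java
import java.util.Arrays;

public class FitTransform {

    /**
     * Computes {a, b, c, d, e, f} for a transformation that scales a source page
     * uniformly so it fits inside the target page, and centers it there.
     * Assumption: both pages have their lower-left corner at (0, 0) and the
     * source page is not rotated.
     */
    public static float[] fit(float srcWidth, float srcHeight,
                              float dstWidth, float dstHeight) {
        if (srcWidth <= 0 || srcHeight <= 0) {
            throw new IllegalArgumentException("source page must have a positive size");
        }
        // Use the smaller ratio so that neither dimension overflows the target.
        float scale = Math.min(dstWidth / srcWidth, dstHeight / srcHeight);
        // Split the leftover space evenly to center the scaled page.
        float offsetX = (dstWidth - srcWidth * scale) / 2f;
        float offsetY = (dstHeight - srcHeight * scale) / 2f;
        return new float[] { scale, 0f, 0f, scale, offsetX, offsetY };
    }

    public static void main(String[] args) {
        // A4 source (595 x 842 points) placed on a LETTER target (612 x 792 points).
        float[] m = fit(595f, 842f, 612f, 792f);
        System.out.println(Arrays.toString(m));
    }
}
```

In the PdfWriter code above, the result would be used as canvas.addTemplate(page, m[0], m[1], m[2], m[3], m[4], m[5]), with the source size taken from pdfReader.getPageSize(1) and the target size from bigDoc.getPageSize(). Note that this also scales up a source page that is smaller than the target; clamp the scale to 1 if that is not wanted.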

(By the way, these templates in generic PDF specification lingo are called Form XObjects.)

This answer explicitly deals with the use of PdfCopy and PdfWriter in the context of merging PDFs.
