PdfCopyForms in ITextSharp causing a Stack Overflow error

Problem description

In this method I am trying to grab the input fields from one PDF document, paste them onto another document, and print out the result as a pdf file. The result would be a new PDF file which has the input fields of the first PDF and the static content of the second PDF.

I wrote some code that I thought would perform this task, but I run into a StackOverflow error each time copier.Close() is executed. This is the error that it throws:

An unhandled exception of type 'System.StackOverflowException' occurred in mscorlib.dll

Here is the code:

public static void AddFormFieldsFromSource(string sourcePath, string secondSourcePath, string targetPath) {
  lock (syncLock) {
    PdfReader.unethicalreading = true;

    PdfReader readerMain = new PdfReader(sourcePath);
    FileStream stream = new FileStream(targetPath, FileMode.Create, FileAccess.Write);
    PdfCopyForms copier = new PdfCopyForms(stream);
    PdfReader secondSourceReader = new PdfReader(secondSourcePath);

    copier.AddDocument(secondSourceReader);
    copier.CopyDocumentFields(readerMain);

    copier.Close();
    secondSourceReader.Close();
  }
}

sourcePath is where I get my input fields from, and secondSourcePath is where I get my static content from.

The PDF I used for the SourcePath variable is located here: https://www.dropbox.com/s/qcc6ug8oohqvmca/primarytwopages2.pdf

The PDF I use for the secondSourcePath variable is located here: https://www.dropbox.com/s/kx2rlhmizh46hl7/secondarytwopages.pdf

Also, on another note, I am using ITextSharp version 5.5.0.

Any idea why it is throwing the StackOverflow error? I don't make any recursive calls in my code. My first guess is that I am trying to do this task incorrectly. The other possibility is that perhaps ITextSharp has a bug.

UPDATE: I downloaded the source code to the LATEST REVISION of ITextSharp (5.5.1), built a dll so I could debug, and then referenced that dll in my code. The stack overflow error appears to occur in the class PdfIndirectReference in this method:

public class PdfIndirectReference : PdfObject {
    ....
    internal PdfIndirectReference(int type, int number, int generation) : base(0, new StringBuilder().Append(number).Append(' ').Append(generation).Append(" R").ToString()) {
        this.number = number;
        this.generation = generation;
    }

In the call stack of the dll code, I found that it recursively calls a method over and over again in itextsharp.text.pdf.PdfCopyFieldsImp.Propagate().

This must be why the stack overflow is occurring.

So, it doesn't occur in my code, but rather the dll. Any idea how to get around this?

Recommended answer

I reproduced the issue using iText and Java; the same issue occurs here, so quite likely the cause is the same.

PdfCopyForms internally uses PdfCopyFormsImp, which is derived from PdfCopyFieldsImp. The latter class provides the base methods that do the heavy lifting of field and form copying, among them propagate, which the OP found many times in the call stack when the stack overflow occurred.

Contrary to the impression left by the observed stack overflow, PdfCopyFieldsImp does have a mechanism to prevent endless loops by marking objects already visited:

/**
 * Sets a reference to "visited" in the copy process.
 * @param   ref the reference that needs to be set to "visited"
 * @return  true if the reference was set to visited
 */
protected boolean setVisited(PRIndirectReference ref) {
    IntHashtable refs = visited.get(ref.getReader());
    if (refs != null)
        return refs.put(ref.getNumber(), 1) != 0;
    else
        return false;
}

This method at the same time marks an object reference from some PdfReader as visited and returns whether or not it has been visited before.

At least it does so for references from PdfReader instances that have an entry in the visited mapping. References from PdfReader instances without such an entry are always reported as not yet visited (return false). Thus, references from those latter readers are never recognized as visited, no matter how often they are visited.
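To make the two cases concrete, here is a minimal stand-alone sketch of that logic. It is not iText code: the class name VisitedSketch, the String reader keys and the plain HashMap standing in for IntHashtable are assumptions made purely for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified model of the "visited" bookkeeping; NOT the real iText classes.
class VisitedSketch {
    // reader -> (object number -> marker); stand-in for iText's reader -> IntHashtable map
    static final Map<String, Map<Integer, Integer>> visited = new HashMap<>();

    // Same decision logic as setVisited above: mark the object and
    // report whether it had already been marked.
    static boolean setVisited(String reader, int objectNumber) {
        Map<Integer, Integer> refs = visited.get(reader);
        if (refs != null) {
            Integer previous = refs.put(objectNumber, 1);
            return previous != null;
        }
        return false; // reader has no entry: nothing is recorded at all
    }

    public static void main(String[] args) {
        visited.put("registered", new HashMap<>());
        System.out.println(setVisited("registered", 7));   // false (first visit)
        System.out.println(setVisited("registered", 7));   // true  (second visit)
        System.out.println(setVisited("unregistered", 7)); // false
        System.out.println(setVisited("unregistered", 7)); // false again
    }
}
```

For the reader without an entry the second call still answers false, which is exactly the gap described above.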

PdfReader instances get an entry in the visited mapping in only one code location: only readers added to the copy using addDocument get one.

When using PdfCopyForms to add the form fields from one document to some other PDF, one does not use addDocument for the reader with the form to copy, but copyDocumentFields instead. Thus, loop prevention does not work here.
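The effect on a recursive traversal can be sketched as follows. This is a toy model, not the real propagate method: the two-object cycle, the class name PropagateSketch and the maxDepth cap (added only so the demo terminates instead of overflowing the stack) are all hypothetical.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy model of a recursive walk over an object graph that contains a cycle.
class PropagateSketch {
    static final Map<String, Set<Integer>> visited = new HashMap<>();
    static final Map<Integer, int[]> graph = new HashMap<>();
    static {
        // Hypothetical cycle: object 1 refers to object 2, which refers back to 1.
        graph.put(1, new int[] {2});
        graph.put(2, new int[] {1});
    }

    // Returns true only if the reader is registered AND the object was seen before.
    static boolean setVisited(String reader, int objectNumber) {
        Set<Integer> refs = visited.get(reader);
        return refs != null && !refs.add(objectNumber);
    }

    // Walks the graph recursively and returns the deepest recursion level reached.
    static int propagate(String reader, int objectNumber, int depth, int maxDepth) {
        if (depth >= maxDepth) return depth;               // demo-only safety cap
        if (setVisited(reader, objectNumber)) return depth; // loop prevention
        int deepest = depth;
        for (int child : graph.get(objectNumber)) {
            deepest = Math.max(deepest, propagate(reader, child, depth + 1, maxDepth));
        }
        return deepest;
    }

    public static void main(String[] args) {
        visited.put("added", new HashSet<>());
        System.out.println(propagate("added", 1, 0, 1000));     // 2: the cycle is detected
        System.out.println(propagate("not-added", 1, 0, 1000)); // 1000: only the cap stops it
    }
}
```

Without the cap, the second call would recurse until the stack is exhausted, which matches the symptom the OP observed.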

By adding an entry in the visited mapping for the reader from which the form is copied, one can prevent the Stack Overflow. I did it in PdfCopyFormsImp.copyDocumentFields:

public void copyDocumentFields(PdfReader reader) throws DocumentException {
    if (!reader.isOpenedWithFullPermissions())
        throw new IllegalArgumentException(MessageLocalization.getComposedMessage("pdfreader.not.opened.with.owner.password"));
    if (readers2intrefs.containsKey(reader)) {
        reader = new PdfReader(reader);
    }
    else {
        if (reader.isTampered())
            throw new DocumentException(MessageLocalization.getComposedMessage("the.document.was.reused"));
        reader.consolidateNamedDestinations();
        reader.setTampered(true);
    }
    reader.shuffleSubsetNames();
    readers2intrefs.put(reader, new IntHashtable());

    visited.put(reader, new IntHashtable()); //<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<

    fields.add(reader.getAcroFields());
    updateCalculationOrder(reader);
}

In iTextSharp the analogous change would be in PdfCopyFormsImp.CopyDocumentFields:

    virtual public void CopyDocumentFields(PdfReader reader) {
        if (!reader.IsOpenedWithFullPermissions)
            throw new BadPasswordException(MessageLocalization.GetComposedMessage("pdfreader.not.opened.with.owner.password"));
        if (readers2intrefs.ContainsKey(reader)) {
            reader = new PdfReader(reader);
        }
        else {
            if (reader.Tampered)
                throw new DocumentException(MessageLocalization.GetComposedMessage("the.document.was.reused"));
            reader.ConsolidateNamedDestinations();
            reader.Tampered = true;
        }
        reader.ShuffleSubsetNames();
        readers2intrefs[reader] = new IntHashtable();

        visited[reader] =  new IntHashtable();  //<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<

        fields.Add(reader.AcroFields);
        UpdateCalculationOrder(reader);
    }

Disclaimer: I have not checked whether PdfCopyForms works exactly as required after this change. I merely tested it in Java and only observed that no Stack Overflow occurs anymore and that the resulting PDF in the OP's use case looks ok.
