How can I combine multiple PDF files excluding page breaks using iTextSharp?


Question

I wonder if anyone has done this with iTextSharp: I would like to combine multiple PDF files into one but leave the page breaks out. For example, I would like to create 4 PDF files containing 3 lines of text each, so I want the resulting file to have all 12 lines in 1 page. Is this possible?

Recommended answer

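As the OP has also tagged this question with [iText], and as I am more at home with Java than with .Net, here is an answer for iText / Java. It should be easy to translate to iTextSharp / C#.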


I would like to combine multiple PDF files into one but leave the page breaks out. For example, I would like to create 4 PDF files containing 3 lines of text each, so I want the resulting file to have all 12 lines in 1 page.

For PDF files as indicated in that example you can use this simple utility class:

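import java.io.IOException;
import java.io.OutputStream;

import com.itextpdf.text.Document;
import com.itextpdf.text.DocumentException;
import com.itextpdf.text.Rectangle;
import com.itextpdf.text.pdf.PdfImportedPage;
import com.itextpdf.text.pdf.PdfReader;
import com.itextpdf.text.pdf.PdfWriter;
import com.itextpdf.text.pdf.parser.PdfReaderContentParser;
import com.itextpdf.text.pdf.parser.TextMarginFinder;

public class PdfDenseMergeTool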
{
    public PdfDenseMergeTool(Rectangle size, float top, float bottom, float gap)
    {
        this.pageSize = size;
        this.topMargin = top;
        this.bottomMargin = bottom;
        this.gap = gap;
    }

    public void merge(OutputStream outputStream, Iterable<PdfReader> inputs) throws DocumentException, IOException
    {
        try
        {
            openDocument(outputStream);
            for (PdfReader reader: inputs)
            {
                merge(reader);
            }
        }
        finally
        {
            closeDocument();
        }

    }

    void openDocument(OutputStream outputStream) throws DocumentException
    {
        final Document document = new Document(pageSize, 36, 36, topMargin, bottomMargin);
        final PdfWriter writer = PdfWriter.getInstance(document, outputStream);
        document.open();
        this.document = document;
        this.writer = writer;
        newPage();
    }

    void closeDocument()
    {
        try
        {
            // document is still null if openDocument failed before assigning it
            if (document != null)
            {
                document.close();
            }
        }
        finally
        {
            this.document = null;
            this.writer = null;
            this.yPosition = 0;
        }
    }

    void newPage()
    {
        document.newPage();
        yPosition = pageSize.getTop(topMargin);
    }

    void merge(PdfReader reader) throws IOException
    {
        PdfReaderContentParser parser = new PdfReaderContentParser(reader);
        for (int page = 1; page <= reader.getNumberOfPages(); page++)
        {
            merge(reader, parser, page);
        }
    }

    void merge(PdfReader reader, PdfReaderContentParser parser, int page) throws IOException
    {
        TextMarginFinder finder = parser.processContent(page, new TextMarginFinder());
        Rectangle pageSizeToImport = reader.getPageSize(page);
        float heightToImport = finder.getHeight();
        float maxHeight = pageSize.getHeight() - topMargin - bottomMargin;
        if (heightToImport > maxHeight)
        {
            throw new IllegalArgumentException(String.format("Page %s content too large; height: %s, limit: %s.", page, heightToImport, maxHeight));
        }

        if (heightToImport > yPosition - pageSize.getBottom(bottomMargin))
        {
            newPage();
        }
        else if (!writer.isPageEmpty())
        {
            heightToImport += gap;
        }
        yPosition -= heightToImport;

        PdfImportedPage importedPage = writer.getImportedPage(reader, page);
        writer.getDirectContent().addTemplate(importedPage, 0, yPosition - (finder.getLly() - pageSizeToImport.getBottom()));
    }

    Document document = null;
    PdfWriter writer = null;
    float yPosition = 0; 

    final Rectangle pageSize;
    final float topMargin;
    final float bottomMargin;
    final float gap;
}

If you have a list of PdfReader instances inputs, you can merge them like this into an OutputStream output:

PdfDenseMergeTool tool = new PdfDenseMergeTool(PageSize.A4, 18, 18, 5);
tool.merge(output, inputs);

This creates a merged document using an A4 page size, a top and bottom margin of 18/72" each and a gap between contents of different PDF pages of 5/72".
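To make the placement arithmetic concrete, here is a minimal sketch that models only the y-position bookkeeping of the tool, without iText. The class name DenseLayoutSketch, the Placement type, and the 40-unit block heights are hypothetical; the A4 height of 842 units and the margin and gap values are the ones from the call above.

```java
import java.util.ArrayList;
import java.util.List;

/** Arithmetic-only model of the placement logic in PdfDenseMergeTool (no iText needed). */
class DenseLayoutSketch
{
    /** Where one imported block ends up: output page (1-based) and y of its lower edge. */
    static final class Placement
    {
        final int page;
        final float y;

        Placement(int page, float y)
        {
            this.page = page;
            this.y = y;
        }
    }

    static List<Placement> place(float pageHeight, float top, float bottom, float gap, float[] heights)
    {
        List<Placement> placements = new ArrayList<>();
        float maxHeight = pageHeight - top - bottom;
        float yPosition = pageHeight - top;   // top of the usable area
        int page = 1;
        boolean pageEmpty = true;

        for (float height : heights)
        {
            if (height > maxHeight)
            {
                throw new IllegalArgumentException("Block too tall: " + height + " > " + maxHeight);
            }

            float needed = height;
            if (needed > yPosition - bottom)
            {
                // Not enough room left: start a new page, no gap at its top.
                page++;
                yPosition = pageHeight - top;
            }
            else if (!pageEmpty)
            {
                // Same page with content above: keep the gap.
                // As in the tool, the gap itself is not part of the fit check.
                needed += gap;
            }
            yPosition -= needed;
            pageEmpty = false;
            placements.add(new Placement(page, yPosition));
        }
        return placements;
    }

    public static void main(String[] args)
    {
        // Four hypothetical blocks, each 40 units tall, on A4 with 18-unit margins and a 5-unit gap.
        for (Placement p : place(842f, 18f, 18f, 5f, new float[] {40f, 40f, 40f, 40f}))
        {
            System.out.println("page " + p.page + ", lower edge at y = " + p.y);
        }
    }
}
```

With these numbers all four blocks land on page 1, with lower edges at y = 784, 739, 694 and 649; a block that no longer fits above the bottom margin moves to the top of the next page instead.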

The iText TextMarginFinder (used in the PdfDenseMergeTool above) only considers text. If other content types also are to be considered, this class has to be extended somewhat.


Each PDF has just a few lines, perhaps a table or an image, but I want the end result in one page.

If the tables contain decorations reaching above or below the text content (e.g. lines or colored backgrounds), you should use a larger gap value. Unfortunately the parsing framework used by the TextMarginFinder does not forward vector graphics commands to the finder.

If the images are bitmap images, the TextMarginFinder should be extended by implementing its renderImage method to take the image area into account, too.
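The geometry such an extension needs can be sketched without iText. A PDF image is painted into the unit square, which the current transformation matrix [a b c d e f] maps onto the page, so the image area is the bounding box of the four transformed corners. The class and method names below are hypothetical; in iText 5 the matrix would presumably come from ImageRenderInfo.getImageCTM() inside the overridden renderImage method, and that wiring is an assumption not shown here.

```java
/** Bounding box of a PDF image, computed from its transformation matrix (no iText needed). */
class ImageBoundsSketch
{
    /**
     * Maps the unit square through the PDF matrix [a b c d e f],
     * i.e. x' = a*x + c*y + e and y' = b*x + d*y + f,
     * and returns the bounding box as {llx, lly, urx, ury}.
     */
    static float[] imageBounds(float a, float b, float c, float d, float e, float f)
    {
        float llx = Float.POSITIVE_INFINITY;
        float lly = Float.POSITIVE_INFINITY;
        float urx = Float.NEGATIVE_INFINITY;
        float ury = Float.NEGATIVE_INFINITY;

        // Visit the four corners (0,0), (0,1), (1,0), (1,1) of the unit square.
        for (int x = 0; x <= 1; x++)
        {
            for (int y = 0; y <= 1; y++)
            {
                float px = a * x + c * y + e;
                float py = b * x + d * y + f;
                llx = Math.min(llx, px);
                lly = Math.min(lly, py);
                urx = Math.max(urx, px);
                ury = Math.max(ury, py);
            }
        }
        return new float[] {llx, lly, urx, ury};
    }

    public static void main(String[] args)
    {
        // Hypothetical image: 200 x 100 units, lower-left corner at (50, 300), not rotated.
        float[] box = imageBounds(200f, 0f, 0f, 100f, 50f, 300f);
        System.out.println(box[0] + ", " + box[1] + ", " + box[2] + ", " + box[3]);
    }
}
```

Taking all four corners rather than just two keeps the result correct for rotated or skewed images. The resulting box would then be united with the text rectangle the finder already tracks.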


Also, some of the PDFs may contain fields, so I'd like to keep those fields in the resulting combined PDF as well.

If AcroForm fields are also to be considered, you have to


  1. extend the rectangle represented by the TextMarginFinder to also include the visualization rectangles of the widget annotations, and
  2. extend the PdfDenseMergeTool.merge(PdfReader, PdfReaderContentParser, int) method to also copy those widget annotations.
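The first of these steps amounts to a rectangle union. As a rough sketch, assuming the widget rectangles have already been read from each annotation's /Rect entry as four numbers (two diagonally opposite corners, in either order), the content box could be grown as follows. The class and method names are hypothetical, and copying the annotations themselves (step 2) is not covered.

```java
/** Grows a content box so that it also covers widget annotation rectangles (no iText needed). */
class WidgetBoundsSketch
{
    /** Reorders a rectangle given by two opposite corners into {llx, lly, urx, ury}. */
    static float[] normalize(float[] rect)
    {
        return new float[] {
            Math.min(rect[0], rect[2]), Math.min(rect[1], rect[3]),
            Math.max(rect[0], rect[2]), Math.max(rect[1], rect[3])
        };
    }

    /** Returns the smallest box containing the content box and every widget rectangle. */
    static float[] include(float[] contentBox, float[][] widgetRects)
    {
        float[] box = normalize(contentBox);
        for (float[] rect : widgetRects)
        {
            float[] widget = normalize(rect);
            box[0] = Math.min(box[0], widget[0]);
            box[1] = Math.min(box[1], widget[1]);
            box[2] = Math.max(box[2], widget[2]);
            box[3] = Math.max(box[3], widget[3]);
        }
        return box;
    }

    public static void main(String[] args)
    {
        // Hypothetical text box and one form field that reaches below and to the right of it.
        float[] text = {72f, 700f, 300f, 750f};
        float[][] widgets = {{320f, 690f, 100f, 710f}};
        float[] box = include(text, widgets);
        System.out.println("height to import: " + (box[3] - box[1]));
    }
}
```

The height and lower edge of the grown box would replace finder.getHeight() and finder.getLly() in the merge method, so that fields sitting outside the text area are not cut off or overlapped.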



Update

I wrote above

Unfortunately the parsing framework used by the TextMarginFinder does not forward vector graphics commands to the finder.

Meanwhile (in version 5.5.6) that parsing framework has been extended to also forward vector graphics commands.

If you replace the line

TextMarginFinder finder = parser.processContent(page, new TextMarginFinder());

by

MarginFinder finder = parser.processContent(page, new MarginFinder());

using the MarginFinder class presented at the bottom of this answer, all content is considered, not merely text.
