Behavior of PdfPage.flush()

Question

What exactly does PdfPage.flush(true) do? Does SmartMode (or any other setting) affect its behavior? In many cases I want to leave a page editable for as long as possible, so I never worried that the PDF document was assembled in memory until document.close(). But when generating very large files (tens of thousands of pages), memory becomes constrained. I was naively hoping that PdfPage.flush(true) would write the content stream to disk and free up memory, but calling flush(true) only seems to write a couple of bytes to disk.

I guess the more general version of my question is "how do we efficiently merge lots of documents into a single, very large document? (iText 7)", but since I am not highly proficient with the PDF spec itself, I would also like to better understand what is actually going on.

Recommended answer

flush(), when called on layout objects, forces those objects and their children to draw (that is, write) their contents to the writer's output stream. The reason you only see a couple of bytes being written when calling flush() manually is that the default Document constructors already set iText to flush aggressively: they delegate to an overload with immediateFlush set to true.

/**
 * Creates a document from a {@link PdfDocument} with a manually set {@link
 * PageSize}.
 *
 * @param pdfDoc   the in-memory representation of the PDF document
 * @param pageSize the page size
 */
public Document(PdfDocument pdfDoc, PageSize pageSize) {
    this(pdfDoc, pageSize, true);
}

/**
 * Creates a document from a {@link PdfDocument} with a manually set {@link
 * PageSize}.
 *
 * @param pdfDoc         the in-memory representation of the PDF document
 * @param pageSize       the page size
 * @param immediateFlush if true, write pages and page-related instructions
 *                       to the {@link PdfDocument} as soon as possible.
 */
public Document(PdfDocument pdfDoc, PageSize pageSize, boolean immediateFlush)

As for advice on the general question: there isn't really an iText function or configuration that makes the entire process magically faster and more efficient, but there are some tricks you can apply outside of iText:

1) Allocate more resources. This is the obvious option, and often not a feasible one.

2) Do multi-stage batch processing: merge 10 files into 1 in step X, then continue merging those intermediate files in step X+1. In general, the 1 big file will be smaller than the 10 separate files combined, because resources such as fonts and images can be reused.

3) Run the merging process at times when the resources it takes up aren't needed anywhere else, e.g. at night or over lunch.
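
The multi-stage idea in trick 2 can be sketched without any PDF library. The following is a minimal sketch, not iText code: StagedMerge, mergeInStages and the mergeBatch callback are hypothetical names introduced here for illustration. In a real job, mergeBatch would presumably merge one batch of PDF files into an intermediate file and return that file's path; the sketch only shows how the batches are formed stage by stage.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

public class StagedMerge {

    /**
     * Repeatedly merges groups of at most batchSize items until a single item remains.
     * mergeBatch is a hypothetical callback (an assumption of this sketch): with a PDF
     * library it would merge one batch of files into one intermediate file.
     */
    public static <T> T mergeInStages(List<T> inputs, int batchSize,
                                      Function<List<T>, T> mergeBatch) {
        if (inputs.isEmpty()) {
            throw new IllegalArgumentException("nothing to merge");
        }
        if (batchSize < 2) {
            throw new IllegalArgumentException("batchSize must be at least 2");
        }
        List<T> current = new ArrayList<>(inputs);
        while (current.size() > 1) {
            List<T> next = new ArrayList<>();
            for (int start = 0; start < current.size(); start += batchSize) {
                int end = Math.min(start + batchSize, current.size());
                List<T> batch = new ArrayList<>(current.subList(start, end));
                // A batch of one needs no merging; carry it to the next stage unchanged.
                next.add(batch.size() == 1 ? batch.get(0) : mergeBatch.apply(batch));
            }
            current = next;
        }
        return current.get(0);
    }

    public static void main(String[] args) {
        List<String> files = new ArrayList<>();
        for (int i = 1; i <= 25; i++) {
            files.add("in" + i + ".pdf");
        }
        // Stand-in merge step that only records which inputs would be merged together.
        String result = mergeInStages(files, 10,
                batch -> "merged(" + String.join("+", batch) + ")");
        System.out.println(result);
    }
}
```

With 25 inputs and a batch size of 10, the first stage produces three intermediate results and the second stage merges those three into one, so no single merge step ever has more than 10 inputs open at a time.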

As for why PdfPage#flush() only writes a couple of bytes to the content stream: that depends on the input document, but it most likely means the page being flushed either has mostly text content or relies on a lot of shared resources. SmartMode should further limit the amount a page flush writes to the output stream, as long as the page contains resources that have been copied before.
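
To make the shared-resources point concrete, here is a toy model of a writer that skips resources it has already written. It is only an illustration of why a flush can add very few bytes; it is not how iText implements SmartMode, and DedupWriterSketch, writeResource and bytesWritten are hypothetical names that exist only in this sketch.

```java
import java.io.ByteArrayOutputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;
import java.util.HashMap;
import java.util.Map;

public class DedupWriterSketch {
    private final ByteArrayOutputStream out = new ByteArrayOutputStream();
    // Maps a digest of the resource bytes to the id assigned when it was first written.
    private final Map<String, Integer> seen = new HashMap<>();
    private int nextId = 1;

    /**
     * Writes the resource bytes only if identical bytes were not written before.
     * Returns the id of the (new or previously written) resource.
     */
    public int writeResource(byte[] content) {
        String key = digest(content);
        Integer existing = seen.get(key);
        if (existing != null) {
            return existing; // already in the output: nothing new is written
        }
        int id = nextId++;
        out.write(content, 0, content.length);
        seen.put(key, id);
        return id;
    }

    /** Total number of bytes written to the output so far. */
    public int bytesWritten() {
        return out.size();
    }

    private static String digest(byte[] content) {
        try {
            byte[] hash = MessageDigest.getInstance("SHA-256").digest(content);
            return Base64.getEncoder().encodeToString(hash);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        DedupWriterSketch writer = new DedupWriterSketch();
        byte[] font = new byte[50_000]; // stand-in for a large font shared by many pages
        for (int page = 0; page < 100; page++) {
            writer.writeResource(font);
        }
        System.out.println(writer.bytesWritten());
    }
}
```

In this model only the first page pays for the shared font; every later page that uses the same bytes adds nothing to the output, which mirrors the small writes observed in the question.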
