Split huge 40000 page PDF into single pages, iTextSharp, OutOfMemoryException

Question

I am getting huge PDF files with lots of data. The current PDF is 350 MB and has about 40000 pages. It would of course have been nice to get smaller PDFs, but this is what I have to work with now :-(

I can open it in Acrobat Reader with some delay when loading, but after that Acrobat Reader is quick.

Now I need to split the huge file into single pages, then try to read some recipient data from the pdf pages, and then send the one or two pages that each recipient should get to each particular recipient.

Here is the small amount of code I have so far, using iTextSharp:

var inFileName = @"huge350MB40000pages.pdf";
PdfReader reader = new PdfReader(inFileName);
var nbrPages = reader.NumberOfPages;
reader.Close();

What happens is it comes to the second line "new PdfReader" then stays there for perhaps 10 minutes, the process gets to about 1.7 GB in size, and then I get an OutOfMemoryException.

I think the "new PdfReader" attempts to read the entire PDF into memory.

Is there some other/better way to do this? For example, can I somehow read only a part of a PDF file into memory instead of all of it at once? Could it work better using some other library than itextsharp?

Recommended answer

From what I have read, it looks like you should instantiate the PdfReader using the constructor that takes a RandomAccessFileOrArray object. Disclaimer: I have not tried this out myself.

iTextSharp.text.pdf.PdfReader reader = new iTextSharp.text.pdf.PdfReader(new iTextSharp.text.pdf.RandomAccessFileOrArray(@"C:\PDFFile.pdf"), null);
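
Building on that constructor, the splitting step from the question could look roughly like the sketch below. It is untested and assumes iTextSharp 5.x; the class name PdfSplitter, the method name SplitIntoSinglePages, the outDir parameter, and the page-00001.pdf naming scheme are all hypothetical and not from the original post. Behavior may differ between iTextSharp versions, so treat it as a starting point rather than a verified fix for the 40000-page file.

```csharp
using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;

static class PdfSplitter // hypothetical helper, not part of iTextSharp
{
    // Writes each page of inFileName to its own PDF file in outDir.
    public static void SplitIntoSinglePages(string inFileName, string outDir)
    {
        // Constructor suggested in the answer: it reads objects from the file
        // on demand instead of loading the whole document up front.
        var reader = new PdfReader(new RandomAccessFileOrArray(inFileName), null);
        try
        {
            for (int page = 1; page <= reader.NumberOfPages; page++)
            {
                // Assumed naming scheme: page-00001.pdf, page-00002.pdf, ...
                string outFile = Path.Combine(outDir, string.Format("page-{0:D5}.pdf", page));

                var document = new Document(reader.GetPageSizeWithRotation(page));
                using (var output = new FileStream(outFile, FileMode.Create))
                {
                    // PdfCopy copies the page as-is, without re-rendering it.
                    var copy = new PdfCopy(document, output);
                    document.Open();
                    copy.AddPage(copy.GetImportedPage(reader, page));
                    document.Close(); // finishes the single-page file
                }
            }
        }
        finally
        {
            reader.Close();
        }
    }
}
```

It would be called as PdfSplitter.SplitIntoSinglePages(@"huge350MB40000pages.pdf", @"C:\out"), where the output folder is again an assumed path. Only one single-page document is open at a time, so memory use should depend mainly on how much the reader itself keeps around.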
