Page count of a PDF with Java

Question

At the moment I am using iText to read the page count of a PDF. This takes quite long because the library seems to scan the whole file.

Is the page information stored somewhere in the header of the PDF, or is a full file scan needed?

Recommended answer

That's correct. iText parses quite a bit of a PDF when it is opened (it doesn't read the contents of stream objects, but that's about it)...

UNLESS you use the PdfReader(RandomAccessFileOrArray) constructor, in which case it will only read the xrefs (mostly required), but not parse anything until you start requesting specific objects (directly or via various calls).
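As background to the xref remark: the page count is not stored in the PDF header. A reader normally starts from the `startxref` keyword near the end of the file, follows the offset it gives to the cross-reference data and trailer, and from there reaches the document catalog and the page tree root, whose /Count entry holds the number of pages. The sketch below is not iText code and does not count pages; it only illustrates that first step with plain JDK classes. The class and method names are made up for illustration, and the 1024-byte tail window is an assumption about where `startxref` sits.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;

class StartXrefSketch {

    /**
     * Returns the byte offset written after the last "startxref" keyword,
     * or -1 if the keyword is not found in the tail of the file.
     */
    static long findStartXref(String path) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(path, "r")) {
            // Assumption: the keyword lies within the last 1024 bytes.
            int tailSize = (int) Math.min(1024, raf.length());
            byte[] tail = new byte[tailSize];
            raf.seek(raf.length() - tailSize);
            raf.readFully(tail);

            // ISO-8859-1 maps each byte to exactly one char, so indices match bytes.
            String text = new String(tail, StandardCharsets.ISO_8859_1);
            int idx = text.lastIndexOf("startxref");
            if (idx < 0) {
                return -1;
            }

            // Skip the keyword and surrounding whitespace, then read the digits.
            String rest = text.substring(idx + "startxref".length()).trim();
            int end = 0;
            while (end < rest.length() && Character.isDigit(rest.charAt(end))) {
                end++;
            }
            return end == 0 ? -1 : Long.parseLong(rest.substring(0, end));
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.out.println("usage: java StartXrefSketch <file.pdf>");
            return;
        }
        System.out.println("startxref offset: " + findStartXref(args[0]));
    }
}
```

Only the tail of the file is touched here, which is why a reader that defers parsing can stay cheap on large files. Everything after this step (reading the xref data, the catalog, and the page tree) is what a library such as iText does for you.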


The first PDF program I ever wrote did exactly this. It opened a PDF and, doing the bare minimum amount of work necessary, read the number of pages. It didn't even parse the xrefs it didn't have to. I haven't thought about that program in years...

So while not perfectly efficient, it'll be vastly more efficient to use a RandomAccessFileOrArray:

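// Arguments: forceRead = false, plainRandomAccess = true.
int efficientPDFPageCount(String path) throws IOException {
  RandomAccessFileOrArray file = new RandomAccessFileOrArray(path, false, true);
  PdfReader reader = new PdfReader(file);
  try {
    return reader.getNumberOfPages();
  } finally {
    reader.close(); // release the file even if reading fails
  }
}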

Update:

The iText API underwent a small overhaul. Now (in version 5.4.x) the correct way to use it is to go through java.io.RandomAccessFile:

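import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import com.itextpdf.text.io.RandomAccessSourceFactory;
import com.itextpdf.text.pdf.PdfReader;
import com.itextpdf.text.pdf.RandomAccessFileOrArray;

int efficientPDFPageCount(File file) throws IOException {
  RandomAccessFile raf = new RandomAccessFile(file, "r");
  RandomAccessFileOrArray pdfFile = new RandomAccessFileOrArray(
      new RandomAccessSourceFactory().createSource(raf));
  PdfReader reader = new PdfReader(pdfFile, new byte[0]); // second argument: owner password
  try {
    return reader.getNumberOfPages();
  } finally {
    reader.close();
  }
}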
