Extracting page sizes from PDF in Python


Question

I want to read a PDF and get a list of its pages and each page's size. I don't need to manipulate it in any way, just read it.

I'm currently trying out pyPdf, and it does everything I need except provide a way to get page sizes. I understand that I will probably have to iterate through the pages, since page sizes can vary within a PDF document. Is there another library or method I can use?

I tried using PIL; some online recipes even use d = Image(imagefilename), but it never reads any of my PDFs. It reads everything else I throw at it, even some things I didn't know PIL could do.

Any guidance is appreciated. I'm on Windows 7 64-bit with Python 2.5 (because I also do GAE stuff), but I'm happy to do it on Linux or with a more modern Python.

Recommended answer

This can be done with PyPDF2:

>>> from PyPDF2 import PdfFileReader
>>> input1 = PdfFileReader(open('example.pdf', 'rb'))
>>> input1.getPage(0).mediaBox
RectangleObject([0, 0, 612, 792])

(PyPDF2 was formerly known as pyPdf, and it still refers to pyPdf's documentation.)
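Since the question asks for the size of every page, here is a minimal sketch that extends the snippet above to loop over the whole document. It assumes the same legacy (pre-3.0) PyPDF2 API used in the answer (`PdfFileReader`, `getNumPages`, `getPage`, `mediaBox`); newer releases rename these calls. The helper names `box_size`, `points_to_inches`, and `page_sizes` are hypothetical and are not part of PyPDF2. The media box is given in PDF points, and the conversion assumes the default unit of 1/72 inch. Page rotation is ignored.

```python
POINTS_PER_INCH = 72.0  # default PDF user-space unit is 1/72 inch


def box_size(box):
    """Return (width, height) in points for a [llx, lly, urx, ury] box."""
    llx, lly, urx, ury = (float(v) for v in box)
    return abs(urx - llx), abs(ury - lly)


def points_to_inches(points):
    """Convert a length in PDF points to inches."""
    return points / POINTS_PER_INCH


def page_sizes(path):
    """Return a list of (width, height) tuples in points, one per page."""
    # Imported here so the helpers above work without PyPDF2 installed.
    # Assumption: legacy (pre-3.0) PyPDF2 API, as in the answer above.
    from PyPDF2 import PdfFileReader

    with open(path, 'rb') as f:
        reader = PdfFileReader(f)
        return [box_size(reader.getPage(i).mediaBox)
                for i in range(reader.getNumPages())]
```

Calling `page_sizes('example.pdf')` on the file from the answer would give one `(width, height)` pair per page. The `[0, 0, 612, 792]` box shown above works out to 8.5 by 11 inches, which is US Letter.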
