pyPdf IndirectObject in /Rotate

Problem description

We have a simple script that reads incoming PDF files. If a page is landscape, the script rotates it to portrait for later consumption by another program. All was running well with pyPdf until I ran into a file with an IndirectObject as the value of the /Rotate key on the page. The object is resolvable, so I can tell what the /Rotate value is, but when I call rotateClockwise or rotateCounterClockwise I get a traceback because pyPdf isn't expecting an IndirectObject in /Rotate. I've spent quite a bit of time trying to override the IndirectObject with its value, but I haven't gotten anywhere. I even tried passing the same IndirectObject to rotateClockwise, and it throws the same traceback, a line earlier in pdf.pyc.

My question, put simply: is there a patch for pyPdf or PyPDF2 that keeps it from choking on this kind of setup, a different way I can go about rotating the page, or a different library that I haven't seen or considered yet? I've tried PyPDF2 and it has the same issue. I have looked at PDFMiner as a replacement, but it seems more geared toward getting information out of PDF files than manipulating them. Here's the output from playing with the file with pyPdf in ipython; the output for PyPDF2 was very similar, with slightly different formatting of some of the info:

In [1]: from pyPdf import PdfFileReader

In [2]: mypdf = PdfFileReader(open("RP121613.pdf","rb"))

In [3]: mypdf.getNumPages()
Out[3]: 1

In [4]: mypdf.resolvedObjects
Out[4]: 
{0: {1: {'/Pages': IndirectObject(2, 0), '/Type': '/Catalog'},
     2: {'/Count': 1, '/Kids': [IndirectObject(4, 0)], '/Type': '/Pages'},
     4: {'/Count': 1,
     '/Kids': [IndirectObject(5, 0)],
     '/Parent': IndirectObject(2, 0),
     '/Type': '/Pages'},
     5: {'/Contents': IndirectObject(6, 0),
     '/MediaBox': [0, 0, 612, 792],
     '/Parent': IndirectObject(4, 0),
     '/Resources': IndirectObject(7, 0),
     '/Rotate': IndirectObject(8, 0),
     '/Type': '/Page'}}}

In [5]: mypage = mypdf.getPage(0)

In [6]: myrotation = mypage.get("/Rotate")

In [7]: myrotation
Out[7]: IndirectObject(8, 0)

In [8]: mypdf.getObject(myrotation)
Out[8]: 0

In [9]: mypage.rotateCounterClockwise(90)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)

/root/<ipython console> in <module>()

/usr/lib/python2.7/site-packages/pyPdf/pdf.pyc in rotateCounterClockwise(self, angle)
   1049     def rotateCounterClockwise(self, angle):
   1050         assert angle % 90 == 0
-> 1051         self._rotate(-angle)
   1052         return self
   1053 

/usr/lib/python2.7/site-packages/pyPdf/pdf.pyc in _rotate(self, angle)
   1054     def _rotate(self, angle):
   1055         currentAngle = self.get("/Rotate", 0)
-> 1056         self[NameObject("/Rotate")] = NumberObject(currentAngle + angle)
   1057 
   1058     def _mergeResources(res1, res2, resource):

TypeError: unsupported operand type(s) for +: 'IndirectObject' and 'int'

In [10]: mypage.rotateClockwise(90)       
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)

/root/<ipython console> in <module>()

/usr/lib/python2.7/site-packages/pyPdf/pdf.pyc in rotateClockwise(self, angle)
   1039     def rotateClockwise(self, angle):
   1040         assert angle % 90 == 0
-> 1041         self._rotate(angle)
   1042         return self
   1043 

/usr/lib/python2.7/site-packages/pyPdf/pdf.pyc in _rotate(self, angle)
   1054     def _rotate(self, angle):
   1055         currentAngle = self.get("/Rotate", 0)
-> 1056         self[NameObject("/Rotate")] = NumberObject(currentAngle + angle)
   1057 
   1058     def _mergeResources(res1, res2, resource):

TypeError: unsupported operand type(s) for +: 'IndirectObject' and 'int'

In [11]: mypage.rotateCounterClockwise(myrotation)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)

/root/<ipython console> in <module>()

/usr/lib/python2.7/site-packages/pyPdf/pdf.pyc in rotateCounterClockwise(self, angle)
   1048     # @param angle Angle to rotate the page.  Must be an increment of 90 deg.

   1049     def rotateCounterClockwise(self, angle):
-> 1050         assert angle % 90 == 0
   1051         self._rotate(-angle)
   1052         return self

TypeError: unsupported operand type(s) for %: 'IndirectObject' and 'int'

I'll gladly supply the file I'm working with if someone wants to take an in-depth look at it.

Recommended answer

You need to call getObject on the IndirectObject instance itself, so in your case it should be:

myrotation.getObject()
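
A minimal sketch of how that resolved value might then be used to get the rotation working, assuming pyPdf-style objects: the indirect /Rotate entry is overwritten with a direct number, after which rotateClockwise or rotateCounterClockwise should find a plain integer. The helper names resolve and inline_rotation, and the FakeIndirect stand-in used to exercise them, are hypothetical and introduced only for illustration. With a real page you would pass NameObject and NumberObject from pyPdf.generic, the same classes the _rotate method in the traceback uses.

```python
def resolve(value):
    """Return the object behind an indirect reference; pass plain values through.

    Assumes pyPdf-style objects, where IndirectObject.getObject() returns the
    referenced object. Plain Python ints have no getObject() and are returned
    unchanged.
    """
    getter = getattr(value, "getObject", None)
    return getter() if callable(getter) else value


def inline_rotation(page, make_key=str, make_number=int):
    """Replace an indirect /Rotate entry on `page` with a direct number.

    `page` is any dict-like page object. With pyPdf, pass
    make_key=NameObject and make_number=NumberObject so the stored key and
    value are proper PDF objects. Returns the resolved angle as an int.
    """
    if "/Rotate" not in page:
        return 0  # nothing stored on the page itself, so leave it untouched
    angle = int(resolve(page.get("/Rotate")))
    page[make_key("/Rotate")] = make_number(angle)
    return angle


# Hypothetical usage with pyPdf (not executed here):
#   from pyPdf.generic import NameObject, NumberObject
#   inline_rotation(mypage, NameObject, NumberObject)
#   mypage.rotateCounterClockwise(90)


class FakeIndirect(object):
    """Stand-in for pyPdf's IndirectObject, used only to exercise the helpers."""

    def __init__(self, target):
        self._target = target

    def getObject(self):
        return self._target


if __name__ == "__main__":
    page = {"/Rotate": FakeIndirect(0)}
    inline_rotation(page)
    # The entry is now a plain int, so the arithmetic that failed in
    # _rotate (currentAngle + angle) works.
    print(page["/Rotate"] + 90)
```

The helper only rewrites /Rotate when the key is present on the page, so a page without its own /Rotate entry is left as it was.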
