pyPdf IndirectObject in /Rotate
Question
We have a simple script that reads incoming PDF files. If a page is landscape, the script rotates it to portrait for later consumption by another program. All was running well with pyPdf until I ran into a file with an IndirectObject as the value of the /Rotate key on the page. The object is resolvable, so I can tell what the /Rotate value is, but when I call rotateClockwise or rotateCounterClockwise I get a traceback because pyPdf isn't expecting an IndirectObject in /Rotate. I've spent quite a bit of time trying to override the IndirectObject with its value, but I haven't gotten anywhere. I even tried passing the same IndirectObject as the angle to rotateCounterClockwise, and it throws the same kind of TypeError, one line earlier in pdf.pyc.
My question, put simply: is there a patch for pyPdf or PyPDF2 that keeps it from choking on this kind of setup, a different way to rotate the page, or a different library that I haven't seen or considered yet? I've tried PyPDF2 and it has the same issue. I have looked at PDFMiner as a replacement, but it seems geared more toward getting information out of PDF files than toward manipulating them. Here's the output from my ipython session with pyPdf; the PyPDF2 output was very similar, with slightly different formatting:
In [1]: from pyPdf import PdfFileReader
In [2]: mypdf = PdfFileReader(open("RP121613.pdf","rb"))
In [3]: mypdf.getNumPages()
Out[3]: 1
In [4]: mypdf.resolvedObjects
Out[4]:
{0: {1: {'/Pages': IndirectObject(2, 0), '/Type': '/Catalog'},
     2: {'/Count': 1, '/Kids': [IndirectObject(4, 0)], '/Type': '/Pages'},
     4: {'/Count': 1,
         '/Kids': [IndirectObject(5, 0)],
         '/Parent': IndirectObject(2, 0),
         '/Type': '/Pages'},
     5: {'/Contents': IndirectObject(6, 0),
         '/MediaBox': [0, 0, 612, 792],
         '/Parent': IndirectObject(4, 0),
         '/Resources': IndirectObject(7, 0),
         '/Rotate': IndirectObject(8, 0),
         '/Type': '/Page'}}}
In [5]: mypage = mypdf.getPage(0)
In [6]: myrotation = mypage.get("/Rotate")
In [7]: myrotation
Out[7]: IndirectObject(8, 0)
In [8]: mypdf.getObject(myrotation)
Out[8]: 0
In [9]: mypage.rotateCounterClockwise(90)
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
/root/<ipython console> in <module>()
/usr/lib/python2.7/site-packages/pyPdf/pdf.pyc in rotateCounterClockwise(self, angle)
1049 def rotateCounterClockwise(self, angle):
1050 assert angle % 90 == 0
-> 1051 self._rotate(-angle)
1052 return self
1053
/usr/lib/python2.7/site-packages/pyPdf/pdf.pyc in _rotate(self, angle)
1054 def _rotate(self, angle):
1055 currentAngle = self.get("/Rotate", 0)
-> 1056 self[NameObject("/Rotate")] = NumberObject(currentAngle + angle)
1057
1058 def _mergeResources(res1, res2, resource):
TypeError: unsupported operand type(s) for +: 'IndirectObject' and 'int'
In [10]: mypage.rotateClockwise(90)
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
/root/<ipython console> in <module>()
/usr/lib/python2.7/site-packages/pyPdf/pdf.pyc in rotateClockwise(self, angle)
1039 def rotateClockwise(self, angle):
1040 assert angle % 90 == 0
-> 1041 self._rotate(angle)
1042 return self
1043
/usr/lib/python2.7/site-packages/pyPdf/pdf.pyc in _rotate(self, angle)
1054 def _rotate(self, angle):
1055 currentAngle = self.get("/Rotate", 0)
-> 1056 self[NameObject("/Rotate")] = NumberObject(currentAngle + angle)
1057
1058 def _mergeResources(res1, res2, resource):
TypeError: unsupported operand type(s) for +: 'IndirectObject' and 'int'
In [11]: mypage.rotateCounterClockwise(myrotation)
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
/root/<ipython console> in <module>()
/usr/lib/python2.7/site-packages/pyPdf/pdf.pyc in rotateCounterClockwise(self, angle)
1048 # @param angle Angle to rotate the page. Must be an increment of 90 deg.
1049 def rotateCounterClockwise(self, angle):
-> 1050 assert angle % 90 == 0
1051 self._rotate(-angle)
1052 return self
TypeError: unsupported operand type(s) for %: 'IndirectObject' and 'int'
I'll gladly supply the file I'm working with if someone wants to take an in-depth look at it.
Recommended answer
You need to call getObject on the IndirectObject instance to resolve it to the value it points to, so in your case it should be
myrotation.getObject()
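Resolving the value is only half of the fix, because _rotate still reads the unresolved entry with self.get("/Rotate", 0). One possible workaround, sketched below, is to overwrite /Rotate with a direct number before rotating. The helper names current_rotation and normalize_rotate and the FakeIndirect class are made up for this illustration and are not part of pyPdf; the sketch assumes the NameObject and NumberObject classes from pyPdf.generic (PyPDF2.generic in PyPDF2) and has not been tested against the file from the question.

```python
def current_rotation(page):
    """Return the page's /Rotate value as a plain int, resolving indirect references."""
    value = page.get("/Rotate", 0)
    # An IndirectObject exposes getObject(); the plain int default does not,
    # so fall back to the value itself when the method is missing.
    resolve = getattr(value, "getObject", None)
    if resolve is not None:
        value = resolve()
    return int(value)


def normalize_rotate(page):
    """Replace an indirect /Rotate entry with a direct number (hypothetical helper)."""
    # Imported lazily so current_rotation() stays usable without pyPdf installed.
    # With PyPDF2, import from PyPDF2.generic instead.
    from pyPdf.generic import NameObject, NumberObject
    page[NameObject("/Rotate")] = NumberObject(current_rotation(page))
    return page


class FakeIndirect(object):
    """Stand-in for pyPdf's IndirectObject, for demonstration only."""

    def __init__(self, target):
        self._target = target

    def getObject(self):
        return self._target


# Demonstration with plain dicts in place of a real page object.
print(current_rotation({"/Rotate": FakeIndirect(90)}))
print(current_rotation({}))
```

With the real library, the intended use would be to call normalize_rotate(mypage) first and then mypage.rotateCounterClockwise(90), so that the addition in _rotate operates on a number instead of an IndirectObject.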