ReportLab and pdfrw: Importing Scanned PDF

Question

Using the code below, I am trying to import a pdf page into an existing canvas object and save to PDF. This usually works just fine, but I noticed that when I try it with a PDF generated from a scanned document, it results in a blank page. Any takers?

from reportlab.lib.units import inch
from reportlab.pdfgen import canvas
from pdfrw import PdfReader
from pdfrw.buildxobj import pagexobj
from pdfrw.toreportlab import makerl

# Out_Folder, pdf_file_name and folder are path strings defined elsewhere.
c = canvas.Canvas(Out_Folder + pdf_file_name)

pages = PdfReader(folder + '2_VisionMissionValues.pdf', decompress=False).pages
p = pagexobj(pages[0])  # wrap the first page as a form XObject
c.setPageSize([11*inch, 8.5*inch])  # Set page size (for landscape)
c.doForm(makerl(c, p))
c.showPage()
c.save()

Thanks in advance!

Recommended answer

Well...

On the one hand, I have absolutely no idea why this is happening, and not really much time to debug it right now.

On the other hand, I have a workaround for you (and I tried the workaround on v0.3, as well as on the current github master, and it worked in both cases for me).

I started off by verifying that your code failed on your page and that it worked on another PDF. Then I asked myself "What happens if I use my watermark example to create a PDF with your page as a watermark?" (because that uses some of the same form XObject code). That worked, so then I asked myself "What does it look like if I pass my watermarked page through your reportlab code?"

Interestingly, the entire watermarked page, including your image, made it through. So I modified your code to do the minimal stuff that the watermark does, which winds up putting a form XObject inside a form XObject when it's passed to reportlab. That worked.

Here's a slightly modified version of your code that I used for this.

import sys

from reportlab.pdfgen import canvas
from pdfrw import PdfReader, PageMerge
from pdfrw.buildxobj import pagexobj
from pdfrw.toreportlab import makerl

inch = 72

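fname, = sys.argv[1:]  # expects exactly one argument: the input PDF
page = PdfReader(fname, decompress=False).pages[0]
# Workaround: wrap the page in a new page via PageMerge first, so that
# reportlab receives a form XObject nested inside another form XObject.
p = pagexobj(PageMerge().add(page).render())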

c = canvas.Canvas('outstuff.pdf')
c.setPageSize([8.5*inch, 11.0*inch]) #Set page size (for portrait)
c.doForm(makerl(c, p))
c.showPage()
c.save()
