Convert PDF to Image using Python


Question

I am trying to convert a PDF file to an image file. For this, I have installed the following on my Ubuntu server:

  1. python2.7
  2. poppler-utils
  3. pdf2image == 1.12.1

My code:

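from pdf2image import convert_from_path, convert_from_bytes

# Option 1: convert directly from a file path
images = convert_from_path("/home/user/pdf_file.pdf")

# OR

# Option 2: read the raw bytes and convert them (open the PDF in binary mode)
with open("/home/user/pdf_file.pdf", "rb") as pdf:
    images = convert_from_bytes(pdf.read())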

Output

When I use the function convert_from_path:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/dist-packages/pdf2image/pdf2image.py", line 143, in convert_from_path
    thread_output_file = next(output_file)
TypeError: ThreadSafeGenerator object is not an iterator

When I use the function convert_from_bytes:

Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
  File "/usr/local/lib/python2.7/dist-packages/pdf2image/pdf2image.py", line 268, in convert_from_bytes
    paths_only=paths_only,
  File "/usr/local/lib/python2.7/dist-packages/pdf2image/pdf2image.py", line 143, in convert_from_path
    thread_output_file = next(output_file)
TypeError: ThreadSafeGenerator object is not an iterator

After reinstalling all of these utilities, I started running into these errors.

Recommended answer

I hit the same failure on Python 2, but it works on Python 3.

The same issue has been reported against another library: TypeError: 'threadsafe_iter' object is not an iterator

As they said there, it is a Python 2 vs 3 issue caused by the next() function: Python 2 expects an iterator to define a next() method, while Python 3 expects __next__().
If you rename __next__() to next() in the file /home/***/.local/lib/python2.7/site-packages/pdf2image/generators.py, the conversion runs successfully on Python 2.
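
The Python 2 vs 3 difference described above can be illustrated with a minimal sketch. This is not the actual pdf2image source: the class name ThreadSafeIterator and the helper numbered_names are hypothetical, chosen only to show how a lock-guarded iterator wrapper can satisfy both protocols by exposing __next__() (Python 3) and an alias next() (Python 2).

```python
import threading


class ThreadSafeIterator(object):
    """Hypothetical wrapper that serializes access to an underlying iterator."""

    def __init__(self, iterable):
        self._it = iter(iterable)
        self._lock = threading.Lock()

    def __iter__(self):
        return self

    def __next__(self):
        # Python 3 iterator protocol: the built-in next() calls __next__().
        with self._lock:
            return next(self._it)

    # Python 2 iterator protocol: the built-in next() looks for a method
    # named next(). Without this alias, Python 2 raises
    # "TypeError: ... object is not an iterator".
    next = __next__


def numbered_names(prefix):
    """Hypothetical generator yielding prefix-1, prefix-2, ..."""
    n = 0
    while True:
        n += 1
        yield "%s-%d" % (prefix, n)


names = ThreadSafeIterator(numbered_names("page"))
first = next(names)
second = next(names)
```

Defining both methods keeps one class usable on either interpreter, which avoids hand-editing the installed package.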

By the way, I have opened a new issue with the pdf2image team:
TypeError: ThreadSafeGenerator object is not an iterator #133

Additional
The pdf2image README says it is a Python (3.5+) module.
pdf2image v1.7.1 works on Python 2.7. Try it with: pip install pdf2image==1.7.1
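
The version requirement can also be checked at runtime before importing the library. The following is only a sketch, assuming the 3.5 floor quoted from the README above; the function name supports_current_pdf2image and the constant MIN_VERSION are hypothetical.

```python
import sys

# Assumption: minimum version taken from the README statement quoted above.
MIN_VERSION = (3, 5)


def supports_current_pdf2image(version_info=None):
    """Return True when the interpreter meets the README's 3.5+ requirement."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= MIN_VERSION


if not supports_current_pdf2image():
    # On an older interpreter such as Python 2.7, point to the pinned release.
    print("Python %d.%d detected: try pip install pdf2image==1.7.1"
          % tuple(sys.version_info[:2]))
```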
