Extract a page from a PDF as a JPEG


Question

In Python code, how can I efficiently save a certain page of a PDF as a JPEG file? (Use case: I have a Python Flask web server where PDFs will be uploaded and a JPEG corresponding to each page is stored.)

This solution is close, but the problem is that it does not convert the entire page to JPEG.

Recommended answer

The pdf2image library can be used.

You can install it simply using

pip install pdf2image

Once installed, you can use the following code to get images.

from pdf2image import convert_from_path
# Render every page at 500 DPI; returns a list of PIL images, one per page
pages = convert_from_path('pdf_file', 500)

Saving the pages in JPEG format:

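for index, page in enumerate(pages, start=1):
    # Number the files so each page is kept instead of overwriting out.jpg
    page.save(f'out{index}.jpg', 'JPEG')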
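If only one page is needed, rendering the whole document first is wasteful. Below is a minimal sketch, assuming pdf2image's first_page and last_page arguments (both 1-based) are used to limit rendering to that page. The function names save_page_as_jpeg and page_filename, the default of 200 DPI, and the file naming scheme are illustrative choices and not part of the original answer:

```python
from pathlib import Path


def page_filename(pdf_path, page_number):
    """Build an output name such as 'report_page3.jpg' (hypothetical naming scheme)."""
    return f"{Path(pdf_path).stem}_page{page_number}.jpg"


def save_page_as_jpeg(pdf_path, page_number, out_dir=".", dpi=200):
    """Render a single 1-based page of a PDF and save it as a JPEG."""
    # Imported here so the helper above stays usable without pdf2image/poppler.
    from pdf2image import convert_from_path

    # first_page and last_page restrict rendering to the requested page only.
    images = convert_from_path(
        pdf_path, dpi=dpi, first_page=page_number, last_page=page_number
    )
    if not images:
        raise ValueError(f"page {page_number} not found in {pdf_path}")
    out_path = Path(out_dir) / page_filename(pdf_path, page_number)
    images[0].save(out_path, "JPEG")
    return out_path
```

For example, save_page_as_jpeg('pdf_file', 2) would write pdf_file_page2.jpg to the current directory, provided poppler is installed.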


The GitHub repo for pdf2image also mentions that it uses pdftoppm and that it requires other installations:

pdftoppm is the piece of software that does the actual magic. It is distributed as part of a larger package called poppler. Windows users will have to install poppler for Windows. Mac users will have to install poppler for Mac. Linux users will have pdftoppm pre-installed with the distro (tested on Ubuntu and Arch Linux); if it is not, run sudo apt install poppler-utils.
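Since pdf2image depends on pdftoppm, it can help to check for the binary before attempting a conversion. A small sketch using only the standard library; the function name pdftoppm_available is made up for illustration:

```python
import shutil


def pdftoppm_available():
    """Return True if a pdftoppm executable can be found on the PATH."""
    return shutil.which("pdftoppm") is not None


if not pdftoppm_available():
    print("pdftoppm not found; install poppler before using pdf2image")
```

This only looks at the PATH, so a poppler installation in a non-standard folder would not be detected by this check.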

You can install the latest version under Windows using Anaconda by running:

conda install -c conda-forge poppler

Note: Windows versions up to 0.67 are available at http://blog.alivate.com.au/poppler-windows/, but 0.68 was released in Aug 2018, so you will not be getting the latest features or bug fixes.
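If the Windows binaries are unpacked manually and are not on the PATH, pdf2image accepts a poppler_path argument pointing at the folder that contains pdftoppm. A sketch under that assumption; the wrapper name convert_with_poppler_dir and the directory in the commented usage line are hypothetical and do not come from the original answer:

```python
def convert_with_poppler_dir(pdf_path, poppler_bin_dir, dpi=200):
    """Convert a PDF to a list of PIL images using poppler binaries from a given folder."""
    # Imported lazily so this module can be loaded even when pdf2image is missing.
    from pdf2image import convert_from_path

    # poppler_path tells pdf2image where to look for pdftoppm.
    return convert_from_path(pdf_path, dpi=dpi, poppler_path=poppler_bin_dir)


# Hypothetical usage; adjust the folder to wherever poppler was unpacked:
# pages = convert_with_poppler_dir("pdf_file", r"C:\poppler\bin")
```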
