将pdf转换为图像,但放大后 [英] converting pdf to image but after zooming in

查看:116
本文介绍了将pdf转换为图像,但放大后的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

链接显示了pdf s可以转换为图像.有没有办法在转换为图像之前缩放pdf?在我的项目中,我将pdf s转换为png s,然后使用Python-tesseract库提取文本.我注意到,如果我缩放pdf s并将零件另存为png s,那么OCR会提供更好的结果.那么有没有办法在转换为png之前先缩放pdf?

This link shows how pdfs could be converted to images. Is there a way to zoom my pdfs before converting to images? In my project, i am converting pdfs to pngs and then using Python-tesseract library to extract text. I noticed that if I zoom pdfs and then save parts as pngs then OCR provides much better results. So is there a way to zoom pdfs before converting to pngs?

推荐答案

我认为提高图像的质量(分辨率)比放大pdf更好.

I think that raising the quality (resolution) of your image is a better solution than zooming into the pdf.

使用pdf2image您可以轻松完成此操作:

using pdf2image you can accomplish this quite easily:

安装pdf2image:pip install pdf2image

install pdf2image: pip install pdf2image

然后在python中将您的pdf转换为高质量的图像:

then, in python, convert your pdf into a high quality image:

from pdf2image import convert_from_path

pages = convert_from_path('sample.pdf', 400) #400 is the Image quality in DPI (default 200)

pages[0].save("sample.png")

通过使用quality参数,您应该得到想要的结果

by playing around with the quality parameter you should get the result you desider

这篇关于将pdf转换为图像,但放大后的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆