How to open a PDF and read it?
Question
How can I open a PDF file and read some of its contents with Python (this language is preferred, but Ruby, Perl, or PHP are fine too), provided the file contains recognizable text and is not just an image, or else report that it's impossible without OCR? TIA
Update: thanks for the solutions, I'm sure some of them will suit me fine.
@RichH, I have a PDF file and don't know whether it is image-based or text-based. I'm looking for a tool to help me find that out and, if it's text-based, extract some of its contents.
Recommended answer
For Perl, check out these modules:
- PDF::API2
- CAM::PDF
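Since the question prefers Python, here is a rough standard-library sketch of the "is it image-based or text-based?" check. It is only a heuristic and rests on an assumption: text-based PDFs normally declare font resources, so finding a `/Font` name anywhere in the file (including inside Flate-compressed streams) suggests there is extractable text. The function names `looks_text_based` and `_iter_chunks` are made up for this example, and encrypted or unusually structured files may fool it. Actually pulling the text out still needs a real PDF library such as the Perl modules above.

```python
import re
import zlib

# Matches the raw bytes between "stream" and "endstream" keywords.
_STREAM_RE = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)


def _iter_chunks(data: bytes):
    """Yield the raw file bytes, then every stream that inflates with zlib."""
    yield data
    for match in _STREAM_RE.finditer(data):
        try:
            # decompressobj tolerates the trailing newline before "endstream".
            yield zlib.decompressobj().decompress(match.group(1))
        except zlib.error:
            # Not Flate-compressed (e.g. a JPEG image stream): skip it.
            continue


def looks_text_based(data: bytes) -> bool:
    """Heuristic: True if the PDF appears to declare any font resources."""
    if not data.startswith(b"%PDF-"):
        raise ValueError("not a PDF file")
    return any(b"/Font" in chunk for chunk in _iter_chunks(data))
```

To try it on a file, read the bytes first, for example `looks_text_based(open("some.pdf", "rb").read())`, where `some.pdf` is a placeholder path. A `False` result hints that the document is probably scanned images and would need OCR.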