How can I edit the search text of a searchable PDF?

Problem description

I have access to a scanner at my library which can create "searchable PDFs." These are PDFs that show the exact image of a scanned document, but contain a kind of hidden text that can be selected when you select a portion of the image that contains text. This way you can copy and paste text from the scanned document or search within it. This is VERY useful, and an awesome improvement over raw scanned images. I also have several apps on my Mac that can create this kind of searchable PDF from a scanned document or a raw image.

Now, it's obvious to anyone who has ever used OCR that the process of converting images to text is not 100% accurate, so the text that you search or copy will be incorrect in some places.
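For context, here is a rough sketch of what that hidden layer usually looks like inside the file. OCR tools commonly paint the scanned image and then draw the recognized text in PDF text rendering mode 3 (`3 Tr`), which is invisible but still selectable. The Python below works on a hand-written, uncompressed content stream; `SAMPLE_STREAM`, `hidden_text` and `fix_hidden_text` are made-up names for illustration. Real files usually compress their streams and may use custom font encodings, so treat this as an illustration of the idea, not a working PDF editor.

```python
import re

# Hypothetical, hand-written page content stream (real ones are usually compressed).
# "q ... Do Q" paints the scanned image; the BT ... ET text object holds the OCR text.
# "3 Tr" selects text rendering mode 3: the glyphs are neither filled nor stroked.
SAMPLE_STREAM = (
    "q 612 0 0 792 0 0 cm /Im0 Do Q\n"
    "BT 3 Tr /F1 12 Tf 72 700 Td (Tbe quick brown fox) Tj ET\n"
)

TEXT_OBJECT = re.compile(r"BT.*?ET", re.S)          # one BT ... ET text object
INVISIBLE = re.compile(r"\b3\s+Tr\b")               # render mode 3 set in the block
SHOWN_STRING = re.compile(r"\((.*?)\)(?=\s*Tj)")    # literal string shown with Tj


def hidden_text(stream):
    """Return the strings shown inside text objects that set render mode 3.

    Simplification: the render mode is assumed to be set inside each block.
    """
    found = []
    for block in TEXT_OBJECT.findall(stream):
        if INVISIBLE.search(block):
            found.extend(SHOWN_STRING.findall(block))
    return found


def fix_hidden_text(stream, wrong, right):
    """Replace `wrong` with `right` in invisible text only; the image is untouched."""
    def repair_block(match):
        block = match.group(0)
        if not INVISIBLE.search(block):
            return block  # visible text: leave it alone
        return SHOWN_STRING.sub(
            lambda m: "(" + m.group(1).replace(wrong, right) + ")", block
        )
    return TEXT_OBJECT.sub(repair_block, stream)


if __name__ == "__main__":
    print(hidden_text(SAMPLE_STREAM))
    print(hidden_text(fix_hidden_text(SAMPLE_STREAM, "Tbe", "The")))
```

The point of the sketch is only that the image and the text are separate drawing instructions, so in principle the text can be corrected without touching the scan.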

So I have searched for quite some time for an application that would load a searchable PDF and let me repair the hidden searchable text without reformatting or modifying the original scanned image.

Does anyone know of one?

It's worth saying here that I tried the latest version of Adobe Acrobat DC for Mac, and it doesn't seem to even allow me to view the hidden searchable text, much less edit it. It does allow me to replace the scanned image with the results of its own OCR process so that I could edit and save the document, but this would produce horrible results for any of the scanned documents that I am using. It seems designed for editing a "native PDF", not for editing a scanned document.

I have also tried ABBYY FineReader, with no luck.

Recommended answer

I'm using ABBYY FineReader 12 Professional (not open source). Just open a scanned image or scanned PDF and press Verify Text (or Ctrl+F7); then you go through all the spelling errors and low-confidence characters and fix them.

The program is very good: it shows you the exact place in the image/PDF to correct and the OCR guess side by side for convenience, and it steps through all of them.

[By the way, I use these shortcuts to speed things up: Alt+Enter to add the unrecognized word to the dictionary, and Ctrl+Delete to skip a word or to confirm it once you have fixed it.]
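To make the idea of that verification pass concrete, here is a minimal Python sketch of the general technique: flag words whose OCR confidence is low and that are not in a user dictionary, then apply corrections. This is not ABBYY's implementation or API; `OcrWord`, `words_to_review`, `apply_corrections`, the 0.0 to 1.0 confidence scale and the 0.8 threshold are all assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class OcrWord:
    text: str
    confidence: float  # assumed scale: 0.0 (pure guess) to 1.0 (certain)


def words_to_review(words, threshold=0.8, dictionary=frozenset()):
    """Yield (index, word) for low-confidence words not in the user dictionary."""
    for index, word in enumerate(words):
        if word.confidence < threshold and word.text.lower() not in dictionary:
            yield index, word


def apply_corrections(words, corrections):
    """Return a new word list; corrected words are treated as fully confirmed."""
    result = []
    for index, word in enumerate(words):
        if index in corrections:
            result.append(OcrWord(corrections[index], 1.0))
        else:
            result.append(word)
    return result


if __name__ == "__main__":
    page = [OcrWord("The", 0.99), OcrWord("qu1ck", 0.41), OcrWord("ABBYY", 0.55)]
    # "abbyy" plays the role of a word previously added to the dictionary.
    for index, word in words_to_review(page, dictionary={"abbyy"}):
        print(index, word.text)
    print([w.text for w in apply_corrections(page, {1: "quick"})])
```

Adding a word to the dictionary means it stops being flagged on later pages, which is why that shortcut saves time on documents full of names or jargon.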

Then save the document as a PDF file (menu: File > Save Document As > PDF File), and you can search it in any PDF reader. The saved file looks the same as the scanned one, but there is text 'behind' it.

It's weird that you tried ABBYY with no luck; it's working great for me. Maybe the version you tried was not the Professional one.

Hope this helps.
