How to program a text search and replace in PDF files

Question

How can I programmatically search and replace text in a large number of PDF files? I want to remove a URL that has been added to a set of files. I was able to remove the link itself using JavaScript under Batch Processing in Adobe Pro, but the link text remains. I have seen recommendations to use the text touch-up tool, which works manually, but I don't want to edit 1300 files by hand.

Recommended answer

Finding text in a PDF is inherently hard because of the graphical nature of the document format: the letters you are searching for may not be contiguous in the file. That said, CAM::PDF has some search-and-replace capabilities and heuristics. Give changepagestring.pl a try and see if it works on your PDFs.
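To make the "not contiguous" point concrete, here is a minimal Python sketch. It is not how CAM::PDF works internally. The two byte strings are hypothetical, uncompressed content-stream fragments invented for illustration: one draws a URL with a single `Tj` operator, the other draws the same visible text with a `TJ` array whose pieces are separated by kerning adjustments. The helper names are also made up, and the regexes deliberately ignore escaped parentheses, hex strings, font encodings, and compressed streams, all of which a real tool has to handle.

```python
import re
from typing import List

# Hypothetical content-stream fragments that render the same visible text.
# Tj shows one string; TJ shows an array of strings mixed with kerning numbers.
CONTIGUOUS = b"BT /F1 12 Tf (http://example.com) Tj ET"
FRAGMENTED = b"BT /F1 12 Tf [(http://exa) -15 (mple.c) 20 (om)] TJ ET"

NEEDLE = b"http://example.com"

# Simplified patterns: no escaped parentheses, no hex strings, no nesting.
TJ_ARRAY = re.compile(rb"\[(.*?)\]\s*TJ", re.S)
LITERAL_STRING = re.compile(rb"\((.*?)\)", re.S)


def naive_contains(stream: bytes, needle: bytes) -> bool:
    """Plain byte search, the approach that fails on fragmented text."""
    return needle in stream


def join_tj_strings(stream: bytes) -> List[bytes]:
    """Concatenate the string pieces of each TJ array, dropping kerning numbers."""
    return [
        b"".join(LITERAL_STRING.findall(match.group(1)))
        for match in TJ_ARRAY.finditer(stream)
    ]


if __name__ == "__main__":
    print(naive_contains(CONTIGUOUS, NEEDLE))   # plain search on the Tj version
    print(naive_contains(FRAGMENTED, NEEDLE))   # plain search on the TJ version
    print(join_tj_strings(FRAGMENTED))          # pieces joined back together
```

The plain search finds the URL only in the `Tj` version. Joining the `TJ` pieces is one simple heuristic for recovering the visible string, and it hints at why replacement is harder than search: the replacement text has to be written back into the fragmented structure.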
