Converting a searchable PDF to a non-searchable PDF


Question

I have a searchable PDF and I need to convert it into a non-searchable one.

I tried using Ghostscript to convert it to JPEG and then back to PDF. That does the trick, but the resulting file is far too large to be acceptable.

I also tried using Ghostscript to convert the PDF to PS first and then back to PDF. That does the trick as well, but the quality is not good enough.

gswin32.exe -q -dNOPAUSE -dBATCH -dSAFER -sDEVICE=pswrite -r1000 -sOutputFile=out.ps in.pdf
gswin32.exe -q -dNOPAUSE -dBATCH -dSAFER -dDEVICEWIDTHPOINTS=596 -dDEVICEHEIGHTPOINTS=834 -dPDFSETTINGS=/ebook -sDEVICE=pdfwrite -sOutputFile=out.pdf out.ps

Is there a way to get better quality in the resulting PDF?

Alternatively, is there an easier way to convert a searchable PDF to a non-searchable one?

Recommended answer

You can use Ghostscript to achieve that. You need 2 steps:

  1. Convert the PDF to a PostScript file in which all used fonts are converted to outline shapes. The key here is the -dNOCACHE parameter:


gs -o somepdf.ps -dNOCACHE -sDEVICE=pswrite somepdf.pdf

  2. Convert the PS back to PDF (and maybe delete the intermediate PS afterwards):


gs -o somepdf-with-outlines.pdf -sDEVICE=pdfwrite somepdf.ps
rm somepdf.ps
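
For scripting, the two steps above can be wrapped roughly as follows. This is a minimal sketch, not part of the original answer: the function names and the `intermediate.ps` default are made up for illustration, the flags are copied verbatim from the commands above, and whether the `pswrite` device is available depends on the Ghostscript version installed.

```python
import os
import subprocess


def two_step_commands(src_pdf, dst_pdf, tmp_ps, gs="gs"):
    """Build the two Ghostscript invocations quoted in the answer.

    The helper name and the `gs` parameter are illustrative; on Windows
    the executable would be something like gswin32c.exe instead.
    """
    to_ps = [gs, "-o", tmp_ps, "-dNOCACHE", "-sDEVICE=pswrite", src_pdf]
    to_pdf = [gs, "-o", dst_pdf, "-sDEVICE=pdfwrite", tmp_ps]
    return to_ps, to_pdf


def outline_fonts(src_pdf, dst_pdf, tmp_ps="intermediate.ps", gs="gs"):
    """Run both steps, then remove the intermediate PostScript file."""
    try:
        for cmd in two_step_commands(src_pdf, dst_pdf, tmp_ps, gs):
            # check=True raises CalledProcessError if Ghostscript fails
            subprocess.run(cmd, check=True)
    finally:
        # Clean up even if one of the steps failed
        if os.path.exists(tmp_ps):
            os.remove(tmp_ps)


# Example call (requires Ghostscript on PATH):
# outline_fonts("somepdf.pdf", "somepdf-with-outlines.pdf")
```

Keeping the command construction separate from execution makes it easy to print and inspect the exact command lines before running them.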

Note that the resulting PDF will very likely be larger than the original one. Also, unless you add more command-line parameters to prevent it, all images in the original PDF will likely be re-encoded according to Ghostscript's built-in defaults. Still, the quality should be better than what your own Ghostscript attempt produced.
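
As an illustration of such additional parameters, the sketch below builds a pdfwrite command that asks Ghostscript not to downsample color and grayscale images and to store them with lossless Flate compression. The answer does not say which parameters it has in mind, so this particular selection is an assumption; the function names are hypothetical, and the switches are standard pdfwrite distiller parameters whose exact effect should be checked against the documentation for your Ghostscript version.

```python
def keep_image_quality_flags():
    """Return switches that disable downsampling and lossy re-encoding.

    Assumption: these distiller parameters are one reasonable choice;
    the original answer does not name any specific parameters.
    """
    flags = []
    for kind in ("Color", "Gray"):
        flags += [
            f"-dDownsample{kind}Images=false",   # keep original resolution
            f"-dAutoFilter{kind}Images=false",   # do not auto-pick JPEG
            f"-d{kind}ImageFilter=/FlateEncode", # lossless compression
        ]
    return flags


def pdfwrite_command(src, dst, extra_flags=(), gs="gs"):
    """Build a pdfwrite invocation with optional extra switches."""
    return [gs, "-o", dst, "-sDEVICE=pdfwrite", *extra_flags, src]


# Example: print the command line for the second step above
# print(" ".join(pdfwrite_command("somepdf.ps", "out.pdf", keep_image_quality_flags())))
```

Lossless image storage trades file size for quality, so the output may grow further.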

Apparently, from version 9.15 (to be released during September/October 2014), Ghostscript will support a new command-line parameter:

 -dNoOutputFonts

which will cause the output devices pdfwrite, ps2write and eps2write "to 'flatten' glyphs into 'basic' marking operations (rather than writing fonts to the output)".

This means that the above two steps can be avoided, and the desired result can be achieved with a single command:

 gs -o somepdf-with-outlines.pdf -dNoOutputFonts -sDEVICE=pdfwrite somepdf.pdf
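
Since the parameter depends on the Ghostscript version, a script could check the version before using it. The following is a minimal sketch under that assumption: the helper names are hypothetical, the 9.15 threshold is taken from the statement above, and the version is read from `gs --version`.

```python
import subprocess


def parse_version(text):
    """Turn a version string such as '9.15' or '10.02.1' into (major, minor)."""
    major, minor = text.strip().split(".")[:2]
    return int(major), int(minor)


def flatten_command(src, dst, gs="gs"):
    """Build the single-step command quoted in the answer."""
    return [gs, "-o", dst, "-dNoOutputFonts", "-sDEVICE=pdfwrite", src]


def flatten(src, dst, gs="gs"):
    """Run the single-step conversion if the installed version is new enough."""
    reported = subprocess.run(
        [gs, "--version"], capture_output=True, text=True, check=True
    ).stdout
    if parse_version(reported) < (9, 15):
        # Threshold taken from the answer; older builds need the two-step route
        raise RuntimeError(f"Ghostscript {reported.strip()} predates -dNoOutputFonts")
    subprocess.run(flatten_command(src, dst, gs), check=True)


# Example call (requires Ghostscript on PATH):
# flatten("somepdf.pdf", "somepdf-with-outlines.pdf")
```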

Caveat: I've tested this with just a few input files, using a self-compiled Ghostscript built from the current Git sources. It worked flawlessly in each case.
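
One way to repeat such a test on your own files is to try extracting text from the output and confirm that nothing comes back. The sketch below is not from the original answer: it assumes Ghostscript's `txtwrite` device is available, the helper names are hypothetical, and an empty extraction is only a heuristic for "non-searchable".

```python
import subprocess


def text_extraction_command(pdf, gs="gs"):
    """Build a command that writes any extractable text to stdout."""
    return [gs, "-q", "-dNOPAUSE", "-dBATCH",
            "-sDEVICE=txtwrite", "-sOutputFile=-", pdf]


def has_no_extractable_text(extracted):
    """True if the extraction produced only whitespace (incl. form feeds)."""
    return extracted.strip() == ""


def is_non_searchable(pdf, gs="gs"):
    """Heuristic check: run the extraction and look for any text at all."""
    result = subprocess.run(
        text_extraction_command(pdf, gs),
        capture_output=True, text=True, check=True,
    )
    return has_no_extractable_text(result.stdout)


# Example call (requires Ghostscript on PATH):
# print(is_non_searchable("somepdf-with-outlines.pdf"))
```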
