How can I remove all images from a PDF?

Question

I want to remove all images from a PDF file.

The page layouts should not change. All images should be replaced by empty space.

  • How can I achieve this with the help of Ghostscript and appropriate PostScript code?

Recommended answer

I'm posting the answer myself, but the actual code is courtesy of Chris Liddell, a Ghostscript developer.

I took his original PostScript code and stripped out its other functions. Only the function that removes raster images remains. Other graphical page objects (text sections, patterns and vector objects) should remain untouched.

Copy the following code and save it as remove-images.ps:

%!PS

% Run as:
%
%      gs ..... -dFILTERIMAGE -dDELAYBIND -dWRITESYSTEMDICT 
%                 ..... remove-images.ps <your-input-file>
%
% derived from Chris Liddell's original 'filter-obs.ps' script
% Adapted by @pdfkungfoo (on Twitter)

currentglobal true setglobal

32 dict begin

/debugprint     { systemdict /DUMPDEBUG .knownget { {print flush} if} 
                {pop} ifelse } bind def

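% Expects a switch name (e.g. /FILTERIMAGE) on the stack. If that switch
% is set to true in systemdict, save the graphics state and switch to the
% null device while keeping the current transformation matrix.
/pushnulldevice {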
  systemdict exch .knownget not
  {
    //false
  } if

  {
    gsave
    matrix currentmatrix
    nulldevice
    setmatrix
  } if
} bind def

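% Counterpart of pushnulldevice: if the switch is set, restore the saved
% graphics state and carry over the current point when there is one.
/popnulldevice {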
  systemdict exch .knownget not
  {
    //false
  } if
  {
    % this is hacky - some operators clear the current point
    % i.e.
    { currentpoint } stopped
    { grestore }
    { grestore moveto} ifelse
  } if
} bind def

/sgd {systemdict exch get def} bind def

systemdict begin

/_image /image sgd
/_imagemask /imagemask sgd
/_colorimage /colorimage sgd

/image
{
  (
IMAGE
) //debugprint exec
  /FILTERIMAGE //pushnulldevice exec
  _image
  /FILTERIMAGE //popnulldevice exec
} bind def

/imagemask
{
  (
IMAGEMASK
) //debugprint exec
  /FILTERIMAGE //pushnulldevice exec
  _imagemask
  /FILTERIMAGE //popnulldevice exec
} bind def

/colorimage
{
  (
COLORIMAGE
) //debugprint exec
  /FILTERIMAGE //pushnulldevice exec
  _colorimage
  /FILTERIMAGE //popnulldevice exec
} bind def

end
end

.bindnow

setglobal

Now run this command:

gs -o no-more-images-in-sample.pdf \
   -sDEVICE=pdfwrite               \
   -dFILTERIMAGE                   \
   -dDELAYBIND                     \
   -dWRITESYSTEMDICT               \
    remove-images.ps               \
    sample.pdf
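
As an aside, Ghostscript releases from 9.23 onward document a built-in -dFILTERIMAGE switch that drops bitmap images without any extra PostScript file. The sketch below wraps that invocation in a small shell function; the function name remove_images and the file names in the usage comment are only illustrative, and it assumes a sufficiently recent gs is on the PATH.

```shell
#!/bin/sh
# Minimal sketch, assuming Ghostscript 9.23 or later is installed as "gs".
# remove_images is a hypothetical helper name, not part of Ghostscript.
#   $1 = input PDF, $2 = output PDF
remove_images() {
  # -o sets the output file and implies batch mode without pausing;
  # -dFILTERIMAGE asks Ghostscript to ignore bitmap images in the input.
  gs -o "$2" -sDEVICE=pdfwrite -dFILTERIMAGE "$1"
}

# Usage (illustrative file names):
#   remove_images sample.pdf no-more-images-in-sample.pdf
```

With older releases that lack the built-in switch, the remove-images.ps approach shown above remains the way to go.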

I tested the code with the official PDF specification, and it worked. The following two screenshots show page 750 of the input and output PDFs:

If you wonder why something that looks like an image is still on the output page: it is not really a raster image but a 'pattern' in the original file, and therefore it is not removed.
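
One way to check the result without comparing pages by eye is to count the images a tool reports in each file. The sketch below assumes the pdfimages utility from poppler-utils is installed; count_images is a hypothetical helper name. It relies on `pdfimages -list` printing two header lines followed by one line per image. Depending on how the tool treats images nested inside patterns, a non-zero count in the output file may not mean the filter failed.

```shell
#!/bin/sh
# Minimal sketch, assuming pdfimages (poppler-utils) is on the PATH.
# count_images is a hypothetical helper name.
#   $1 = PDF file to inspect
count_images() {
  # Skip the two header lines of the listing, then count what remains.
  pdfimages -list "$1" | tail -n +3 | wc -l | tr -d ' '
}

# Usage (illustrative file names):
#   count_images sample.pdf
#   count_images no-more-images-in-sample.pdf
```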
