How can I remove all images from a PDF?


Question

I want to remove all images from a PDF file.

The page layouts should not change. All images should be replaced by empty space.

  • How can I achieve this with the help of Ghostscript and appropriate PostScript code?

Recommended answer

I'm posting the answer myself, but the actual code is courtesy of Chris Liddell, a Ghostscript developer.

I took his original PostScript code and stripped out its other functions, keeping only the one that removes raster images. Other graphical page objects (text sections, patterns and vector objects) should remain untouched.

Copy the following code and save it as remove-images.ps:

%!PS

% Run as:
%
%      gs ..... -dFILTERIMAGE -dDELAYBIND -dWRITESYSTEMDICT \
%                 ..... remove-images.ps <your-input-file>
%
% derived from Chris Liddell's original 'filter-obs.ps' script
% Adapted by @pdfkungfoo (on Twitter)

currentglobal true setglobal

32 dict begin

/debugprint     { systemdict /DUMPDEBUG .knownget { {print flush} if} 
                {pop} ifelse } bind def

/pushnulldevice {
  systemdict exch .knownget not
  {
    //false
  } if

  {
    gsave
    matrix currentmatrix
    nulldevice
    setmatrix
  } if
} bind def

/popnulldevice {
  systemdict exch .knownget not
  {
    //false
  } if
  {
    % this is hacky - some operators clear the current point
    % i.e.
    { currentpoint } stopped
    { grestore }
    { grestore moveto} ifelse
  } if
} bind def

/sgd {systemdict exch get def} bind def

systemdict begin

/_image /image sgd
/_imagemask /imagemask sgd
/_colorimage /colorimage sgd

/image {
  (\nIMAGE\n) //debugprint exec
  /FILTERIMAGE //pushnulldevice exec
  _image
  /FILTERIMAGE //popnulldevice exec
} bind def

/imagemask
{
  (\nIMAGEMASK\n) //debugprint exec
  /FILTERIMAGE //pushnulldevice exec
  _imagemask
  /FILTERIMAGE //popnulldevice exec
} bind def

/colorimage
{
  (\nCOLORIMAGE\n) //debugprint exec
  /FILTERIMAGE //pushnulldevice exec
  _colorimage
  /FILTERIMAGE //popnulldevice exec
} bind def

end
end

.bindnow

setglobal

Now run this command:

gs -o no-more-images-in-sample.pdf \
   -sDEVICE=pdfwrite               \
   -dFILTERIMAGE                   \
   -dDELAYBIND                     \
   -dWRITESYSTEMDICT               \
    remove-images.ps               \
    sample.pdf
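
The script's debugprint helper only prints when DUMPDEBUG is defined in systemdict, which is what Ghostscript's -d switch does. The following is a minimal sketch, not part of the original answer, that reruns the command above with -dDUMPDEBUG and counts how many image operators were intercepted. The function names run_with_debug and count_markers are hypothetical, and the sketch assumes gs is on the PATH, remove-images.ps is in the current directory, and your Ghostscript version still accepts this script.

```shell
# Hypothetical helper: run the command from this answer with -dDUMPDEBUG so
# that debugprint emits its IMAGE / IMAGEMASK / COLORIMAGE markers on stdout.
# Arguments: $1 = input PDF, $2 = output PDF.
run_with_debug() {
  gs -q -o "$2" \
     -sDEVICE=pdfwrite \
     -dFILTERIMAGE -dDELAYBIND -dWRITESYSTEMDICT \
     -dDUMPDEBUG \
     remove-images.ps "$1"
}

# Hypothetical helper: count the marker lines read from stdin.
# Each marker is printed as "\nNAME\n", so it sits alone on its own line.
count_markers() {
  grep -c -x -E 'IMAGE|IMAGEMASK|COLORIMAGE'
}

# Usage:
#   run_with_debug sample.pdf no-more-images-in-sample.pdf | count_markers
```

A count of zero on a file that visibly contains pictures suggests they are not drawn with image, imagemask or colorimage at all.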

I tested the code with the official PDF specification, and it worked. The following two screenshots show page 750 of the input and output PDFs:

If you wonder why something that looks like an image is still on the output page: it is not really a raster image but a 'pattern' in the original file, and therefore it is not removed.
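
To check the result without paging through the file, one option is to list the raster images that remain. The sketch below is an assumption on my part rather than part of the original answer: it relies on the pdfimages tool from Poppler being installed and on its -list output starting with two header lines (column names, then a row of dashes). The function names list_images and count_image_rows are hypothetical.

```shell
# Hypothetical helper: list the raster images Poppler finds in a PDF.
# Assumes pdfimages (from Poppler's utilities) is installed.
list_images() {
  pdfimages -list "$1"
}

# Hypothetical helper: count image rows in `pdfimages -list` output on stdin.
# Assumes the first two lines are headers and skips them.
count_image_rows() {
  tail -n +3 | grep -c .
}

# Usage:
#   list_images sample.pdf | count_image_rows
#   list_images no-more-images-in-sample.pdf | count_image_rows
```

Treat this as a sanity check only. How the tool reports content drawn through patterns may differ from what the PostScript script intercepts.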
