Converting multi-page PDFs to several JPGs using ImageMagick and/or GhostScript


Question

I am trying to convert a multi-page PDF file into a bunch of JPEGs, one for each page in the PDF. I have spent hours and hours looking up how to do this, and eventually I discovered that I need Ghostscript installed. So I did that (from this website: http://downloads.ghostscript.com/public/, where I used the most recent link, "ghostscript-9.05.tar.gz", from Feb 8, 2012).

However, even with this installed/downloaded, I am still unable to do what I want. Should I have this saved somewhere special, like in the same folder as ImageMagick?

What I have figured out so far is this:


  • In Command Prompt I change the working directory to the folder where ImageMagick is saved.

Then I type:

convert "<full file path to pdf>" "<full file path to jpg>"


This is followed by a giant blob of errors. It begins with:

    Unrecoverable error: rangecheck in .setuserparams
    Operand stack:

Followed by a blurb of unreadable numbers and caps. It ends with:

    While reading gs_lev2.ps:
    %%[ Error: invalidaccess; OffendingCommand: put ]%%

Needless to say, after hours and hours of deliberation, I don't think I am any closer to doing the seemingly simple task of converting this PDF into a JPG.

What I would like are some step by step instructions on how to make this work. Don't leave out anything, no matter how "obvious" it might seem (especially anything involving ghostscript). This has been troubling me and my supervisor for months now.

For further clarification, we are on a Windows XP operating system. The eventual intention is to call these command lines in R, the statistical language, and run it in a script. In addition, I have been able to successfully convert JPGs to PNG format and vice versa, but PDF just is not working.

Help!!!

Recommended answer

You don't need ImageMagick for this; Ghostscript can do it all alone. (If you used ImageMagick, it couldn't do that conversion itself: it HAS to use Ghostscript as its 'delegate'.)
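
Either way, a Ghostscript executable has to be reachable from wherever the command is run. As a rough sketch (in Python, which the original post does not use; the helper name `find_ghostscript` and the list of candidate names are my own assumptions), a script could check whether one of the usual console executables is on the PATH before attempting a conversion:

```python
import shutil


def find_ghostscript(candidates=("gswin32c", "gswin64c", "gs")):
    """Return the full path of the first candidate found on PATH, else None.

    The candidate names are an assumption: gswin32c/gswin64c are the
    Windows console builds, gs is the usual name on other systems.
    """
    for name in candidates:
        path = shutil.which(name)  # searches PATH (and PATHEXT on Windows)
        if path:
            return path
    return None


gs_path = find_ghostscript()
if gs_path is None:
    print("No Ghostscript executable found on PATH")
else:
    print("Found Ghostscript at", gs_path)
```

If nothing is found, the full path to the executable can be given explicitly instead, in the `c:\path\to\gswin32c.exe` form used in this answer.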

Try this for directly using Ghostscript:

 c:\path\to\gswin32c.exe ^
   -o page_%03d.jpg ^
   -sDEVICE=jpeg ^
    d:/path/to/input.pdf

This will create a new JPEG for each page, and the filenames will increment as page_001.jpg, page_002.jpg,...

Note, this will also create JPEGs which use all the default settings of the jpeg device (one of the most important ones will be that the resolution will be 72dpi).
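
If the conversion is going to be launched from a script rather than typed by hand, the same command can be assembled as an argument list. This is only a sketch in Python under assumptions: the function names are hypothetical and the paths are placeholders like the ones above. Passing a list to `subprocess.run` bypasses the command shell, so the `%03d` in the output pattern needs no extra escaping (inside a .bat file it would have to be written as `%%03d`).

```python
import subprocess


def build_gs_command(gs_exe, pdf_path, out_pattern="page_%03d.jpg", device="jpeg"):
    """Build the Ghostscript argument list for one image per PDF page."""
    return [
        gs_exe,
        "-o", out_pattern,        # output file; %03d is replaced by the page number
        "-sDEVICE=" + device,     # output device, here the jpeg device
        pdf_path,
    ]


def expected_names(out_pattern, page_count):
    """List the file names the pattern yields for pages 1..page_count."""
    return [out_pattern % n for n in range(1, page_count + 1)]


def convert(gs_exe, pdf_path):
    """Run Ghostscript; raises CalledProcessError if it exits with an error."""
    subprocess.run(build_gs_command(gs_exe, pdf_path), check=True)


print(build_gs_command("gswin32c.exe", "d:/path/to/input.pdf"))
print(expected_names("page_%03d.jpg", 3))
# → ['page_001.jpg', 'page_002.jpg', 'page_003.jpg']
```

Calling `convert(r"c:\path\to\gswin32c.exe", "d:/path/to/input.pdf")` would then write the JPEGs into the current working directory.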

If you need a higher (or lower) resolution for your images, you can add other options:

 gswin32c.exe ^
   -o page_%03d.jpg ^
   -sDEVICE=jpeg ^
   -r300 ^
   -dJPEGQ=100 ^
    d:/path/to/input.pdf

-r300 sets the resolution to 300dpi and -dJPEGQ=100 sets the highest JPEG quality level (Ghostscript's default is 75).
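
To get a feel for what -r changes: PDF page sizes are measured in points of 1/72 inch, so the pixel dimensions of the output scale linearly with the resolution. Here is a small Python sketch of that arithmetic, assuming a US Letter page (612 x 792 points). The function name is hypothetical, and Ghostscript's own rounding may differ by a pixel.

```python
def pixel_size(width_pt, height_pt, dpi):
    """Approximate output size in pixels for a page given in points (1/72 inch)."""
    return round(width_pt * dpi / 72), round(height_pt * dpi / 72)


# US Letter is 8.5 x 11 inches, i.e. 612 x 792 points (assumed page size).
for dpi in (72, 150, 300):
    print(dpi, pixel_size(612, 792, dpi))
# → 72 (612, 792)
# → 150 (1275, 1650)
# → 300 (2550, 3300)
```

A 300dpi page therefore holds roughly 17 times as many pixels as the 72dpi default, which is worth keeping in mind for file size and conversion time.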

Also note, please: JPEG is not well suited to represent shapes with sharp edges and high contrast in good quality (such as you typically see in black-on-white text pages with small characters).

The (lossy) JPEG compression method is optimized for continuous-tone pictures + photos, and not for line graphics. Therefore it is sub-optimal for such PostScript or PDF input pages which mainly contain text. Here, the lossy compression of the JPEG format will result in poorer quality output even if the input is excellent. See also the JPEG FAQ for more details on this topic.

You may get better image output by choosing PNG as the output format (PNG uses a lossless compression):

 gswin32c.exe ^
   -o page_%03d.png ^
   -sDEVICE=png16m ^
   -r150 ^
    d:/path/to/input.pdf

The png16m device produces 24-bit RGB color. You could swap this for pnggray (for pure grayscale output), png256 (for 8-bit color), png16 (4-bit color), pngmono (black and white only) or pngmonod (alternative black-and-white module).
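
When the device name comes from a script parameter, it can help to check it against the list above before calling Ghostscript, since a typo would otherwise only surface as a Ghostscript error. A minimal Python sketch, assuming the six device names listed here; the dictionary, the function name and the 150dpi default (taken from the example command) are illustrative choices, not part of Ghostscript:

```python
# Device names and descriptions as listed in this answer.
PNG_DEVICES = {
    "png16m": "24-bit RGB color",
    "pnggray": "pure grayscale",
    "png256": "8-bit color",
    "png16": "4-bit color",
    "pngmono": "black and white only",
    "pngmonod": "alternative black-and-white module",
}


def png_command(gs_exe, pdf_path, device="png16m", resolution=150):
    """Build the argument list for PNG output, rejecting unknown device names."""
    if device not in PNG_DEVICES:
        raise ValueError("unknown PNG device: %r" % device)
    return [
        gs_exe,
        "-o", "page_%03d.png",
        "-sDEVICE=" + device,
        "-r%d" % resolution,
        pdf_path,
    ]


print(png_command("gswin32c.exe", "d:/path/to/input.pdf", device="pnggray"))
```

The resulting list can be handed to `subprocess.run(..., check=True)` to perform the conversion.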
