Creating array populated by images from a PDF using PHP and ImageMagick

Question

I'm trying to write a routine that will take a PDF submitted by a user, and extract each page as an image and then populate an array with those images. I've found several examples that append all pages to one image, but none that do what I need.

This is what I have, but it returns an empty array:

function PdfToImg($pdf_in) {
    $img_array = array();
    $im = new imagick();
    $im->readimageblob($pdf_in); // reading image from binary string
    $num_pages = $im->getnumberimages();
    $im->setimageformat("png");

    for ($x = 1; $x <= $num_pages; $x++) {
        $img = $im->previousimage();
        $img_array .= $img;
    }
    return $img_array;
}

One of the caveats here is that I can't write these files to disk, must use strings/arrays. I looked through the ImageMagick manual, and didn't find anything about outputting multiple images to an array, only to a series of files saved to disk.

UPDATE (06/13/2012): I found a way to achieve what I need, but it's ugly, inefficient, and I'm sure slow. There didn't seem to be any other way.

function PdfToImg3($pdf_in) {
    $img_array = array();
    $im = new imagick();
    $im->readimageblob($pdf_in);
    $num_pages = $im->getnumberimages();
    $i = 0;
    for($x = 1;$x <= $num_pages; $x++) {
        $im = new imagick();
        $im->readimageblob($pdf_in);
        $im->setiteratorindex($i);
        $im->setimageformat('png');
        $img_array[$x] = $im->getimageblob();
        $im->destroy();
        $i++;
    }
    $im->destroy();
    return $img_array;
}

This produces an array named $img_array, in which each key holds one page of the incoming PDF as a string of PNG image data.

There MUST be a better way. Why won't nextImage() work? Why can't I use setIteratorIndex without reinitializing (creating new?) imagick objects each time? I must be missing something, but there are gaping holes in the documentation, and neither Google, the ImageMagick forums, nor StackOverflow seems to know of this being done successfully.

TESTED: Extremely slow, a 17 page simple PDF takes almost a minute.

UPDATE 2: (07/11/2012) After finishing the larger project that this code bit went into, I decided to return to a few points and improve upon the performance. This is what I came up with:

    $img_array = array();
    $im = new imagick();
    $im->readimageblob($pdf_in);
    $num_pages = $im->getnumberimages();
    $im->destroy();
    $i = 0;
    for($x = 1;$x <= $num_pages; $x++) {
        $im = new imagick();
        $im->readimageblob($pdf_in);
        $im->setResolution(300,300);
        $im->setiteratorindex($i);
        $im->setimageformat('png');
        $img_array[$x] = $im->getimageblob();
        $im->destroy();
        $i++;
    }
    return $img_array;

This change resulted in a 4 page complex PDF conversion going from 21-25 seconds down to about 2-3 seconds. I understand why some of the changes helped, not so clear on the others. Hopefully someone will find this useful.

UPDATE 3: Figured out why performance went up so much: moving 'setResolution' to below 'readImageBlob' causes the DPI setting to be ignored, so the resolution falls back to the default of 72. With that in mind, I moved the call back above the read and reduced it to 150, which achieved similar results with still much better performance. See the notes on php.net.
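The arithmetic behind that DPI observation can be sketched as follows. This is only an illustration: rasterSize is a hypothetical helper, and the 8.5 x 11 inch (US Letter) page size is an assumption, since the question does not say what size the PDFs are. It also assumes rendering cost grows roughly with the number of pixels, which the question did not measure directly.

```php
<?php
// Pixel dimensions of a page rasterized at a given DPI.
// Assumption: page size is given in inches; rasterSize is a hypothetical helper.
function rasterSize(float $widthIn, float $heightIn, int $dpi): array
{
    return [(int) round($widthIn * $dpi), (int) round($heightIn * $dpi)];
}

// Compare the three resolutions mentioned in the updates (72 default, 150, 300).
foreach ([72, 150, 300] as $dpi) {
    [$w, $h] = rasterSize(8.5, 11.0, $dpi);
    printf("%3d DPI: %4d x %4d px = %.1f megapixels\n", $dpi, $w, $h, $w * $h / 1e6);
}
```

Under these assumptions a page is 612 x 792 px at 72 DPI, 1275 x 1650 px at 150 DPI, and 2550 x 3300 px at 300 DPI, so 300 DPI produces roughly 17 times as many pixels per page as the 72 DPI default, and 150 DPI roughly 4 times as many.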

Recommended answer

Reading and destroying the blob on every iteration is probably what slows this down so much. In fact, the repeated reads are not needed at all; the stripped-down code looks like this:

$img_array = array();
$im = new imagick();
$im->setResolution(150,150);
$im->readImageBlob($pdf_in);
$num_pages = $im->getNumberImages();
for ($i = 0; $i < $num_pages; $i++)
{
    $im->setIteratorIndex($i);
    $im->setImageFormat('jpeg');
    $img_array[$i] = $im->getImageBlob();
}
$im->destroy();
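For reuse, the same idea could be wrapped in a function. The following is a minimal sketch, not the answerer's code: the name pdfToImages, the $dpi and $format parameters, and the 'upload.pdf' placeholder are all hypothetical. It relies on the Imagick class implementing PHP's Iterator interface, so foreach can step through the pages instead of calling setIteratorIndex by hand, and it assumes the Imagick extension and a PDF delegate such as Ghostscript are installed.

```php
<?php
/**
 * Sketch: render each page of an in-memory PDF to an image blob.
 * pdfToImages, $dpi and $format are illustrative names (assumptions).
 */
function pdfToImages(string $pdfBlob, int $dpi = 150, string $format = 'png'): array
{
    $im = new Imagick();
    $im->setResolution($dpi, $dpi);   // set before reading, or the default DPI is used
    $im->readImageBlob($pdfBlob);     // read the whole PDF once

    $pages = [];
    // Imagick implements Iterator, so foreach visits each page in turn.
    foreach ($im as $index => $page) {
        $page->setImageFormat($format);
        $pages[$index] = $page->getImageBlob();   // one binary string per page
    }

    $im->clear();                     // release the image resources
    return $pages;
}

// Usage sketch; 'upload.pdf' stands in for however the PDF bytes actually arrive.
if (extension_loaded('imagick') && is_readable('upload.pdf')) {
    $images = pdfToImages(file_get_contents('upload.pdf'));
    printf("%d page(s) rendered\n", count($images));
}
```

Nothing is written to disk here: the function takes the PDF as a string and returns an array of strings, keyed by zero-based page index.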
