How to convert a PDF into an array of images, with Carrierwave and MiniMagick (Ruby on Rails)

Question

I'm converting uploaded PDFs into images, with one image per page. I have figured out how to generate the images using MiniMagick::Tool::Convert, but I don't know how to write the version block for the Uploader, so that I can access an array of image URLs.

Here's my uploader so far:

class DocumentUploader < CarrierWave::Uploader::Base
  include CarrierWave::MiniMagick

  storage :file
  # storage :fog

  def store_dir
    "uploads/#{model.class.to_s.underscore}/#{mounted_as}/#{model.id}"
  end

  version :jpg do
    process :convert_to_images
    process :set_content_type_jpg

    def convert_to_images(*args)
      image = MiniMagick::Image.open(current_path)
      image.pages.each_with_index do |page, index|
        MiniMagick::Tool::Convert.new do |convert|
          convert.background 'white'
          convert.flatten
          convert.density 300
          convert.quality 95
          convert << page.path
          convert << "#{CarrierWave.root}/#{store_dir}/image-#{index}.jpg"
        end
      end
    end
  end

  def set_content_type_jpg(*args)
    # "image/jpeg" is the registered MIME type for JPEG files
    self.file.instance_variable_set(:@content_type, "image/jpeg")
  end

  # Add a white list of extensions which are allowed to be uploaded.
  def extension_white_list
    %w(jpg jpeg gif png doc docx pdf)
  end
end

This generates image-0.jpg, image-1.jpg, etc. in the correct directory. But now I have no way of referencing those images in my views, or even knowing how many there are. This will also not work when I need to upload the images to S3. How can I get Carrierwave to handle the file storage for this collection of images, instead of a single image?

It also looks like I will probably need to add a new database column to store the number of pages. Is there a way to make my uploader return an array of image URLs, based on this count?

I'm also willing to switch to another gem. Is this something that would be easier with Paperclip, Shrine, or Refile?

Recommended answer

With Shrine you can make each page a different version:

class ImageUploader < Shrine
  plugin :versions
  plugin :processing

  process(:store) do |io, context|
    pdf      = io.download
    versions = {}

    image = MiniMagick::Image.new(pdf.path)
    image.pages.each_with_index do |page, index|
      page_image = Tempfile.new("version-#{index}", binmode: true)
      MiniMagick::Tool::Convert.new do |convert|
        convert.background 'white'
        convert.flatten
        convert.density 300
        convert.quality 95
        convert << page.path
        convert << page_image.path
      end
      page_image.open # refresh updated file
      versions[:"page_#{index + 1}"] = page_image
    end

    versions
  end
end

Assuming you have a Document model and you attached a PDF to a file attachment field, you can then retrieve an array of pages using Hash#values:

pages = document.file.values
pages #=> [...array of pages...]
pages.count #=> number of pages
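
If the goal is an array of image URLs for the view, one possible approach is to map the versions hash to URLs. The following is a minimal sketch, not part of the original answer: PageFile is a hypothetical stand-in for the uploaded file objects (it assumes each one responds to url), and page_urls is an illustrative helper name. It sorts by the numeric suffix of the page_N keys used above, so the order does not depend on how the hash was built:

```ruby
# Hypothetical stand-in for an uploaded file object; only #url is used here.
PageFile = Struct.new(:url)

# Returns the page URLs ordered by page number, given a hash shaped like
# { page_1: file, page_2: file, ... } (the key naming used in the answer).
def page_urls(versions)
  versions
    .sort_by { |name, _file| name.to_s[/\d+\z/].to_i } # numeric, so page_10 follows page_2
    .map { |_name, file| file.url }
end

# Example data, deliberately out of order.
versions = {
  page_2: PageFile.new("/uploads/page-2.jpg"),
  page_1: PageFile.new("/uploads/page-1.jpg"),
  page_10: PageFile.new("/uploads/page-10.jpg"),
}

urls = page_urls(versions)
puts urls
```

With a real attachment, the same helper would be called with the versions hash (document.file in the example above), provided its values respond to url.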
