How to convert a PDF into an array of images with CarrierWave and MiniMagick (Ruby on Rails)
Question
I'm converting uploaded PDFs into images, one image per page. I have figured out how to generate the images using MiniMagick::Tool::Convert, but I don't know how to write the version block for the Uploader so that I can access an array of image URLs.
Here's my uploader so far:
class DocumentUploader < CarrierWave::Uploader::Base
  include CarrierWave::MiniMagick

  storage :file
  # storage :fog

  def store_dir
    "uploads/#{model.class.to_s.underscore}/#{mounted_as}/#{model.id}"
  end

  version :jpg do
    process :convert_to_images
    process :set_content_type_jpg

    def convert_to_images(*args)
      image = MiniMagick::Image.open(current_path)
      image.pages.each_with_index do |page, index|
        MiniMagick::Tool::Convert.new do |convert|
          convert.background 'white'
          convert.flatten
          convert.density 300
          convert.quality 95
          convert << page.path
          convert << "#{CarrierWave.root}/#{store_dir}/image-#{index}.jpg"
        end
      end
    end
  end

  def set_content_type_jpg(*args)
    self.file.instance_variable_set(:@content_type, "image/jpg")
  end

  # Add a white list of extensions which are allowed to be uploaded.
  def extension_white_list
    %w(jpg jpeg gif png doc docx pdf)
  end
end
This generates image-0.jpg, image-1.jpg, etc. in the correct directory. But now I have no way of referencing those images in my views, or even of knowing how many there are. This also won't work when I need to upload the images to S3. How can I get CarrierWave to handle the file storage for this collection of images, instead of a single image?
It also looks like I will probably need to add a new database column to store the number of pages. Is there a way to make my uploader return an array of image URLs, based on this count?
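For illustration, a count-based helper of the kind described here might look like the sketch below. It is not part of CarrierWave: page_image_urls is a hypothetical helper, base_url and page_count are assumed inputs (for example the uploader's store directory and the value of the new database column), and the image-N.jpg naming is taken from the uploader above.

```ruby
# Hypothetical helper, not a CarrierWave API: builds one URL per page
# from a stored page count. Assumes the files are named image-0.jpg,
# image-1.jpg, ... under base_url, as produced by convert_to_images.
def page_image_urls(base_url, page_count)
  Array.new(page_count) { |index| "#{base_url}/image-#{index}.jpg" }
end

# Example with an assumed directory and a three-page document.
urls = page_image_urls("/uploads/document/file/1", 3)
puts urls
```

This only addresses referencing the images in views; it does not make CarrierWave store the generated files, so the S3 concern remains.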
I'm also willing to switch to another gem. Is this something that would be easier with Paperclip, Shrine, or Refile?
Recommended answer
With Shrine you can make each page a different version:
class ImageUploader < Shrine
  plugin :versions
  plugin :processing

  process(:store) do |io, context|
    pdf = io.download
    versions = {}

    image = MiniMagick::Image.new(pdf.path)
    image.pages.each_with_index do |page, index|
      page_image = Tempfile.new("version-#{index}", binmode: true)
      MiniMagick::Tool::Convert.new do |convert|
        convert.background 'white'
        convert.flatten
        convert.density 300
        convert.quality 95
        convert << page.path
        convert << page_image.path
      end
      page_image.open # refresh updated file
      versions[:"page_#{index + 1}"] = page_image
    end

    versions
  end
end
Assuming you have a Document model and you attached a PDF to a file attachment field, you can then retrieve an array of pages using Hash#values:
pages = document.file.values
pages #=> [...array of pages...]
pages.count #=> number of pages
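Because the version keys follow the page_N pattern, the pages can also be turned into an ordered list of URLs. The following is a minimal sketch rather than Shrine's own API: FakeUploadedFile is a hypothetical stand-in for the uploaded-file objects (assumed only to respond to url), and page_urls is a hypothetical helper. Ruby hashes keep insertion order, so the explicit sort is a defensive step in case the hash was built in a different order.

```ruby
# Stand-in for an uploaded-file object, used only so this sketch runs
# without Shrine. The real objects are assumed to respond to #url.
FakeUploadedFile = Struct.new(:url)

# Hypothetical helper: returns the page URLs ordered by the number
# embedded in each :page_N key.
def page_urls(versions)
  versions
    .sort_by { |name, _file| name.to_s[/\d+/].to_i }
    .map { |_name, file| file.url }
end

# Example hash shaped like the versions built in the uploader above,
# deliberately out of order.
versions = {
  page_2: FakeUploadedFile.new("/uploads/b.jpg"),
  page_10: FakeUploadedFile.new("/uploads/c.jpg"),
  page_1: FakeUploadedFile.new("/uploads/a.jpg")
}

puts page_urls(versions)
```

In the application, the argument would be document.file instead of the example hash.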