In Ruby: search a PDF, highlight the text found, export a JPG of the page

Question

I wanted to see if anyone has done this.

In Ruby, I'd like to open a PDF and search it for text. I would like to highlight any text I find in yellow, then return the page(s) where I found the text as a JPG. Has anyone done this before?

Thanks, Craig

Recommended answer

If you're happy to use a C extension, you can achieve this with the ruby-gnome2 bindings. You'll need the poppler and gdk_pixbuf2 gems.

The API docs for these gems are a little skimpy, but you can find what there is at http://ruby-gnome2.sourceforge.jp/

require 'poppler'
require 'gdk_pixbuf2'

SCALE = 2

filename = "source.pdf"
doc = Poppler::Document.new(filename)
page = doc.get_page(0)

# render the page to an in-memory buffer
width, height = *page.size
buf = Gdk::Pixbuf.new(Gdk::Pixbuf::COLORSPACE_RGB, true, 8, width*SCALE, height*SCALE)
page.render(0, 0, width*SCALE, height*SCALE, SCALE, 0, buf)

# copy the rendered buffer into a pixmap for further editing
map = Gdk::Pixmap.new(nil, width*SCALE, height*SCALE, 24)
map.draw_pixbuf(nil, buf, 0, 0, 0, 0, -1, -1, Gdk::RGB::DITHER_NONE, 0, 0)

# setup highlight color and blend function
gc  = Gdk::GC.new(map) # graphics context
gc.rgb_fg_color = Gdk::Color.new(65535, 65535, 0)
gc.function = Gdk::GC::AND

# find each match and highlight it. The co-ordinate maths is ugly but
# necessary to convert from PDF co-ords (origin at the bottom-left) to
# Pixmap co-ords (origin at the top-left)
page.find_text("the").each do |match|
  matchx = match.x1 * SCALE
  matchy = (height - match.y2) * SCALE
  matchw = (match.x2-match.x1) * SCALE
  matchh = (match.y2-match.y1) * SCALE
  map.draw_rectangle(gc, true, matchx, matchy, matchw, matchh)
end

# save the buffer to a JPG
newbuf = Gdk::Pixbuf.from_drawable(nil, map, 0, 0, width*SCALE, height*SCALE)
newbuf.save("foo.jpg", "jpeg")
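
The co-ordinate conversion inside the loop above can be pulled out into a small pure-Ruby helper, which makes it easier to check by hand. This is only a sketch: `pdf_match_to_pixels` and `Rect` are hypothetical names, not part of poppler or gdk_pixbuf2, and it assumes (as the code above does) that match co-ordinates are measured from the bottom-left of the page, with `page_height` and `scale` passed in by the caller.

```ruby
# Hypothetical helper; not part of the poppler or gdk_pixbuf2 gems.
# Rectangle in pixel co-ordinates, origin at the top-left.
Rect = Struct.new(:x, :y, :width, :height)

# Converts a match rectangle given in PDF points (origin assumed at the
# bottom-left of the page) to pixel co-ordinates on the scaled image.
def pdf_match_to_pixels(x1, y1, x2, y2, page_height, scale)
  left    = [x1, x2].min
  top_pdf = [y1, y2].max # the larger y is the top edge in PDF space
  Rect.new(
    (left * scale).round,
    ((page_height - top_pdf) * scale).round, # flip the y axis
    ((x2 - x1).abs * scale).round,
    ((y2 - y1).abs * scale).round
  )
end

# Example: a match on a 792pt-high (US Letter) page, rendered at 2x.
rect = pdf_match_to_pixels(72.0, 700.0, 100.0, 712.0, 792.0, 2)
puts rect.to_a.inspect
```

Rounding to integers and taking min/max of the corners are additions for this sketch; the original loop passes the scaled values straight to draw_rectangle.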
