Looking for a Linux PDF library to extract annotations and images from a PDF

Views: 100 Published: 2020/5/25 4:19:09 pdf

Problem description

I'm looking for a free library (Java/Ruby) that runs on Linux and can extract images and annotations from PDFs, similar to what CGPDFDocument can do on OS X.

Thanks!

Recommended answer

I don't know about images, but using the latest version of the Ruby pdf-reader library I was able to successfully extract the annotations from a large PDF file:

require 'pdf-reader'

# Print the /Contents text of every annotation, page by page.
PDF::Reader.open(filename) do |reader|
  reader.pages.each do |page|
    # /Annots may be absent, a direct array, or an indirect reference;
    # deref resolves a reference and returns anything else untouched.
    annots = reader.objects.deref(page.attributes[:Annots])
    next if annots.nil? || annots.empty?

    annots.each do |annot_ref|
      annot = reader.objects.deref(annot_ref)
      next if annot[:Contents].nil?

      puts "Page #{page.number}," + annot[:Contents].inspect
    end
  end
end

I imagine something similar could be done to extract images.
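As a rough sketch of that idea (not something the answer itself tested), the same page loop can walk each page's XObjects instead of its annotations. This assumes the pdf-reader gem's `page.xobjects` and the stream's `hash` and `data` accessors behave as they do in the gem's bundled image-extraction example. The helper names `jpeg_image?` and `extract_jpegs` are made up for illustration, and only images stored with a single `/DCTDecode` filter are handled, because their raw stream bytes are already a complete JPEG file.

```ruby
# True when an XObject stream is an image stored purely as JPEG (/DCTDecode).
# Every Ruby object responds to #hash, so check that it really is a Hash.
def jpeg_image?(stream)
  dict = stream.hash
  return false unless dict.is_a?(Hash)
  return false unless dict[:Subtype] == :Image

  # /Filter may be a single name or an array of names.
  Array(dict[:Filter]) == [:DCTDecode]
end

# Write every JPEG image found in the PDF into out_dir.
# Returns the list of file paths that were written.
def extract_jpegs(pdf_path, out_dir = '.')
  require 'pdf-reader' # loaded here so jpeg_image? is usable without the gem

  written = []
  PDF::Reader.open(pdf_path) do |reader|
    reader.pages.each do |page|
      page.xobjects.each do |name, ref|
        # Resolve an indirect reference; a direct stream is returned as-is.
        stream = reader.objects.deref(ref)
        next unless jpeg_image?(stream)

        path = File.join(out_dir, "page#{page.number}-#{name}.jpg")
        File.binwrite(path, stream.data) # raw bytes, still JPEG-encoded
        written << path
      end
    end
  end
  written
end
```

Calling `extract_jpegs(filename, 'images')` would then save one `.jpg` per matching image, provided the `images` directory already exists. Images stored with other filters (for example `/FlateDecode`) are skipped here; they hold raw pixel data and would need a header built from the stream's width, height and color space before an image viewer could open them.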
