Extract an area from a PDF

Question

I want to extract an area, given by x-y coordinates, from a PDF page. The extracted area may be stored as a page in a new PDF document. This needs to be done several times, so I want to script the process. Are there any tools or libraries that can help with this?

Recommended answer

If iText (for Java) or iTextSharp (for .NET) is an acceptable library for you, you can use it to import an existing page from a PDF as a template, sections of which can then be displayed in another PDF.

Have a look at the example TilingHero.java / TilingHero.cs from chapter 6 of iText in Action, 2nd Edition. The central code is:

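// import page 1 of the source PDF once; it can then be drawn repeatedly
PdfImportedPage page = writer.getImportedPage(reader, 1);
// add the same page 16 times, each time with a different offset
float x, y;
for (int i = 0; i < 16; i++) {
    // column i % 4 and row i / 4 (integer division) of a 4 x 4 grid
    x = -pagesize.getWidth() * (i % 4);
    y = pagesize.getHeight() * (i / 4 - 3);
    // scale by 4 and shift by (x, y) so that a different part of the page is shown
    content.addTemplate(page, 4, 0, 0, 4, x, y);
    document.newPage();
}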

As you can see, the original page is imported once, and different sections of it are displayed on different pages.
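
The offsets in the loop above are a special case of a more general calculation: to show the area whose lower-left corner is (llx, lly), the imported page is scaled and then shifted so that this corner lands on the origin of the new page. The following is a minimal sketch of that arithmetic only, without any iText calls. The class name CropMatrix, the method matrixFor, and the 595 x 842 page size are assumptions made for illustration; they do not come from the book example.

```java
public class CropMatrix {

    /**
     * Returns the six values {a, b, c, d, e, f} of a transformation that
     * scales a page by "scale" and moves the point (llx, lly) of the
     * original page to the origin of the new page.
     */
    static float[] matrixFor(float llx, float lly, float scale) {
        return new float[] { scale, 0f, 0f, scale, -llx * scale, -lly * scale };
    }

    public static void main(String[] args) {
        // Assumed page size in points (A4); in real code this would come from the reader.
        float width = 595f, height = 842f;

        // Tile i = 5 of the 4 x 4 grid used in the loop above.
        int i = 5;
        float llx = width / 4 * (i % 4);       // left edge of the tile on the original page
        float lly = height / 4 * (3 - i / 4);  // bottom edge of the tile on the original page

        float[] m = matrixFor(llx, lly, 4f);
        System.out.println(java.util.Arrays.toString(m));
    }
}
```

With a scale of 1 this reduces to a plain crop: the six values could then be passed as the a, b, c, d, e, f arguments of content.addTemplate(page, ...), on a new document whose page size is set to the width and height of the wanted area.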

(iText and iTextSharp are available either for free, subject to the AGPL, or under a commercial license.)
