Extract a region of a PDF page by coordinates


Question

I am looking for a tool to extract a given rectangular region (by coordinates) of a 1-page PDF file and produce a 1-page PDF file with the specified region:

# file.pdf is a 1-page pdf file
extract file.pdf 0 0 100 100 > out.pdf
# out.pdf is now a 1-page pdf file with a page of size 100x100
# it contains the region (0, 0) to (100, 100) of file.pdf

I could convert the PDF to an image and crop it with convert, but then the resulting PDF would no longer be a vector graphic, which is not acceptable (I want to be able to zoom in).

I would ideally like to perform this task with a command-line tool or a Python library.

Thanks!

Recommended answer

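The following script, found at http://snipplr.com/view.php?codeview&id=18924, splits each page of a PDF into two halves (left and right) by changing the page boxes. The same box calls accept arbitrary coordinates, so the script can be adapted to keep any rectangle.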

#!/usr/bin/env perl
use strict; use warnings;
use PDF::API2;

my $filename = shift;
my $oldpdf = PDF::API2->open($filename);
my $newpdf = PDF::API2->new;

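for my $page_nb (1..$oldpdf->pages) {
  my ($page, @cropdata);

  # Left half: import the page, then move the right edge to the midpoint.
  $page = $newpdf->importpage($oldpdf, $page_nb);
  @cropdata = $page->get_mediabox;  # (llx, lly, urx, ury) in points
  $cropdata[2] = ($cropdata[0] + $cropdata[2]) / 2;
  $page->cropbox(@cropdata);
  $page->trimbox(@cropdata);
  $page->mediabox(@cropdata);

  # Right half: import the same page again, then move the left edge to the midpoint.
  $page = $newpdf->importpage($oldpdf, $page_nb);
  @cropdata = $page->get_mediabox;
  $cropdata[0] = ($cropdata[0] + $cropdata[2]) / 2;
  $page->cropbox(@cropdata);
  $page->trimbox(@cropdata);
  $page->mediabox(@cropdata);
}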

# Save next to the input file, e.g. file.pdf -> file.clean.pdf
(my $newfilename = $filename) =~ s/(.*)\.(\w+)$/$1.clean.$2/;
$newpdf->saveas($newfilename);
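
Since the question also mentions a Python library, the same box-based approach could be sketched in Python roughly as follows. This is only an illustration, not part of the original answer: it assumes the third-party pypdf package is installed, and the names clamp_box and extract_region are made up for this example. Coordinates are in PDF points (1/72 inch), normally measured from the lower-left corner of the page. As with the Perl script, the content outside the box is only hidden, not deleted from the file, and the result stays vector-based.

```python
def clamp_box(box, media):
    """Clip a requested (llx, lly, urx, ury) rectangle to the page's media box."""
    llx, lly, urx, ury = box
    m_llx, m_lly, m_urx, m_ury = media
    # Normalise so that (llx, lly) really is the lower-left corner.
    llx, urx = sorted((llx, urx))
    lly, ury = sorted((lly, ury))
    clipped = (max(llx, m_llx), max(lly, m_lly), min(urx, m_urx), min(ury, m_ury))
    if clipped[0] >= clipped[2] or clipped[1] >= clipped[3]:
        raise ValueError("requested region does not overlap the page")
    return clipped


def extract_region(src, dst, box):
    """Write a 1-page PDF showing only `box` of the first page of `src`."""
    # Assumption: pypdf is available (pip install pypdf); imported here so that
    # clamp_box can be used and tested without it.
    from pypdf import PdfReader, PdfWriter
    from pypdf.generic import RectangleObject

    page = PdfReader(src).pages[0]
    media = tuple(float(v) for v in page.mediabox)
    clipped = clamp_box(box, media)

    # Set the same boxes the Perl script sets: media, crop and trim box.
    page.mediabox = RectangleObject(clipped)
    page.cropbox = RectangleObject(clipped)
    page.trimbox = RectangleObject(clipped)

    writer = PdfWriter()
    writer.add_page(page)
    with open(dst, "wb") as fh:
        writer.write(fh)
```

With this sketch, the example from the question would correspond to calling extract_region("file.pdf", "out.pdf", (0, 0, 100, 100)). Pages that carry a rotation entry may need their coordinates adjusted first; that case is not handled here.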
