Extract a region of a PDF page by coordinates
Question
I am looking for a tool that extracts a given rectangular region (specified by coordinates) from a 1-page PDF file and produces a 1-page PDF file containing only that region:
# file.pdf is a 1-page pdf file
extract file.pdf 0 0 100 100 > out.pdf
# out.pdf is now a 1-page pdf file with a page of size 100x100
# it contains the region (0, 0) to (100, 100) of file.pdf
I could convert the PDF to an image and crop it with convert, but the resulting PDF would no longer be vector-based, which is not acceptable (I want to be able to zoom).
Ideally, I would like to perform this task with a command-line tool or a Python library.
Thanks!
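Recommended answer
The following script, found at http://snipplr.com/view.php?codeview&id=18924, splits each page of a PDF into two halves (left and right). Each box is given as four coordinates (lower-left x, lower-left y, upper-right x, upper-right y), so the same calls can be adapted to crop an arbitrary region by setting those four values directly.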
#!/usr/bin/env perl
use strict; use warnings;
use PDF::API2;

my $filename = shift;
my $oldpdf = PDF::API2->open($filename);
my $newpdf = PDF::API2->new;

for my $page_nb (1..$oldpdf->pages) {
    my ($page, @cropdata);

    # Left half: import the page, then move the right edge to the midpoint.
    $page = $newpdf->importpage($oldpdf, $page_nb);
    @cropdata = $page->get_mediabox;    # (llx, lly, urx, ury)
    $cropdata[2] = ($cropdata[0] + $cropdata[2]) / 2;
    $page->cropbox(@cropdata);
    $page->trimbox(@cropdata);
    $page->mediabox(@cropdata);

    # Right half: import the same page again, then move the left edge to the midpoint.
    $page = $newpdf->importpage($oldpdf, $page_nb);
    @cropdata = $page->get_mediabox;
    $cropdata[0] = ($cropdata[0] + $cropdata[2]) / 2;
    $page->cropbox(@cropdata);
    $page->trimbox(@cropdata);
    $page->mediabox(@cropdata);
}

# Write next to the input, e.g. file.pdf -> file.clean.pdf
(my $newfilename = $filename) =~ s/(.*)\.(\w+)$/$1.clean.$2/;
$newpdf->saveas($newfilename);
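Since the question asks for a Python library, the same box-setting idea can be sketched in Python. This is a minimal sketch, not a tested tool: it assumes the third-party pypdf package is installed, and the names clamp_region, split_halves and extract_region are hypothetical helpers introduced here for illustration. Coordinates are in PDF points with the origin at the lower-left corner of the page.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (llx, lly, urx, ury) in PDF points


def clamp_region(mediabox: Box, region: Box) -> Box:
    """Intersect the requested region with the page's media box."""
    mx0, my0, mx1, my1 = mediabox
    # Normalise so that (x0, y0) is the lower-left corner.
    x0, x1 = sorted((region[0], region[2]))
    y0, y1 = sorted((region[1], region[3]))
    x0, y0 = max(x0, mx0), max(y0, my0)
    x1, y1 = min(x1, mx1), min(y1, my1)
    if x0 >= x1 or y0 >= y1:
        raise ValueError("region does not overlap the page")
    return (x0, y0, x1, y1)


def split_halves(mediabox: Box) -> Tuple[Box, Box]:
    """Left and right halves of a page, split at the horizontal midpoint."""
    x0, y0, x1, y1 = mediabox
    mid = (x0 + x1) / 2
    return (x0, y0, mid, y1), (mid, y0, x1, y1)


def extract_region(src: str, dst: str, region: Box) -> None:
    """Write a one-page PDF showing only `region` of the first page of `src`."""
    # Assumption: pypdf is installed. It is imported lazily so the pure
    # helpers above can be used without it.
    from pypdf import PdfReader, PdfWriter
    from pypdf.generic import RectangleObject

    page = PdfReader(src).pages[0]
    mb = page.mediabox
    box = clamp_region(
        (float(mb.left), float(mb.bottom), float(mb.right), float(mb.top)),
        region,
    )
    # Set all three boxes, as the Perl script does.
    for name in ("mediabox", "cropbox", "trimbox"):
        setattr(page, name, RectangleObject(box))

    writer = PdfWriter()
    writer.add_page(page)
    with open(dst, "wb") as fh:
        writer.write(fh)
```

With these helpers, extract_region("file.pdf", "out.pdf", (0, 0, 100, 100)) would correspond to the command line in the question. split_halves mirrors the left/right split of the Perl script, computing the midpoint from both x coordinates so it also works when the media box does not start at 0. Note that changing the page boxes only changes what is displayed: the content stays vector-based, but anything outside the region remains in the file.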