PDF document manipulation

Question

I have several PDFs with the following properties:

Each PDF contains a variable number of "documents" with differing number of pages.

Each page in a "document" has text such as "Page 3 of 26".

I want to be able to automatically identify the first and last page of each "document" within a PDF (Note: this is not the same as the first and last page of a PDF as each PDF may contain several "documents") and extract these into a new PDF for later printing and archival.

I'm not sure what tools I can bring to bear on this problem and what libraries are available to tackle this.

Any recommendations? Preferably free and can be used to create a tool that will run on Windows.

Recommended answer

Java has a nice free PDF library. Check out iText.

From the iText website:

You can use iText to:

  • Serve PDF to a browser
  • Generate dynamic documents from XML files or databases
  • Use PDF's many interactive features
  • Add bookmarks, page numbers, watermarks, etc.
  • Split, concatenate, and manipulate PDF pages
  • Automate filling out of PDF forms
  • Add digital signatures to a PDF file
  • And much more...

Since it's Java, there should be no issues running on Windows, or anywhere else for that matter.
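
The page-identification step from the question does not depend on any particular PDF library once the text of each page is available. Below is a minimal sketch of that step only, assuming the page text has already been extracted into a list of strings (for example with the PDF library's text-extraction facility). The class name `DocumentBoundaries` and the method `findBoundaries` are hypothetical names made up for this illustration and are not part of iText; the "Page X of Y" wording is taken from the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not part of iText): locates each "document" inside one PDF. */
final class DocumentBoundaries {

    // Matches labels such as "Page 3 of 26"; the exact wording is an assumption
    // based on the example given in the question.
    private static final Pattern PAGE_LABEL =
            Pattern.compile("Page\\s+(\\d+)\\s+of\\s+(\\d+)", Pattern.CASE_INSENSITIVE);

    /**
     * @param pageTexts extracted text of every PDF page, in order (index 0 is PDF page 1)
     * @return one int[]{firstPage, lastPage} per document, as 1-based PDF page numbers
     */
    static List<int[]> findBoundaries(List<String> pageTexts) {
        List<int[]> result = new ArrayList<>();
        int first = -1; // PDF page where the current document started; -1 if none is open

        for (int i = 0; i < pageTexts.size(); i++) {
            Matcher m = PAGE_LABEL.matcher(pageTexts.get(i));
            if (!m.find()) {
                continue; // page without a label: skip it
            }
            int current = Integer.parseInt(m.group(1));
            int total = Integer.parseInt(m.group(2));
            int pdfPage = i + 1;

            if (current == 1) {
                first = pdfPage; // "Page 1 of N" starts a new document
            }
            if (current == total && first != -1) {
                result.add(new int[] {first, pdfPage}); // "Page N of N" closes it
                first = -1;
            }
        }
        return result;
    }
}
```

Each returned pair gives the PDF page numbers of a document's first and last page. Those numbers could then be handed to whatever page-copying feature the chosen library offers (the "split, concatenate, and manipulate PDF pages" item above) to build the new PDF. A document whose last page is missing is silently ignored in this sketch; a real tool would probably want to report it.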
