Extracting/reading text from a PDF with iText on Android


Question

I'm having a problem with iText. Other people say that iText is only for creating PDFs and that it cannot read or extract text from a PDF. Is that true?

If it is true, what other options do I have to extract text from a PDF file and either save it in a variable or display it on an Android device?

If iText is capable of extracting text from a PDF, how is it done?

Recommended answer

iText can extract text from PDFs. While it is true that it originated as a tool for creating new PDFs and manipulating existing ones, in recent years it has also become better and better at extracting text. This implies that you should use a current iText version (5.3.x) for text extraction.

The book "iText in Action, second edition" by the main iText developer, Bruno Lowagie, explains basic iText text extraction in chapter 15. The samples from that chapter are available in the iText Sourceforge SVN repository, cf. Samples for chapter 15. A good starting point is ExtractPageContentSorted2, which extracts the text of a whole page.
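As a rough illustration of whole-page extraction, here is a minimal sketch assuming the iText 5 library (package `com.itextpdf`) is on the classpath. The class name `ExtractTextSketch`, the method `extractAll`, and the way the file path is passed in are placeholders chosen for this example, not part of iText or of the book samples. On Android the path would typically point at a file your app is allowed to read.

```java
import com.itextpdf.text.pdf.PdfReader;
import com.itextpdf.text.pdf.parser.PdfTextExtractor;

import java.io.IOException;

// Hypothetical helper class for illustration only.
public class ExtractTextSketch {

    /** Returns the text of every page, with a newline after each page. */
    public static String extractAll(String pdfPath) throws IOException {
        PdfReader reader = new PdfReader(pdfPath);
        try {
            StringBuilder text = new StringBuilder();
            // iText page numbers start at 1, not 0.
            for (int page = 1; page <= reader.getNumberOfPages(); page++) {
                // Uses the extractor's default extraction strategy.
                text.append(PdfTextExtractor.getTextFromPage(reader, page));
                text.append('\n');
            }
            return text.toString();
        } finally {
            reader.close();
        }
    }

    public static void main(String[] args) throws IOException {
        // The PDF path is expected as the first command-line argument.
        String text = extractAll(args[0]);
        System.out.println(text);
    }
}
```

The returned `String` can be kept in a variable or, in an Android app, handed to a text view for display.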

If you have special requirements, you can use ExtractPageContentSorted1 as a starting point; it explicitly defines a text extraction strategy. Depending on your requirements, you will need your own strategy. If you only want the text from a specific region, look at ExtractPageContentArea.
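The following sketch shows one way an explicit strategy and a region filter can be combined in iText 5, along the lines of what the region sample does. The class name `ExtractRegionSketch`, the method `extractRegion`, and the rectangle coordinates are assumptions made up for this example; adjust the rectangle to the area you actually need. Coordinates are in PDF user-space units.

```java
import com.itextpdf.text.Rectangle;
import com.itextpdf.text.pdf.PdfReader;
import com.itextpdf.text.pdf.parser.FilteredTextRenderListener;
import com.itextpdf.text.pdf.parser.LocationTextExtractionStrategy;
import com.itextpdf.text.pdf.parser.PdfTextExtractor;
import com.itextpdf.text.pdf.parser.RegionTextRenderFilter;
import com.itextpdf.text.pdf.parser.RenderFilter;
import com.itextpdf.text.pdf.parser.TextExtractionStrategy;

import java.io.IOException;

// Hypothetical helper class for illustration only.
public class ExtractRegionSketch {

    /** Extracts only the text that lies inside the given rectangle on one page. */
    public static String extractRegion(String pdfPath, int page, Rectangle region)
            throws IOException {
        PdfReader reader = new PdfReader(pdfPath);
        try {
            // Keep only text chunks that fall inside the region.
            RenderFilter filter = new RegionTextRenderFilter(region);
            // Explicit strategy: order the surviving chunks by their position.
            TextExtractionStrategy strategy = new FilteredTextRenderListener(
                    new LocationTextExtractionStrategy(), filter);
            return PdfTextExtractor.getTextFromPage(reader, page, strategy);
        } finally {
            reader.close();
        }
    }

    public static void main(String[] args) throws IOException {
        // Example rectangle (lower-left x/y, upper-right x/y); values are arbitrary.
        Rectangle region = new Rectangle(70, 80, 490, 580);
        System.out.println(extractRegion(args[0], 1, region));
    }
}
```

Passing a plain `LocationTextExtractionStrategy` or `SimpleTextExtractionStrategy` as the third argument, without the filter, selects a strategy explicitly for the whole page.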

To really fine-tune the text extraction capabilities of iText, have a look at the itext-question mailing list archive (e.g. at nabble.com), as the iText text extraction API was recently extended to serve additional use cases.
