How to extract text from a PDF?

Views: 122 Published: 2020/5/25 3:46:28 Tags: pdf text ghostscript extraction text-extraction

Problem description

Can anyone recommend a library/API for extracting the text and images from a PDF? We need to be able to get at text that is contained in pre-known regions of the document, so the API will need to give us positional information of each element on the page.

We would like that data to be output in XML or JSON format. We're currently looking at PdfTextStream, which seems pretty good, but would like to hear other people's experiences and suggestions.

Is it possible to extract text from a PDF programmatically (commercial or free)?
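To make the requirement concrete, here is a minimal sketch of the kind of result we are after. It assumes some extraction library has already reported each piece of text with a page number and a bounding box; the `TextElement` shape, the function names, and the sample data are all hypothetical and do not come from PdfTextStream or any other real API. It only shows the region filter and the JSON output, not the PDF parsing itself.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class TextElement:
    """One positioned piece of text (hypothetical shape, not a real library type)."""
    page: int
    text: str
    x0: float  # left edge
    y0: float  # bottom edge
    x1: float  # right edge
    y1: float  # top edge


def elements_in_region(elements, page, region):
    """Return the elements on `page` whose bounding box lies fully inside `region`.

    `region` is (x0, y0, x1, y1) in the same coordinate system as the elements.
    """
    rx0, ry0, rx1, ry1 = region
    return [
        e for e in elements
        if e.page == page
        and e.x0 >= rx0 and e.y0 >= ry0
        and e.x1 <= rx1 and e.y1 <= ry1
    ]


def region_to_json(elements, page, region):
    """Serialize the text found in a known region, with positions, as JSON."""
    hits = elements_in_region(elements, page, region)
    payload = {
        "page": page,
        "region": list(region),
        "elements": [asdict(e) for e in hits],
    }
    return json.dumps(payload, indent=2)


# Hand-written sample data standing in for real extraction output.
SAMPLE = [
    TextElement(1, "Invoice No. 1234", 72.0, 720.0, 200.0, 734.0),
    TextElement(1, "Total: 99.00", 400.0, 100.0, 500.0, 114.0),
    TextElement(2, "Page two header", 72.0, 720.0, 180.0, 734.0),
]

if __name__ == "__main__":
    # Print everything found in a known region near the top left of page 1.
    print(region_to_json(SAMPLE, page=1, region=(50.0, 700.0, 300.0, 760.0)))
```

Any library that exposes per-element coordinates could feed a filter like this; whether the origin is at the top left or bottom left of the page depends on the library and would need checking.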
