Programmatic Reading of PDFs in C#


Question

I see many questions and answers about using C# to generate PDF files.
I have a related, but different task.

I have a large number of PDF files already created, and I would like to validate certain parts of the content with Regular Expressions (RegExs). I want to open the PDFs in C#, and be able to read out the text in something approaching a linear fashion.

If headers, footers, sidebars, etc., get skipped or read out of order, it doesn't matter. I'm just after as much of the main-body text as I can retrieve.

Can you point me towards tools, libraries, APIs, etc., that will enable me to programmatically read text in PDF files?

Recommended answer

I used PDFSharp as recently as last autumn and found it very easy to use compared with the alternatives. See the PDFSharp home page.
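Whichever library does the extraction, the validation step the question describes can be kept separate from it. The following is a minimal sketch of that step only, using the standard `System.Text.RegularExpressions` namespace. It assumes the text of a PDF has already been extracted into a string. The class name `PdfTextValidator`, the method `FindFailedRules`, and the rule names are hypothetical and not part of PDFSharp or any other library.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper: checks already-extracted PDF text against named regex rules.
public static class PdfTextValidator
{
    // Returns the names of the rules whose pattern matches nowhere in the text.
    public static List<string> FindFailedRules(string text, IDictionary<string, string> rules)
    {
        // Extracted PDF text often breaks lines at arbitrary points,
        // so collapse every run of whitespace into a single space first.
        string normalized = Regex.Replace(text, @"\s+", " ").Trim();

        var failed = new List<string>();
        foreach (KeyValuePair<string, string> rule in rules)
        {
            // rule.Key is a readable rule name, rule.Value is the regex pattern.
            if (!Regex.IsMatch(normalized, rule.Value, RegexOptions.IgnoreCase))
            {
                failed.Add(rule.Key);
            }
        }
        return failed;
    }
}
```

Called with the text `"Invoice No: INV-2023\n  Total due:\n 42.50 EUR"` and three rules (`Invoice No: INV-\d{4}`, `Total due: \d+\.\d{2} EUR`, `Signed by \w+`), only the third rule is reported as failed. Normalizing whitespace before matching is a design choice that suits the "approximately linear" text the question asks for, since the patterns then do not need to anticipate where the extractor happened to break lines.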
