Programmatic Reading of PDFs in C#
Question
I see many questions and answers about using C# to generate PDF files.
I have a related, but different task.
I have a large number of PDF files already created, and I would like to validate certain parts of the content with Regular Expressions (RegExs). I want to open the PDFs in C#, and be able to read out the text in something approaching a linear fashion.
If headers, footers, any sidebars, etc, get skipped or read out of order, it doesn't matter. I'm just after as much of the main-body text as I can retrieve.
Can you point me towards tools, libraries, APIs, etc., that will enable me to programmatically read text in PDF files?
Recommended answer
I used PDFSharp as recently as last autumn and found it very easy to use compared with the alternatives. Home page for PDFSharp.
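To make the suggestion concrete, here is a minimal sketch of how text might be pulled out of a PDF with PDFSharp and checked against a regular expression. It is not taken from the answer above. It assumes the PDFSharp content-stream classes (`ContentReader`, `CSequence`, `COperator`, `CString`) and collects only the string operands of the `Tj` and `TJ` text-showing operators. The class name `PdfTextCheck` and the invoice-number pattern are hypothetical, chosen purely for illustration. The strings come back as stored in the content stream, so PDFs whose fonts use custom or multi-byte encodings may yield unreadable text, and word spacing is not reconstructed.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Text.RegularExpressions;
using PdfSharp.Pdf;
using PdfSharp.Pdf.Content;
using PdfSharp.Pdf.Content.Objects;
using PdfSharp.Pdf.IO;

static class PdfTextCheck // hypothetical name
{
    // Recursively walk a parsed content stream and yield the string
    // operands of the text-showing operators Tj and TJ.
    static IEnumerable<string> ExtractText(CObject obj)
    {
        if (obj is COperator op)
        {
            if (op.OpCode.Name == "Tj" || op.OpCode.Name == "TJ")
            {
                foreach (CObject operand in op.Operands)
                    foreach (string s in ExtractText(operand))
                        yield return s;
            }
        }
        else if (obj is CSequence seq)
        {
            // Covers the top-level sequence and the arrays passed to TJ;
            // numeric kerning entries fall through and are ignored.
            foreach (CObject child in seq)
                foreach (string s in ExtractText(child))
                    yield return s;
        }
        else if (obj is CString str)
        {
            yield return str.Value;
        }
    }

    static void Main(string[] args)
    {
        string path = args[0]; // path to the PDF to validate

        // Hypothetical pattern; whitespace is optional because spacing
        // between text fragments is not reconstructed.
        var pattern = new Regex(@"Invoice\s*No\.?\s*\d+");

        var text = new StringBuilder();
        using (PdfDocument document = PdfReader.Open(path, PdfDocumentOpenMode.ReadOnly))
        {
            foreach (PdfPage page in document.Pages)
            {
                CSequence content = ContentReader.ReadContent(page);
                foreach (string fragment in ExtractText(content))
                    text.Append(fragment);
                text.AppendLine(); // one line of output per page
            }
        }

        Console.WriteLine(pattern.IsMatch(text.ToString()) ? "match" : "no match");
    }
}
```

Because fragments are concatenated in content-stream order, headers, footers, and sidebars may appear out of sequence, which the question says is acceptable. Patterns that tolerate missing or extra whitespace tend to be more robust with this approach.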