How to programmatically search a PDF document in C#
Question
I need to search a PDF file to see whether a certain string is present. The string in question is definitely encoded as text (i.e. it is not an image or anything similar). I have tried searching the file as though it were plain text, but this does not work.
Is it possible to do this? Are there any libraries for .NET 2.0 that will extract/decode all the text from a PDF file for me?
Recommended answer
There are a few libraries available. Check out http://www.codeproject.com/KB/cs/PDFToText.aspx and http://itextsharp.sourceforge.net/
It takes a little effort, but it's possible.
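As a rough illustration, here is a minimal sketch of the iTextSharp route. It assumes an iTextSharp 5.x build, whose `iTextSharp.text.pdf.parser` namespace provides `PdfTextExtractor`; older releases may not include it. The class name `PdfSearch` and its two methods are hypothetical names chosen for this example, not part of the library. The sketch extracts the text of each page and runs a case-insensitive substring test on it. Searching the raw file bytes tends to fail because PDF page content is commonly stored in compressed streams, so the text has to be decoded first.

```csharp
using System;
using iTextSharp.text.pdf;
using iTextSharp.text.pdf.parser;

// Hypothetical helper class for this example; not part of iTextSharp.
public static class PdfSearch
{
    // Case-insensitive substring test on text that has already been extracted.
    public static bool TextContains(string text, string needle)
    {
        if (string.IsNullOrEmpty(text) || string.IsNullOrEmpty(needle))
            return false;
        return text.IndexOf(needle, StringComparison.OrdinalIgnoreCase) >= 0;
    }

    // Returns true if any single page of the PDF contains the search string.
    public static bool PdfContains(string path, string needle)
    {
        PdfReader reader = new PdfReader(path);
        try
        {
            // iTextSharp numbers pages starting at 1.
            for (int page = 1; page <= reader.NumberOfPages; page++)
            {
                string pageText = PdfTextExtractor.GetTextFromPage(reader, page);
                if (TextContains(pageText, needle))
                    return true;
            }
            return false;
        }
        finally
        {
            // Release the file handle even if extraction throws.
            reader.Close();
        }
    }
}
```

A call might look like `bool found = PdfSearch.PdfContains(@"C:\docs\report.pdf", "invoice");`, where the path and search term are placeholders. Because the search runs page by page on extracted text, a phrase that is split across a line break or a page boundary may be missed; normalizing whitespace before comparing is one way to reduce that.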