C#: searching in PDF files


Question


Hi,

I have thousands of PDF files on one server, and I want to develop an internal web application that searches for words inside each file.

A week ago I wrote a DLL for this (using iTextSharp), but it takes a long time to return a response (approximately 15 minutes).

What can I do to improve the response time? Any ideas?

Thanks

PS: I actually have 12,000 PDF files, and the URL of each file is stored in a SQL Server database.

Solution

I think you need to create an index of all the words in your files so that searches are fast. You could build it up front by scanning all the files, or build it up gradually as each search is processed. Unfortunately, the second option means that early searches will still be quite slow.
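As a rough illustration of the indexing idea, here is a minimal sketch of an in-memory inverted index (a map from each word to the ids of the files that contain it). The class name `PdfWordIndex` and its methods are hypothetical, not part of iTextSharp or of the asker's code. It assumes the text of each PDF has already been extracted once (for example with iTextSharp) and that each file has an integer id, such as the key of its row in the SQL Server table.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper: maps each word to the ids of the files containing it.
public sealed class PdfWordIndex
{
    // Case-insensitive so that "Report" and "report" share one entry.
    private readonly Dictionary<string, HashSet<int>> _postings =
        new Dictionary<string, HashSet<int>>(StringComparer.OrdinalIgnoreCase);

    // Call once per file, passing text that was already extracted from the PDF.
    public void AddDocument(int fileId, string extractedText)
    {
        foreach (Match word in Regex.Matches(extractedText, @"\w+"))
        {
            HashSet<int> ids;
            if (!_postings.TryGetValue(word.Value, out ids))
            {
                ids = new HashSet<int>();
                _postings[word.Value] = ids;
            }
            ids.Add(fileId);
        }
    }

    // Returns the sorted ids of files that contain every word in the query.
    public int[] Search(string query)
    {
        HashSet<int> result = null;
        foreach (Match word in Regex.Matches(query, @"\w+"))
        {
            HashSet<int> ids;
            if (!_postings.TryGetValue(word.Value, out ids))
                return new int[0];            // one word is missing everywhere

            if (result == null)
                result = new HashSet<int>(ids);
            else
                result.IntersectWith(ids);    // keep files that have all words so far
        }
        return result == null ? new int[0] : result.OrderBy(id => id).ToArray();
    }
}
```

With this sketch, a search is a handful of dictionary lookups instead of a scan over every PDF: after `AddDocument(1, "Annual report 2010")` and `AddDocument(2, "Annual budget")`, calling `Search("annual report")` returns only file 1. For 12,000 files the index would probably need to be persisted (for instance in a database table) rather than rebuilt in memory on every start; that part is left out here.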

