How to search for particular content inside any document

Question

Many documents have been uploaded to my web page. I have already implemented search by name and by type to retrieve a particular document. How can I search inside the documents' content, by word?

Recommended answer

Searching the content of any document? Good luck. I have worked for a company that specialized in document indexing and archiving. Reliably finding text strings in arbitrary documents is not easy, because there are no rules governing how someone formats their particular type of document.

You could try simply going through the files and looking for matching text strings. To reduce scanning time, you could even build an index for common search terms. This simple approach will fail if some of the documents use an uncommon encoding (such as IBM's EBCDIC) or some kind of compression.
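
As a rough illustration of that simple approach, here is a minimal Python sketch that walks a folder, reads each file as UTF-8 text, and builds a word-to-files index. The function names (`tokenize`, `build_index`, `search`) and the UTF-8 assumption are mine, not part of the original answer, and the sketch only handles plain text.

```python
import re
from collections import defaultdict
from pathlib import Path


def tokenize(text):
    """Lowercase the text and split it into words."""
    return re.findall(r"\w+", text.lower())


def build_index(root):
    """Map each word to the set of files under `root` that contain it."""
    index = defaultdict(set)
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        try:
            # Assumption: documents are plain UTF-8 text.
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # unreadable or not UTF-8: silently skipped
        for word in tokenize(text):
            index[word].add(str(path))
    return index


def search(index, query):
    """Return the files that contain every word of `query`."""
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result
```

Calling `search(build_index("uploads"), "invoice march")` would return the paths of files containing both words, where `uploads` is a placeholder folder name. Note the weakness described above: a file stored in EBCDIC (Python's `cp037` codec) or in a compressed format will usually fail to decode as UTF-8, so it is skipped and never shows up in results.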

If you came up with a solution that quickly and reliably searches through any document, you would be rich.

Things become a little easier if we can narrow down the types of documents to as few as possible. Still, you would need some kind of parser to be able to import each type and examine its text contents. For simple .TXT files this is easy, but most file formats are far more sophisticated (.PDF, for example).
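
One way to organize the "one parser per type" idea is a small registry keyed by file extension. The sketch below is only an assumption about how this could be structured: `PARSERS`, `parse_txt`, `extract_text`, and `contains` are hypothetical names, and only the easy .TXT case is implemented. Any other format would need its own parser function added to the registry.

```python
from pathlib import Path


def parse_txt(path):
    """Plain text: decode as UTF-8, replacing bytes that cannot be decoded."""
    return Path(path).read_text(encoding="utf-8", errors="replace")


# Hypothetical registry: file extension -> function returning the text.
# Formats such as .pdf need a dedicated parser and are left out here.
PARSERS = {
    ".txt": parse_txt,
}


def extract_text(path):
    """Return the document's text, or None if no parser is registered."""
    parser = PARSERS.get(Path(path).suffix.lower())
    return parser(path) if parser else None


def contains(path, needle):
    """Case-insensitive check for `needle` in a supported document."""
    text = extract_text(path)
    return text is not None and needle.lower() in text.lower()
```

Returning `None` for unsupported types keeps "not searchable" distinct from "searched, no match", which matters when reporting results to users.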

