Search for an expression in a text file
Problem description
Hi,
I'm working on a piece of code that -
1. has a list of text files
2. and needs to search for a particular expression in these files
(this is being done on Linux using gcc 3.4.2)
Currently the search is done using the 'grep' utility on Linux. This
takes too much time, and kills the responsiveness of the application.
Is there any function in C or maybe any other way to get this job done
faster?
Please note that I've provided the operating system and compiler info
just for the sake of it. Some people on this group don't like this and
ask to re-post on an appropriate group for Linux. I've still done this
because I never receive a good answer on any other group.
Thanks,
Ritesh
Recommended answer
[You should read this with a neutral tone. There's
more to becoming a good programmer than
writing code!]
ritesh <riteshkap...@gmail.com> wrote:
> Hi,
> I'm working on a piece of code that -
> 1. has a list of text files
> 2. and needs to search for a particular expression in these
> files
Do you have a question about the C language?
> (this is being done on Linux using gcc 3.4.2)
> Currently the search is done using the 'grep' utility on
> Linux. This takes too much time, and kills the
> responsiveness of the application.
> Is there any function in C or maybe any other way to get
> this job done faster?
Most grep implementations are based on publicly available
regex libraries. There is no standard C function for this beyond
strstr(), which is not likely to be as optimised as tailored libraries.
You may even try tools like [f]lex to generate lexical analysers.
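For reference, a minimal sketch of the strstr() approach mentioned above. It only finds a fixed substring, not a grep-style pattern, and the function name count_matching_lines and the 4096-byte buffer are illustrative assumptions, not anything from the original posts.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: count the lines of an open stream that contain
   the fixed string `needle`.
   Simplification: a line longer than the buffer is read in pieces, so a
   match that straddles a piece boundary is missed, and one long line can
   be counted more than once. */
static long count_matching_lines(FILE *fp, const char *needle)
{
    char line[4096];
    long hits = 0;

    while (fgets(line, sizeof line, fp) != NULL) {
        if (strstr(line, needle) != NULL)
            hits++;
    }
    return hits;
}
```

A caller would fopen() each file in the list, pass the stream and the search string to the function, and fclose() it afterwards. Whether this beats spawning grep depends on the data and the pattern, so it is worth measuring rather than assuming.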
> Please note that I've provided the operating system and
> compiler info, just for the sake of it.
Hint: If these are at all relevant, there's a very good chance
your post is off-topic in clc.
> Some people on this
> group don't like this and ask to re-post on an appropriate
> group for Linux.
Because other groups can answer platform-specific questions
much better, and clc isn't particularly interested in becoming
yet another high noise-to-signal dumping ground for anything
and everything where main is a valid identifier.
> I've still done this
> because I never receive a good answer on any other group.
That doesn't mean that clc has to rectify your problem.
Have you tried simple web/code searches? Looking for C
source on grep or find/replace should give you ample
sources.
--
Peter
Two points I missed out -
1. The text files are of random form - they don't contain records or
any ordered sequence of characters.
2. The list of text files may go up to 10K files. So I'm assuming that
opening each file using the C file I/O is not a good way to handle
this.
"ritesh" <ri**********@gmail.com> wrote in message
news:11**********************@t38g2000prd.googlegroups.com...
> Two points I missed out -
> 1. The text files are of random form - they don't contain records or
> any ordered sequence of characters.
> 2. The list of text files may go up to 10K files. So I'm assuming that
> opening each file using the C file I/O is not a good way to handle
> this.
<OT>
The efficient way to do this would be to index your text files. Search for
"inverted index" on the web for more info. More generally, your question is
in the field of "information retrieval" for which there is also a (not very
active) newsgroup: comp.theory.info-retrieval.
</OT>
--
Jonas
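To make the "inverted index" suggestion concrete, here is a rough in-memory sketch in C: each word maps to the list of document ids that contain it, so a lookup touches the index instead of rescanning every file. All names (struct entry, lookup, add_posting, index_text), the bucket count and the word-length limit are assumptions made for illustration. Reading the files into memory and saving the index to disk are left out, and the sketch only handles whole-word queries, not arbitrary expressions.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

#define NBUCKETS 4096   /* assumed hash table size */
#define MAXWORD  64     /* longer words are truncated */

/* One index entry: a word and the ids of the documents containing it. */
struct entry {
    char *word;
    int *docs;
    size_t ndocs, cap;
    struct entry *next;
};

static struct entry *buckets[NBUCKETS];

static unsigned long hash(const char *s)
{
    unsigned long h = 5381;
    while (*s)
        h = h * 33 + (unsigned char)*s++;
    return h % NBUCKETS;
}

/* Return the entry for `word`, or NULL if it was never indexed. */
static struct entry *lookup(const char *word)
{
    struct entry *e;
    for (e = buckets[hash(word)]; e != NULL; e = e->next)
        if (strcmp(e->word, word) == 0)
            return e;
    return NULL;
}

/* Record that document `doc` contains `word`.
   Assumes documents are indexed one after another, so a duplicate can
   only be the last id stored. Returns 0 on success, -1 if out of memory. */
static int add_posting(const char *word, int doc)
{
    struct entry *e = lookup(word);
    if (e == NULL) {
        unsigned long h = hash(word);
        e = calloc(1, sizeof *e);
        if (e == NULL)
            return -1;
        e->word = malloc(strlen(word) + 1);
        if (e->word == NULL) {
            free(e);
            return -1;
        }
        strcpy(e->word, word);
        e->next = buckets[h];
        buckets[h] = e;
    }
    if (e->ndocs > 0 && e->docs[e->ndocs - 1] == doc)
        return 0;                       /* already recorded */
    if (e->ndocs == e->cap) {
        size_t ncap = e->cap ? e->cap * 2 : 4;
        int *p = realloc(e->docs, ncap * sizeof *p);
        if (p == NULL)
            return -1;
        e->docs = p;
        e->cap = ncap;
    }
    e->docs[e->ndocs++] = doc;
    return 0;
}

/* Split `text` into lower-cased alphanumeric words and index each one. */
static int index_text(const char *text, int doc)
{
    char word[MAXWORD];
    size_t n = 0;
    for (;; text++) {
        unsigned char c = (unsigned char)*text;
        if (isalnum(c)) {
            if (n < MAXWORD - 1)
                word[n++] = (char)tolower(c);
        } else {
            if (n > 0) {
                word[n] = '\0';
                if (add_posting(word, doc) != 0)
                    return -1;
                n = 0;
            }
            if (c == '\0')
                break;
        }
    }
    return 0;
}
```

With the index built once over all files, lookup("word") returns the matching document ids directly. The trade-off is the up-front indexing cost and the need to re-index files when they change.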