Search for an expression in a text file


Problem description





Hi,

I'm working on a piece of code that -
1. has a list of text files
2. and needs to search for a particular expression in these files

(this is being done on Linux using gcc 3.4.2)

Currently the search is done using the 'grep' utility on Linux. This
takes too much time, and kills the responsiveness of the application.

Is there any function in C or maybe any other way to get this job done
faster?

Please note that I've provided the operating system and compiler info,
just for the sake of it. Some people on this group don't like this and
ask to re-post on an appropriate group for Linux. I've still done this
because I never receive a good answer on any other group.

Thanks,
Ritesh

Recommended answer

[You should read this with a neutral tone. There's
more to becoming a good programmer than
writing code!]

ritesh <riteshkap...@gmail.com> wrote:




> Hi,
>
> I'm working on a piece of code that -
> 1. has a list of text files
> 2. and needs to search for a particular expression in these
> files




Do you have a question about the C language?


> (this is being done on Linux using gcc 3.4.2)
>
> Currently the search is done using the 'grep' utility on
> Linux. This takes too much time, and kills the
> responsiveness of the application.
>
> Is there any function in C or maybe any other way to get
> this job done faster?




Most grep implementations are based on publicly available
regex libraries. There is no standard function beyond strstr(),
which is not likely to be as optimised as tailored libraries.

You may even try tools like [f]lex to generate lexical analysers.
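
As a rough illustration of the regex-library route, here is a minimal sketch that calls the POSIX <regex.h> functions (regcomp, regexec, regfree) in-process instead of spawning grep. POSIX regex is not part of standard C. The helper names count_matching_lines and search_files are hypothetical, and the sketch assumes line-oriented matching with a 4096-byte line buffer.

```c
#include <assert.h>
#include <regex.h>
#include <stdio.h>
#include <string.h>

/* Count the lines of `path` that match the already-compiled regex `re`.
 * Returns -1 if the file cannot be opened. Lines longer than the buffer
 * are examined in pieces, so counts can be off for such lines. */
long count_matching_lines(const char *path, const regex_t *re)
{
    FILE *fp;
    char line[4096];
    long hits = 0;

    assert(path != NULL && re != NULL);
    fp = fopen(path, "r");
    if (fp == NULL)
        return -1;
    while (fgets(line, sizeof line, fp) != NULL) {
        line[strcspn(line, "\n")] = '\0';   /* drop the newline so `$` works */
        if (regexec(re, line, 0, NULL, 0) == 0)
            hits++;
    }
    fclose(fp);
    return hits;
}

/* Compile `pattern` once and reuse it for every file in `paths`.
 * Returns the total number of matching lines, or -1 if the pattern does
 * not compile. Files that cannot be opened are skipped. */
long search_files(const char *pattern, const char *const *paths, size_t n)
{
    regex_t re;
    long total = 0;
    size_t i;

    assert(pattern != NULL);
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;
    for (i = 0; i < n; i++) {
        long hits = count_matching_lines(paths[i], &re);
        if (hits > 0)
            total += hits;
    }
    regfree(&re);
    return total;
}
```

Compiling the pattern once and reusing it across the whole file list avoids starting an external process per search. Whether process start-up is the real bottleneck in the application would need to be measured.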


> Please note that I've provided the operating system and
> compiler info, just for the sake of it.




Hint: If these are at all relevant, there's a very good chance
your post is off-topic in clc.


> Some people on this
> group don't like this and ask to re-post on an appropriate
> group for Linux.




Because other groups can answer platform-specific questions
much better, and clc isn't particularly interested in becoming
yet another high noise-to-signal dumping ground for anything
and everything where main is a valid identifier.


> I've still done this
> because I never receive a good answer on any other group.




That doesn't mean that clc has to rectify your problem.

Have you tried simple web/code searches? Looking for C
source on grep or find/replace should give you ample
sources.

--
Peter



Two points I missed out -

1. The text files are of random form - they don't contain records or
any ordered sequence of characters.

2. The list of text files may go up to 10K files. So I'm assuming that
opening each file using the C file I/O is not a good way to handle
this.
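
For reference, the plain C file I/O approach in point 2 could look like the minimal sketch below. It assumes the expression is a fixed string rather than a regex. The helper names slurp_file and file_contains are hypothetical. It reads each file into memory once and scans it with strstr(); it makes no claim about how this compares in speed with grep over 10K files.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Read the whole file at `path` into a NUL-terminated heap buffer.
 * Returns NULL on failure; the caller frees the result. */
char *slurp_file(const char *path)
{
    FILE *fp;
    char *buf = NULL;
    long size;

    assert(path != NULL);
    fp = fopen(path, "rb");
    if (fp == NULL)
        return NULL;
    if (fseek(fp, 0L, SEEK_END) == 0 && (size = ftell(fp)) >= 0) {
        rewind(fp);
        buf = malloc((size_t)size + 1);
        if (buf != NULL) {
            size_t got = fread(buf, 1, (size_t)size, fp);
            buf[got] = '\0';   /* terminate after the bytes actually read */
        }
    }
    fclose(fp);
    return buf;
}

/* Return 1 if `needle` occurs in the file, 0 if not, -1 if unreadable.
 * strstr() stops at the first NUL byte, so this suits text files only. */
int file_contains(const char *path, const char *needle)
{
    char *text = slurp_file(path);
    int found;

    if (text == NULL)
        return -1;
    found = strstr(text, needle) != NULL;
    free(text);
    return found;
}
```

Calling file_contains() in a loop over the file list keeps everything in one process. Every file is still read in full on each search, which is the cost an index is meant to avoid.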



"ritesh" <ri**********@gmail.com> wrote in message
news:11**********************@t38g2000prd.googlegroups.com...

> Two points I missed out -
>
> 1. The text files are of random form - they don't contain records or
> any ordered sequence of characters.
>
> 2. The list of text files may go up to 10K files. So I'm assuming that
> opening each file using the C file I/O is not a good way to handle
> this.




<OT>
The efficient way to do this would be to index your text files. Search for
"inverted index" on the web for more info. More generally, your question is
in the field of "information retrieval" for which there is also a (not very
active) newsgroup: comp.theory.info-retrieval.
</OT>

--
Jonas
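
To make the "inverted index" suggestion concrete, here is a toy sketch in C. It is not a production design. It assumes whole-word, case-insensitive lookups, at most 64 documents (one bit each) and a fixed-size hash table. All names (struct index, index_add_text, index_query) are hypothetical. A word index like this answers word queries only, not arbitrary regular expressions.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

#define MAX_WORD   32     /* longer words are truncated */
#define TABLE_SIZE 4096   /* fixed capacity, must be a power of two */
#define MAX_DOCS   64     /* one bit per document in the mask */

struct posting {
    char word[MAX_WORD];
    unsigned long long docs;   /* bit i set => word occurs in document i */
};

/* Must start zero-filled, e.g. allocated with calloc(). */
struct index {
    struct posting slots[TABLE_SIZE];
};

static unsigned hash_word(const char *w)
{
    unsigned h = 5381;
    while (*w)
        h = h * 33u + (unsigned char)*w++;
    return h & (TABLE_SIZE - 1);
}

/* Linear probing: find the slot for `word`, claiming an empty one if
 * `create` is non-zero. Returns NULL if absent or if the table is full. */
static struct posting *lookup(struct index *ix, const char *word, int create)
{
    unsigned i = hash_word(word);
    unsigned probes;

    for (probes = 0; probes < TABLE_SIZE; probes++) {
        struct posting *p = &ix->slots[i];
        if (p->word[0] == '\0') {
            if (!create)
                return NULL;
            strcpy(p->word, word);
            return p;
        }
        if (strcmp(p->word, word) == 0)
            return p;
        i = (i + 1) & (TABLE_SIZE - 1);
    }
    return NULL;
}

/* Split `text` into lower-cased alphanumeric words and record each one
 * as occurring in document number `doc`. */
void index_add_text(struct index *ix, int doc, const char *text)
{
    char word[MAX_WORD];
    size_t len = 0;

    assert(ix != NULL && text != NULL && doc >= 0 && doc < MAX_DOCS);
    for (;; text++) {
        unsigned char c = (unsigned char)*text;
        if (isalnum(c)) {
            if (len < MAX_WORD - 1)
                word[len++] = (char)tolower(c);
            continue;
        }
        if (len > 0) {                       /* a word just ended */
            struct posting *p;
            word[len] = '\0';
            p = lookup(ix, word, 1);
            if (p != NULL)
                p->docs |= 1ULL << doc;
            len = 0;
        }
        if (c == '\0')
            break;
    }
}

/* Return the bitmask of documents containing the lower-case `word`. */
unsigned long long index_query(struct index *ix, const char *word)
{
    struct posting *p = lookup(ix, word, 0);
    return p != NULL ? p->docs : 0;
}
```

The files are scanned once when the index is built. After that each query is a hash lookup rather than a pass over every file. Handling 10K files would need a larger document set than a 64-bit mask, for example a growable list of file numbers per word.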

