Fastest possible grep
Problem Description
I'd like to know if there is any tip to make grep as fast as possible. I have a rather large base of text files to search in the quickest possible way. I've made them all lowercase, so that I could get rid of the -i option. This makes the search much faster.
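One way to do that lowercasing pass up front, as a rough sketch (the corpus/*.txt glob is just a placeholder for your own files; GNU coreutils tr is assumed):

# lowercase each file once so later searches can drop -i
for f in corpus/*.txt; do
    tr '[:upper:]' '[:lower:]' < "$f" > "$f.tmp" && mv "$f.tmp" "$f"
done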
Also, I've found out that the -F and -P modes are quicker than the default one. I use the former when the search string is not a regular expression (just plain text), and the latter if a regex is involved.
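A minimal illustration of the two modes (the pattern and log file name are made up for the example):

grep -F 'connection refused' app.log   # fixed-string match, no regex engine involved
grep -P 'refused after \d+ ms' app.log # PCRE match when a real regex is needed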
Does anyone have any experience in speeding up grep? Maybe compile it from scratch with some particular flags (I'm on Linux CentOS), organize the files in a certain fashion, or maybe make the search parallel in some way? Any tip is appreciated. Thanks.
Recommended Answer
Try GNU parallel, which includes an example of how to use it with grep (http://www.gnu.org/software/parallel/man.html#example__parallel_grep):
grep -r greps recursively through directories. On multicore CPUs, GNU parallel can often speed this up.
find . -type f | parallel -k -j150% -n 1000 -m grep -H -n STRING {}
This will run 1.5 jobs per core, and give 1000 arguments to grep.
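As a point of comparison, those numbers can be tuned; for instance, one job per core and fewer file names per grep invocation (the same command as above with only the -j and -n values changed):

find . -type f | parallel -k -j100% -n 500 -m grep -H -n STRING {}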
For big files, it can split the input into several chunks with the --pipe and --block arguments:
parallel --pipe --block 2M grep foo < bigfile
You could also run it on several different machines through SSH (ssh-agent needed to avoid passwords):
parallel --pipe --sshlogin server.example.com,server2.example.net grep foo < bigfile