grep -f alternative for huge files
Problem description
grep -F -f file1 file2
file1 is 90 MB (2.5 million lines, one word per line)
file2 is 45 GB
That command doesn't actually produce anything whatsoever, no matter how long I leave it running. Clearly, this is beyond grep's scope.
It seems grep can't handle that many queries from the -f option. However, the following command does produce the desired result:
head file1 > file3
grep -F -f file3 file2
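The head(1) test above suggests a workaround (not from the original post): split the pattern list into chunks of a size grep can cope with, run grep once per chunk, and deduplicate the combined matches. The sketch below uses tiny stand-in files in place of the real 90 MB / 45 GB ones, and one pattern per chunk where the real case would use something like 100,000.

```shell
# Stand-ins for the question's file1 (patterns) and file2 (corpus).
printf 'foo\nbar\n' > file1
printf 'foo line\nnothing here\nbar line\n' > file2
split -l 1 file1 chunk_              # real case: e.g. split -l 100000
for f in chunk_*; do
    grep -F -f "$f" file2
done | sort -u > matches             # a line may match several chunks
cat matches
```

This trades one huge pattern set for several grep passes over file2, so it is only worthwhile if the per-chunk runs actually complete.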
I have doubts about whether sed or awk would be appropriate alternatives either, given the file sizes.
I am at a loss for alternatives... please help. Is it worth it to learn some SQL commands? Is it easy? Can anyone point me in the right direction?
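Since the question raises awk: a commonly used alternative (not from the original post) is to load the word list into an awk hash and stream the big file once. Note this matches whole whitespace-separated fields, which is slightly stricter than grep -F's substring match. The file names and contents below are illustrative stand-ins.

```shell
# Stand-ins for file1 (word list) and file2 (the 45 GB corpus).
printf 'apple\nbanana\n' > words.txt
printf 'a ripe banana\nplain toast\n' > corpus.txt
# NR==FNR holds only while reading the first file: build the word set.
# For the second file, print a line as soon as any field is in the set.
awk 'NR==FNR { words[$0]; next }
     { for (i = 1; i <= NF; i++) if ($i in words) { print; next } }' \
    words.txt corpus.txt             # prints: a ripe banana
```

Because the word set lives in memory, a 90 MB word list fits easily, and the corpus is read exactly once regardless of how many words there are.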
Try using LC_ALL=C. It switches the search from UTF-8 to plain ASCII byte comparison, which sped things up by about 140x for me: a 26 GB file that would have taken around 12 hours finished in a couple of minutes. Source: Grepping a huge file (80GB) any way to speed it up?
So what I do is:
LC_ALL=C fgrep "pattern" <input >output
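The same locale trick also combines with the question's original -f invocation. A minimal sketch with tiny stand-in files (names are hypothetical), redirecting to an output file as in the answer:

```shell
printf 'needle\n' > patterns.txt
printf 'hay\na needle here\nmore hay\n' > big.txt
# LC_ALL=C applies to this command only: byte-wise matching, no UTF-8.
LC_ALL=C grep -F -f patterns.txt big.txt > matches.txt
cat matches.txt                      # prints: a needle here
```

grep -F is the portable spelling of fgrep, so this is the answer's command with the pattern list supplied via -f instead of on the command line.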