grep -f alternative for huge files


Question


grep -F -f file1 file2

file1 is 90 MB (2.5 million lines, one word per line)

file2 is 45 GB

That command doesn't actually produce anything whatsoever, no matter how long I leave it running. Clearly, this is beyond grep's scope.

It seems grep can't handle that many queries from the -f option. However, the following command does produce the desired result:

head file1 > file3
grep -F -f file3 file2
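
Since a truncated pattern list does work, one workaround (a sketch of my own, not from the original thread) is to split file1 into chunks small enough for grep to handle and run it once per chunk:

# Split the 2.5M-line pattern file into chunks grep can digest; the
# 100000-line chunk size is a guess and worth tuning for your machine.
split -l 100000 file1 file1.part.
for f in file1.part.*; do
  grep -F -f "$f" file2
done > output

A line of file2 that matches patterns from more than one chunk will be printed once per chunk, so pipe the loop through sort -u if duplicates matter.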

I have doubts about whether sed or awk would be appropriate alternatives either, given the file sizes.
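
For reference, the usual awk approach hashes the words instead of feeding grep millions of patterns; a minimal sketch, assuming a match means a whole whitespace-separated field of file2 appears in file1 (substring matches would need different handling):

# Read file1 first (NR == FNR), remembering each word as a hash key;
# then print any line of file2 whose fields include one of those words.
awk 'NR == FNR { words[$0]; next }
     { for (i = 1; i <= NF; i++) if ($i in words) { print; next } }' file1 file2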

I am at a loss for alternatives... please help. Is it worth it to learn some SQL commands? Is it easy? Can anyone point me in the right direction?

Solution

Try using LC_ALL=C. It switches pattern matching from UTF-8 to plain ASCII, which sped up the search by about 140 times over the original speed. I have a 26 GB file that used to take around 12 hours; it now finishes in a couple of minutes. Source: Grepping a huge file (80GB) any way to speed it up?

So what I do is:

LC_ALL=C fgrep "pattern" <input >output
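
Applied to the pattern-file invocation from the question, the same locale trick would presumably look like this (my extrapolation, not part of the quoted answer; fgrep and grep -F are equivalent):

# Assumed combination of the locale fix with the original -f command.
LC_ALL=C grep -F -f file1 file2 > output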
