Count lines in large files
Problem description
I commonly work with text files of about 20 GB, and I find myself counting the number of lines in a given file very often.
The way I do it now is just cat fname | wc -l, and it takes a very long time. Is there a solution that would be much faster?
I work on a high-performance cluster with Hadoop installed. I was wondering whether a MapReduce approach could help.
I'd like the solution to be as simple as a single-line command, like the wc -l solution, but I'm not sure how feasible that is.
Any ideas?
Recommended answer
Try: sed -n '$=' filename
Also, cat is unnecessary: with your current approach, wc -l filename is enough.
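To make the two suggestions concrete, here is a minimal sketch that runs both commands on a small throwaway file. The temporary file and the variable names (sample, wc_count, sed_count) are made up for this illustration; on a real 20 GB file you would simply substitute your own file name.

```shell
# Hypothetical three-line sample file, created only for this illustration.
sample=$(mktemp)
printf 'first\nsecond\nthird\n' > "$sample"

# wc can read the file itself, so piping through cat is not needed.
# Redirecting stdin keeps the file name out of wc's output.
wc_count=$(wc -l < "$sample")

# sed: -n suppresses normal output; on the last line ($), = prints its line number.
sed_count=$(sed -n '$=' "$sample")

echo "wc:  $wc_count"
echo "sed: $sed_count"

# Clean up the temporary file.
rm -f "$sample"
```

Both commands still read the whole file once, so this mainly removes the extra cat process and pipe rather than the I/O cost. One difference to keep in mind: wc -l counts newline characters, so a final line without a trailing newline is not counted, whereas sed -n '$=' does count it.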