Calling uniq and sort in different orders in shell

Published: 2020/5/21 21:08:38. Tags: optimization, shell, performance, sorting, uniq


Does the order of uniq and sort make a difference when calling them in a shell script? I am asking here about both time and space.

grep 'somePattern' | uniq | sort

vs.

grep 'somePattern' | sort | uniq

A quick test on a text file with 140k lines showed a slight speed improvement (5.5 s vs. 5.0 s) for the first method (get the unique values, then sort).

I don't know how to measure memory usage, though.
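One possible way to measure it, as a minimal sketch: GNU time, when installed at /usr/bin/time (typical on Linux, and an assumption here), prints a "Maximum resident set size" line with the -v flag. The helper name peak_rss_kb is hypothetical. Because the pipeline runs through sh -c, the reported figure is the peak of the largest single process in it, not the sum over all stages.

```shell
# Sketch: report the peak resident set size (in kilobytes) of a pipeline.
# Assumes GNU time at /usr/bin/time; the shell's built-in `time` keyword
# reports only elapsed and CPU time, not memory.
peak_rss_kb() {
    # $1: the pipeline to measure, passed as one string and run via sh -c.
    # stderr (where GNU time writes its report) goes into the pipe,
    # while the pipeline's own stdout is discarded.
    /usr/bin/time -v sh -c "$1" 2>&1 >/dev/null |
        awk -F': ' '/Maximum resident set size/ { print $2 }'
}
```

It could then be called as peak_rss_kb "grep 'somePattern' input.txt | sort | uniq", where input.txt stands in for the real file.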

The question now is: does the order make a difference? Or does it depend on the lines returned by grep (many or few duplicates)?

The only correct order is to call uniq after sort, since the man page for uniq says:

Discard all but one of successive identical lines from INPUT (or standard input), writing to OUTPUT (or standard output).

So it should be:

grep 'somePattern' | sort | uniq
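
To see why the order matters, here is a minimal sketch. The three-line sample input is made up for illustration; it contains a duplicate line that is not adjacent to its twin.

```shell
# Three lines where the duplicate "b" entries are not adjacent.
input='b
a
b'

# uniq first: no two neighbouring lines are identical, so nothing is removed.
uniq_then_sort=$(printf '%s\n' "$input" | uniq | sort)

# sort first: the duplicates become adjacent, so uniq collapses them.
sort_then_uniq=$(printf '%s\n' "$input" | sort | uniq)

printf 'uniq | sort:\n%s\n\n' "$uniq_then_sort"
printf 'sort | uniq:\n%s\n' "$sort_then_uniq"
```

The first pipeline prints a, b, b and the second prints a, b. The two orders can therefore produce different output, which means their timings do not compare like for like. With no sort keys involved, sort -u gives the same result as sort | uniq in a single command.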
