Sorting by unique values of multiple fields in UNIX shell script

Views: 154 Published: 2020/7/21 3:58:08 Tags: c unix sorting unique field

Question

I am new to unix and would like to be able to do the following but am unsure how.

Take a text file with lines like:

TR=P567;dir=o;day=su;TI=12:10;stn=westborough;Line=worcester
TR=P567;dir=o;day=su;TI=12:10;stn=westborough;Line=lowell
TR=P567;dir=o;day=su;TI=12:10;stn=westborough;Line=worcester
TR=P234;dir=o;day=su;TI=12:10;stn=westborough;Line=lowell
TR=P234;dir=o;day=su;TI=12:10;stn=westborough;Line=lowell
TR=P234;dir=o;day=su;TI=12:10;stn=westborough;Line=worcester

And output:

TR=P567;dir=o;day=su;TI=12:10;stn=westborough;Line=worcester
TR=P567;dir=o;day=su;TI=12:10;stn=westborough;Line=lowell
TR=P234;dir=o;day=su;TI=12:10;stn=westborough;Line=lowell
TR=P234;dir=o;day=su;TI=12:10;stn=westborough;Line=worcester

I would like the script to find all the lines for each TR value that have a unique Line value.

Thanks.

Recommended answer

Since you are apparently O.K. with randomly choosing among the values for dir, day, TI, and stn, you can write:

sort -u -t ';' -k 1,1 -k 6,6 -s < input_file > output_file
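
To try the command above on the sample data from the question, a minimal sketch follows. The helper name `dedupe_by_tr_and_line`, the temporary file, and the `LC_ALL=C` setting are assumptions added for illustration and are not part of the original answer. The `-s` option is not specified by POSIX, so this assumes a sort implementation that supports it, such as GNU sort.

```shell
#!/bin/sh
# Hypothetical helper; the name is an assumption made for this sketch.
# Keeps one line per distinct (field 1, field 6) pair from the file in $1.
dedupe_by_tr_and_line() {
    # LC_ALL=C is an added assumption: byte-wise comparison, so the
    # ordering does not depend on the user's locale.
    LC_ALL=C sort -u -t ';' -k 1,1 -k 6,6 -s < "$1"
}

# Sample data from the question, written to a temporary file.
input_file=$(mktemp)
cat > "$input_file" <<'EOF'
TR=P567;dir=o;day=su;TI=12:10;stn=westborough;Line=worcester
TR=P567;dir=o;day=su;TI=12:10;stn=westborough;Line=lowell
TR=P567;dir=o;day=su;TI=12:10;stn=westborough;Line=worcester
TR=P234;dir=o;day=su;TI=12:10;stn=westborough;Line=lowell
TR=P234;dir=o;day=su;TI=12:10;stn=westborough;Line=lowell
TR=P234;dir=o;day=su;TI=12:10;stn=westborough;Line=worcester
EOF

dedupe_by_tr_and_line "$input_file"
rm -f "$input_file"
```

Four lines remain, one per TR/Line pair. Because sort orders its output by the keys, the P234 lines come before the P567 lines, so the order differs from the order listed in the question.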

Explanation:

The sort utility, "sort lines of text files", lets you sort/compare/merge lines from files. (See the GNU Coreutils documentation.)
The -u or --unique option, "output only the first of an equal run", tells sort that if two input-lines are equal, then you only want one of them.
The -k POS1[,POS2] or --key=POS1[,POS2] option, "start a key at POS1 (origin 1), end it at POS2 (default end of line)", tells sort where the "keys" are that we want to sort by. In our case, -k 1,1 means that one key consists of the first field (from field 1 through field 1), and -k 6,6 means that one key consists of the sixth field (from field 6 through field 6).
The -t SEP or --field-separator=SEP option tells sort that we want to use SEP — in our case, ';' — to separate and count fields. (Otherwise, it would think that fields are separated by whitespace, and in our case, it would treat the entire line as a single field.)
The -s or --stable option, "stabilize sort by disabling last-resort comparison", tells sort that we only want to compare lines in the way that we've specified; if two lines have the same above-defined "keys", then they're considered equivalent, even if they differ in other respects. Since we're using -u, that means that one of them will be discarded. (If we weren't using -u, it would just mean that sort wouldn't reorder them with respect to each other.)
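
The difference between comparing whole lines and comparing only the chosen keys can be seen in a small sketch. The two-line sample below is made up for illustration (it is not from the question): both lines share the same TR and Line values but differ in dir. The variable names and `LC_ALL=C` are likewise assumptions.

```shell
#!/bin/sh
# Illustrative data (not from the question): same TR and Line, different dir.
sample=$(mktemp)
cat > "$sample" <<'EOF'
TR=P567;dir=o;day=su;TI=12:10;stn=westborough;Line=worcester
TR=P567;dir=i;day=su;TI=12:10;stn=westborough;Line=worcester
EOF

# Whole-line comparison: the lines differ in dir, so both survive.
# (tr strips the padding that some wc implementations add.)
whole_line_count=$(LC_ALL=C sort -u < "$sample" | wc -l | tr -d ' ')

# Key comparison on fields 1 and 6 only: the lines count as equal,
# so only one of them survives.
keyed_count=$(LC_ALL=C sort -u -t ';' -k 1,1 -k 6,6 -s < "$sample" | wc -l | tr -d ' ')

echo "whole-line: $whole_line_count, keyed: $keyed_count"
rm -f "$sample"
```

This prints `whole-line: 2, keyed: 1`: a plain `sort -u` would keep both lines, while the keyed form treats them as duplicates.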
