awk to remove duplicate rows totally based on a particular column value

Views: 192 | Published: 2020/7/8 11:22:31 | Tags: sorting, awk, uniq

Question

I have a dataset like this:

6   AA_A_56_30018678_E  0   30018678    P   A
6   SNP_A_30018678  0   30018678    A   G
6   SNP_A_30018679  0   30018679    T   G
6   SNP_A_30018682  0   30018682    T   G
6   SNP_A_30018695  0   30018695    G   C
6   AA_A_62_30018696_Q  0   30018696    P   A
6   AA_A_62_30018696_G  0   30018696    P   A
6   AA_A_62_30018696_R  0   30018696    P   A

I want to remove every row whose column 4 value appears more than once.

I used the commands below (sort, awk, uniq, and join) to get the required output. Is there a better way to do this?

sort -k4,4 example.txt | awk '{print $4}' | uniq -u  > snp_sort.txt

join -1 1 -2 4 snp_sort.txt example.txt | awk '{print $3,$5,$6,$1}' > uniq.txt

This is the output:

SNP_A_30018679  T   G   30018679
SNP_A_30018682  T   G   30018682
SNP_A_30018695  G   C   30018695

Recommended answer

Use command substitution: first extract the fourth-field values that occur exactly once, then grep the input file for those values.

grep "$(awk '{print $4}' inputfile.txt | sort | uniq -u)" inputfile.txt
6   SNP_A_30018679  0   30018679    T   G
6   SNP_A_30018682  0   30018682    T   G
6   SNP_A_30018695  0   30018695    G   C
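
One caveat with the grep approach is that a pattern can match anywhere on a line, not only in column 4. As a possible alternative (a sketch, not the answerer's method), awk can read the file twice: the first pass counts each column 4 value, and the second pass prints only the rows whose value was counted exactly once. The scratch file and the variable names `tmp`, `count`, and `result` are made up for this illustration; the data is the sample from the question.

```shell
# Write the question's sample data to a scratch file (hypothetical path from mktemp).
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
6   AA_A_56_30018678_E  0   30018678    P   A
6   SNP_A_30018678  0   30018678    A   G
6   SNP_A_30018679  0   30018679    T   G
6   SNP_A_30018682  0   30018682    T   G
6   SNP_A_30018695  0   30018695    G   C
6   AA_A_62_30018696_Q  0   30018696    P   A
6   AA_A_62_30018696_G  0   30018696    P   A
6   AA_A_62_30018696_R  0   30018696    P   A
EOF

# The same file is passed twice.
# Pass 1 (NR == FNR holds only while reading the first copy): count column 4 values.
# Pass 2: print rows whose column 4 value occurred exactly once.
result=$(awk 'NR == FNR { count[$4]++; next } count[$4] == 1' "$tmp" "$tmp")

printf '%s\n' "$result"
rm -f "$tmp"
```

This keeps the original row order and needs no sorting, at the cost of reading the file twice and holding one counter per distinct column 4 value in memory.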

Note: to print only the first four columns, pipe the output through awk '{NF=4}1'. Change $4 to key on a different column, and change NF=4 to keep a different number of columns.
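
To illustrate the column trimming on one row of the expected output, here is a small sketch. Assigning to NF makes awk rebuild the record from the remaining fields, joined by the output field separator (a single space by default); this is the behavior in gawk and mawk, and it is assumed here rather than guaranteed for every awk implementation. Printing the fields explicitly is a more portable way to get the same result. The variable names `line`, `trimmed`, and `explicit` are made up for the example.

```shell
# One row taken from the answer's output.
line='6   SNP_A_30018679  0   30018679    T   G'

# Truncate the record to its first four fields by assigning NF,
# then print it (the trailing 1 is an always-true pattern).
trimmed=$(printf '%s\n' "$line" | awk '{ NF = 4 } 1')

# Portable equivalent: name the four fields explicitly.
explicit=$(printf '%s\n' "$line" | awk '{ print $1, $2, $3, $4 }')

printf '%s\n' "$trimmed"
printf '%s\n' "$explicit"   # prints: 6 SNP_A_30018679 0 30018679
```

Either form also lets you reorder columns; for instance, `print $2, $5, $6, $4` would give the layout shown in the question's output.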
