How to delete duplicate rows based on a column value?
Question
Given the following table:
123456.451 entered-auto_attendant
123456.451 duration:76 real:76
139651.526 entered-auto_attendant
139651.526 duration:62 real:62
139382.537 entered-auto_attendant
Using a bash shell script on Linux, I'd like to delete duplicate rows based on the value of column 1 (the one with the long number), keeping in mind that this number varies from row to row.
I tried:
awk '{a[$3]++}!(a[$3]-1)' file
sort -u | uniq
But neither gives the result I want, which is to compare all the values in the first column, delete the duplicates, and show what remains, something like this:
123456.451 entered-auto_attendant
139651.526 entered-auto_attendant
139382.537 entered-auto_attendant
Recommended answer
You didn't give an expected output; does this work for you?
awk '!a[$1]++' file
With your data, the output is:
123456.451 entered-auto_attendant
139651.526 entered-auto_attendant
139382.537 entered-auto_attendant
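For readers unfamiliar with the idiom, here is a rough long-hand sketch of what `!a[$1]++` does: it keeps a line only the first time its first field is seen. The attempt in the question keyed on `$3`, which is empty on most of these lines, so it could not group rows by the long number. The temporary file and the names `seen` and `dedup_output` are illustrative and not part of the original answer.

```shell
# Sample data copied from the question; the temporary file is for illustration only.
tmpfile=$(mktemp)
cat > "$tmpfile" <<'EOF'
123456.451 entered-auto_attendant
123456.451 duration:76 real:76
139651.526 entered-auto_attendant
139651.526 duration:62 real:62
139382.537 entered-auto_attendant
EOF

# Long-hand equivalent of: awk '!a[$1]++' file
# seen[$1] counts how often the first field has appeared so far.
dedup_output=$(awk '{
    if (seen[$1] == 0) print   # first occurrence of this key: keep the line
    seen[$1]++                 # remember the key for later lines
}' "$tmpfile")

printf '%s\n' "$dedup_output"
rm -f "$tmpfile"
```

Because lines are printed as they are read, the input order is preserved and no sorting is needed.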
And this line prints only the lines whose column 1 value occurs exactly once:
awk '{a[$1]++;b[$1]=$0}END{for(x in a)if(a[x]==1)print b[x]}' file
Output:
139382.537 entered-auto_attendant
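One caveat: awk does not define the order in which `for (x in a)` visits keys, so with several unique keys the lines may not come out in input order. If order matters, a possible alternative (a sketch, not part of the original answer) is to read the file twice: the first pass counts each key and the second prints the lines whose key was counted once. The temporary file and the names `count` and `unique_output` are illustrative.

```shell
# Sample data copied from the question; the temporary file is for illustration only.
tmpfile=$(mktemp)
cat > "$tmpfile" <<'EOF'
123456.451 entered-auto_attendant
123456.451 duration:76 real:76
139651.526 entered-auto_attendant
139651.526 duration:62 real:62
139382.537 entered-auto_attendant
EOF

# The same file is passed twice.
# Pass 1 (NR == FNR): count how many times each first field occurs.
# Pass 2: print only the lines whose first field was counted exactly once.
unique_output=$(awk 'NR == FNR { count[$1]++; next } count[$1] == 1' "$tmpfile" "$tmpfile")

printf '%s\n' "$unique_output"
rm -f "$tmpfile"
```

This trades a second read of the file for not having to store every line in memory.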