Sum duplicate row values with awk

Question

I have a file with the following structure:

1486113768 3656
1486113768 6280
1486113769 530912
1486113769 5629824
1486113770 5122176
1486113772 3565920
1486113772 530912
1486113773 9229920
1486113774 4020960
1486113774 4547928

My goal is to get rid of duplicate values in the first column, sum the corresponding values in the second column, and replace each group of rows with a single row holding the total. For the input above, a working output would be:

1486113768 9936      # 3656 + 6280
1486113769 6160736   # 530912 + 5629824
1486113770 5122176   # ...
1486113772 4096832
1486113773 9229920
1486113774 8568888

I know cut and uniq; so far I have managed to find the duplicate values in the first column with:

cut -d " " -f 1 file.log | uniq -d

1486113768
1486113769
1486113772
1486113774

Is there an "awk way" to achieve my goal? I know it is a very powerful and terse tool: I used it earlier with

awk '{print $2 " " $3 >> $1".log"}' log.txt

to scan all rows in log.txt and create a .log file named after $1, filling it with the $2 and $3 values, all in one bash line (to hell with the read loop!). Is there a way to find first-column duplicates, sum their second-column values, and rewrite the rows so that the duplicates are removed and the resulting sum is printed?

Recommended answer

Use Awk as shown below:

awk '{ seen[$1] += $2 } END { for (i in seen) print i, seen[i] }' file1
1486113768 9936
1486113769 6160736
1486113770 5122176
1486113772 4096832
1486113773 9229920
1486113774 8568888

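{seen[$1]+=$2} builds an associative array (hash map) indexed by $1. Every line adds its $2 to the entry for its key, so lines that share a first-column value accumulate into a single total, while a key that appears only once keeps its original value. The END block then prints each key together with its sum. Note that for (i in seen) visits the keys in an unspecified order, so the output is not guaranteed to come out sorted as shown; piping the result through sort -n restores numeric order.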
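Since awk does not guarantee the iteration order of for (i in seen), a variant that records each key the first time it appears may be useful when the output has to follow the input order. The following is a minimal sketch, not part of the original answer: the array names sum and order, the counter n, and the temporary input file are assumptions chosen for illustration.

```shell
# Write the sample data from the question to a temporary file (hypothetical path).
input=$(mktemp)
cat > "$input" <<'EOF'
1486113768 3656
1486113768 6280
1486113769 530912
1486113769 5629824
1486113770 5122176
1486113772 3565920
1486113772 530912
1486113773 9229920
1486113774 4020960
1486113774 4547928
EOF

# Sum column 2 per column-1 key, keeping keys in order of first appearance.
result=$(awk '
    !($1 in sum) { order[++n] = $1 }   # first occurrence of this key: remember its position
    { sum[$1] += $2 }                  # accumulate the second column for this key
    END {
        for (i = 1; i <= n; i++)       # walk the keys in the recorded order
            print order[i], sum[order[i]]
    }
' "$input")

printf '%s\n' "$result"
rm -f "$input"
```

The extra order array costs one more entry per distinct key, but it avoids a separate sort step and also works when the first column is not numeric or not already grouped.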
