Summation of specific items in a nested list

Question

I have a data file such as:

1  123  something else
2  234  something else
3  500  something else
.
. 
.
1  891  something else
2  234  something else
3  567  something else 
.
.
.

I am trying to end up with a file containing:

1 1014
2  468
3 1067

That is, add up the numbers in column 2 (or some other column) for all rows that share the same number in column 1. I believe reading the columns into a nested list and proceeding from there is the way to go, but I have been struggling with that. Another approach I tried was creating a new file with only the entries I am interested in:

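# input_path and output_path are placeholder names for the two file paths.
# The slices assume fixed-width lines: a one-digit key, two spaces, a three-digit value.
with open(input_path) as f, open(output_path, "w") as output:
    for line in f:
        output.write(line[0:1] + "," + line[3:6] + "\n")

# Read the intermediate file back into a nested list of ints.
with open(output_path) as file:
    data_list = [[int(x) for x in line.split(",")] for line in file]

print(data_list)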

This returns

[[1, 123], [2, 234], [3, 500], [1, 891], [2, 234], [3, 567]]

I guess I could loop through that list, compare data_list[x][0], and add the values if they match, but that does not seem like an elegant solution. Could anyone suggest a more elegant way of doing this? In particular, I have been struggling with summing specific items in the nested list I end up with.

Recommended answer

Use a dictionary to track the sums; collections.defaultdict(int) makes this a little easier, because keys that haven't been seen before start at 0:

from collections import defaultdict

sums = defaultdict(int)

with open(filename) as f:
    for line in f:
        col1, col2, rest = line.split(None, 2)
        sums[col1] += int(col2)

This reads your initial file, splits each line on whitespace at most twice to separate the first two columns from the rest, then sums the second column grouped by the first (the keys stay strings, since col1 is not converted):

>>> from collections import defaultdict
>>> sample = '''\
... 1  123  something else
... 2  234  something else
... 3  500  something else
... 1  891  something else
... 2  234  something else
... 3  567  something else 
... '''.splitlines()
>>> sums = defaultdict(int)
>>> for line in sample:
...     col1, col2, rest = line.split(None, 2)
...     sums[col1] += int(col2)
... 
>>> sums
defaultdict(<type 'int'>, {'1': 1014, '3': 1067, '2': 468})
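
If the data is already in the nested-list form shown in the question, the same dictionary idea applies directly. The following is a minimal sketch, not part of the original answer: the helper names sum_by_key and format_sums are made up for illustration, and the single-space output format is an assumption about what the result file should look like.

```python
from collections import defaultdict


def sum_by_key(rows):
    """Sum the second item of each [key, value] pair, grouped by the first item."""
    sums = defaultdict(int)
    for key, value in rows:
        sums[key] += value
    return sums


def format_sums(sums):
    """Return one 'key total' line per key, in ascending key order."""
    return ["{} {}".format(key, sums[key]) for key in sorted(sums)]


# The nested list printed in the question.
data_list = [[1, 123], [2, 234], [3, 500], [1, 891], [2, 234], [3, 567]]

lines = format_sums(sum_by_key(data_list))
print("\n".join(lines))
```

Because the keys here are ints rather than strings, sorted() orders them numerically. The resulting lines could then be written to a file instead of printed.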
