Get information out of sub-lists in main list elegantly

Question

Ok, so here's my issue. I have a list composed of N sub-lists composed of M elements (floats) each. So in a general form it looks like this:

a_list = [b_list_1, b_list_2, ..., b_list_N]

with:

b_list_i = [c_float_1, c_float_2, ..., c_float_M]

For this example assume N=9 ; M=3, so the list looks like this:

a = [[1.1, 0.5, 0.7], [0.3, 1.4, 0.2], [0.6, 0.2, 1.], [1.1, 0.5, 0.3], [0.2, 1.1, 0.8], [1.1, 0.5, 1.], [1.2, 0.3, 0.6], [0.6, 0.4, 0.9], [0.6, 0.2, 0.5]]

I need to loop through this list and treat items that share the same first two floats as the same item, storing it once with the third float averaged over all of its occurrences. This means I have to check whether an item was already identified as a repeat, so that I do not store it again as a new item.

To give a clearer idea of what I mean, this is what the output of processing list a should look like:

a_processed = [[1.1, 0.5, 0.67], [0.3, 1.4, 0.2], [0.6, 0.2, 0.75], [0.2, 1.1, 0.8], [1.2, 0.3, 0.6], [0.6, 0.4, 0.9]]

Note that the first item in this new list was identified three times in a (a[0], a[3] and a[5]) and so it was stored with its third float averaged ((0.7+0.3+1.)/3. = 0.67). The second item was not repeated in a so it was stored as is. The third item was found twice in a (a[2] and a[8]) and stored with its third float averaged ((1.+0.5)/2.=0.75). The rest of the items in the new list were not found as repeated in a so they were also stored with no modifications.
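As a quick sanity check of the arithmetic above, the two averages can be reproduced directly. This is only a minimal sketch; the variable names are made up for illustration, and the rounding to two decimals is assumed from the expected output.

```python
# Third floats of the items sharing the first two floats (1.1, 0.5): a[0], a[3], a[5].
first_group = [0.7, 0.3, 1.0]
# Third floats of the items sharing the first two floats (0.6, 0.2): a[2], a[8].
third_group = [1.0, 0.5]

# Average each group and round to two decimals, as in the expected output.
first_avg = round(sum(first_group) / len(first_group), 2)
third_avg = round(sum(third_group) / len(third_group), 2)

print(first_avg, third_avg)
```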

Since I know updating/modifying a list while looping through it is not recommended, I opted to use several temporary lists. This is the code I came up with:

import numpy as np

a = [[1.1, 0.5, 0.7], [0.3, 1.4, 0.2], [0.6, 0.2, 1.], [1.1, 0.5, 0.3],
     [0.2, 1.1, 0.8], [1.1, 0.5, 1.], [1.2, 0.3, 0.6], [0.6, 0.4, 0.9],
     [0.6, 0.2, 0.5]]

# Final list.
a_processed = []

# Holds indexes of elements to skip.
skip_elem = []

# Loop through all items in a.
for indx, elem in enumerate(a):
    temp_average = []
    temp_average.append(elem)        
    # Only process if not found previously.
    if indx not in skip_elem:
        for indx2, elem2 in enumerate(a[(indx+1):]):
            if elem[0] == elem2[0] and elem[1] == elem2[1]:
                temp_average.append(elem2)
                skip_elem.append(indx2+indx+1)

        # Store 1st and 2nd floats and averaged 3rd float.
        a_processed.append([temp_average[0][0], temp_average[0][1],
                            round(np.mean([i[2] for i in temp_average]),2)])

This code works, but I'm wondering if there might be a more elegant/pythonic way of doing this. It just looks too convoluted (Fortran-esque I'd say) as is.

Recommended answer

I think you can certainly make your code more concise and easier to read by using defaultdict to build a dictionary that maps the first two elements of each sublist to a list of all the corresponding third elements:

from collections import defaultdict
nums = defaultdict(list)
for arr in a:
    key = tuple(arr[:2]) # make the first two floats the key
    nums[key].append( arr[2] ) # append the third float for the given key

a_processed = [[k[0], k[1], sum(vals)/len(vals)] for k, vals in nums.items()]

Using this, I get the same output as you (albeit in a different order, and without rounding the averaged float):

[[0.2, 1.1, 0.8], [1.2, 0.3, 0.6], [0.3, 1.4, 0.2], [0.6, 0.4, 0.9], [1.1, 0.5, 0.6666666666666666], [0.6, 0.2, 0.75]]
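Because the order differs and the averaged float is not rounded, a direct `==` against the expected list would fail. One way to check that the two agree is to round and sort both before comparing. A minimal sketch, where `expected` is copied from the question, `result` is copied from the output above, and the names `rounded` and `same` are only illustrative:

```python
# Expected list from the question (averages rounded to two decimals).
expected = [[1.1, 0.5, 0.67], [0.3, 1.4, 0.2], [0.6, 0.2, 0.75],
            [0.2, 1.1, 0.8], [1.2, 0.3, 0.6], [0.6, 0.4, 0.9]]

# Output of the defaultdict version, as shown above.
result = [[0.2, 1.1, 0.8], [1.2, 0.3, 0.6], [0.3, 1.4, 0.2],
          [0.6, 0.4, 0.9], [1.1, 0.5, 0.6666666666666666], [0.6, 0.2, 0.75]]

# Round the third float to two decimals, as the question's code does.
rounded = [[x, y, round(z, 2)] for x, y, z in result]

# Sorting both lists makes the comparison independent of order.
same = sorted(rounded) == sorted(expected)
print(same)
```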

If the order of a_processed is an issue, you can use an OrderedDict, as pointed out by @DSM.
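The order-preserving variant might look like the sketch below. It is not code from the original answer: the use of `setdefault`, the name `groups`, and the rounding to two decimals (added to match the expected output in the question) are all assumptions made for illustration.

```python
from collections import OrderedDict

a = [[1.1, 0.5, 0.7], [0.3, 1.4, 0.2], [0.6, 0.2, 1.], [1.1, 0.5, 0.3],
     [0.2, 1.1, 0.8], [1.1, 0.5, 1.], [1.2, 0.3, 0.6], [0.6, 0.4, 0.9],
     [0.6, 0.2, 0.5]]

# Keys are (first float, second float); values collect every third float.
# An OrderedDict remembers the order in which keys were first inserted.
groups = OrderedDict()
for x, y, z in a:
    groups.setdefault((x, y), []).append(z)

# Average the third floats per key; rounding is an assumption taken from
# the question's expected output, not part of the original answer.
a_processed = [[x, y, round(sum(vals) / len(vals), 2)]
               for (x, y), vals in groups.items()]

print(a_processed)
```

On Python 3.7 and later, a plain dict (and therefore a defaultdict) also preserves insertion order, so the defaultdict version should produce the same ordering there. Note that, like the `==` comparison in the question's code, grouping by key relies on the first two floats being exactly equal.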
