Average of all rows corresponding to all unique rows

Problem description

I have a NumPy array with two columns, written here as two lists:

A = [[1,1,1,2,3,1,2,3],[0.1,0.2,0.2,0.1,0.3,0.2,0.2,0.1]]

For every unique value in the first column (the first list), I want the average of the values that correspond to it. For example:

B = [[1,2,3], [0.175, 0.15, 0.2]]

Is there a Pythonic way to do this?

Recommended answer

I think the following is the standard NumPy approach for this kind of computation. The call to np.unique can be skipped if the entries of A[0] are small non-negative integers, because np.bincount can then use them directly as bin indices, but keeping it makes the whole operation more robust and independent of the actual data.

>>> import numpy as np
>>> A = [[1,1,1,2,3,1,2,3],[0.1,0.2,0.2,0.1,0.3,0.2,0.2,0.1]]
>>> unq, unq_idx = np.unique(A[0], return_inverse=True)
>>> unq_sum = np.bincount(unq_idx, weights=A[1])
>>> unq_counts = np.bincount(unq_idx)
>>> unq_avg = unq_sum / unq_counts
>>> unq
array([1, 2, 3])
>>> unq_avg
array([ 0.175,  0.15 ,  0.2  ])
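
The shortcut mentioned above, skipping np.unique when the keys are small non-negative integers, might look roughly like the sketch below. The names keys, values, present and averages are chosen here for illustration and are not from the original answer. Bins for keys that never occur (such as 0 in this data) have a count of zero, so the sketch filters them out before dividing.

```python
import numpy as np

keys = np.array([1, 1, 1, 2, 3, 1, 2, 3])
values = np.array([0.1, 0.2, 0.2, 0.1, 0.3, 0.2, 0.2, 0.1])

# Use the keys themselves as bin indices. This only works for
# non-negative integers and allocates max(keys) + 1 bins.
sums = np.bincount(keys, weights=values)
counts = np.bincount(keys)

# Keep only the bins that were actually hit, so that empty bins
# (here: key 0) do not cause a division by zero.
present = np.flatnonzero(counts)
averages = sums[present] / counts[present]

print(present)
print(averages)
```

With large or sparse keys this allocates many empty bins, which is one reason the np.unique version is the more robust default.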

You could of course then stack both arrays, although that will convert unq to a float dtype:

>>> np.vstack((unq, unq_avg))
array([[ 1.   ,  2.   ,  3.   ],
       [ 0.175,  0.15 ,  0.2  ]])
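
If the integer keys should stay integers, one option is to keep the two arrays separate, or to pair them in a dict, instead of stacking them. The sketch below wraps the steps from the answer in a helper. The name group_means is hypothetical, not a NumPy function, and the sketch assumes one-dimensional inputs of equal length.

```python
import numpy as np

def group_means(keys, values):
    """Return (unique_keys, mean of values per unique key).

    Hypothetical helper for illustration; assumes 1-D inputs of equal length.
    """
    keys = np.asarray(keys)
    values = np.asarray(values, dtype=float)
    # inverse[i] is the position of keys[i] within the sorted unique keys
    unq, inverse = np.unique(keys, return_inverse=True)
    sums = np.bincount(inverse, weights=values)
    counts = np.bincount(inverse)
    return unq, sums / counts

A = [[1, 1, 1, 2, 3, 1, 2, 3], [0.1, 0.2, 0.2, 0.1, 0.3, 0.2, 0.2, 0.1]]
unq, unq_avg = group_means(A[0], A[1])

# Pair each key with its average without forcing a common dtype.
result = dict(zip(unq.tolist(), unq_avg.tolist()))
print(result)
```

Because np.unique sorts its output, the keys come back in ascending order regardless of their order in the input.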
