An efficient way to calculate the mean of the non-zero elements in each row or column
Question
I have a numpy array of ratings given by users to movies. Each rating is between 1 and 5, and 0 means the user has not rated that movie. I want to calculate the average rating of each movie and the average rating of each user. In other words, I want the mean of the non-zero elements in each column and in each row.
Is there an efficient numpy array function to handle this case? I know that manually iterating over the ratings by column or row would solve the problem.
Thanks in advance!
Recommended answer
Since the values to discard are 0, they contribute nothing to a sum. You can therefore compute the mean manually by summing along an axis and then dividing by the number of non-zero elements along the same axis:
import numpy as np
a = np.array([[8., 9, 7, 0], [0, 0, 5, 6]])
a.sum(1) / (a != 0).sum(1)  # row sums divided by the per-row count of non-zeros
Result:
array([ 8. , 5.5])
As you can see, the zeros are not counted in the mean: the first row averages 8, 9 and 7, and the second averages 5 and 6.
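The same idea covers both axes asked about in the question. The sketch below is one possible way to wrap it, not part of the original answer: the helper name nonzero_mean and the extra all-zero row are made up for illustration, and it assumes rows are users and columns are movies. It also guards against a row or column with no ratings at all, where the plain division would divide by zero, by returning NaN there.

```python
import numpy as np

def nonzero_mean(ratings, axis):
    """Mean of the non-zero entries along `axis`; NaN where there are none.

    Hypothetical helper for illustration, not a NumPy function.
    """
    ratings = np.asarray(ratings, dtype=float)
    totals = ratings.sum(axis=axis)          # zeros add nothing to the sum
    counts = (ratings != 0).sum(axis=axis)   # number of actual ratings
    # Start from NaN and divide only where at least one rating exists,
    # so rows/columns without ratings do not trigger a division by zero.
    out = np.full(totals.shape, np.nan)
    np.divide(totals, counts, out=out, where=counts > 0)
    return out

# Assumed layout: rows are users, columns are movies; 0 means "not rated".
ratings = np.array([[8., 9, 7, 0],
                    [0, 0, 5, 6],
                    [0, 0, 0, 0]])   # a user who rated nothing (added for illustration)

user_means = nonzero_mean(ratings, axis=1)    # one value per user (row)
movie_means = nonzero_mean(ratings, axis=0)   # one value per movie (column)

# An alternative with NumPy masked arrays: mask the zeros, then take the mean.
masked_user_means = np.ma.masked_equal(ratings, 0).mean(axis=1).filled(np.nan)
```

The masked-array variant reads more directly as "ignore the zeros", while the sum-and-count version avoids building a mask object; which one is faster for a given array size is something to measure rather than assume.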