Encountered invalid value when I use pearsonr


Question

Maybe I made a mistake. If so, I apologize for asking.

I want to compute the Pearson correlation coefficient using scipy's pearsonr function.

from scipy.stats import pearsonr

X = [4, 4, 4, 4, 4, 4]
Y = [4, 5, 5, 4, 4, 4]

pearsonr(X, Y)

I get the warning below:

RuntimeWarning: invalid value encountered in double_scalars

The reason I get this warning is that E[X] = 4 (the expected value of X is 4).

I looked at the code of the pearsonr function in scipy/stats/stats.py. Part of the function is as follows.

mx = x.mean() # which is 4
my = y.mean() # not needed for this argument
xm, ym = x-mx, y-my # xm = [0 0 0 0 0 0]
r_num = n*(np.add.reduce(xm*ym)) # r_num = 0, because xm*ym is a zero vector of length 6
r_den = n*np.sqrt(ss(xm)*ss(ym)) # r_den = 0, because ss(xm) = 0
r = (r_num / r_den) # 0/0: invalid value encountered in double_scalars
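The steps in the excerpt can be reproduced outside SciPy with a rough sketch like the one below. It assumes that `ss` is a plain sum-of-squares helper (defined here as a stand-in, not copied from SciPy) and uses `np.errstate` only to silence the warning so the result can be inspected.

```python
import numpy as np


def ss(a):
    """Sum of squares; a stand-in for the helper named in the excerpt."""
    return np.sum(a * a)


x = np.array([4, 4, 4, 4, 4, 4], dtype=float)
y = np.array([4, 5, 5, 4, 4, 4], dtype=float)
n = len(x)

# Centre both inputs; every entry of xm is zero because x is constant.
xm, ym = x - x.mean(), y - y.mean()

r_num = n * np.add.reduce(xm * ym)     # zero numerator
r_den = n * np.sqrt(ss(xm) * ss(ym))   # zero denominator, since ss(xm) == 0

# Suppress the "invalid value" warning so the 0/0 result can be examined.
with np.errstate(invalid="ignore"):
    r = r_num / r_den

print(r_num, r_den, r)
```

Both the numerator and the denominator come out as zero, so the division is 0/0 rather than 0 divided by a nonzero number.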

In the end, pearsonr returns (nan, 1.0).

Should pearsonr return (0, 1.0) instead?

I think that if a vector has the same value in every row/column, the covariance should be zero. Thus Pearson's correlation coefficient should also be zero, by the definition of PCC.

Pearson's correlation coefficient between two variables is defined as the covariance of the two variables divided by the product of their standard deviations.

Is this a bug, or am I making a mistake somewhere?

Recommended answer

Pearson's correlation coefficient between two variables is defined as the covariance of the two variables divided by the product of their standard deviations.

So it's the covariance over

  • the standard deviation of [4, 5, 5, 4, 4, 4] times
  • the standard deviation of [4, 4, 4, 4, 4, 4].

The standard deviation of [4, 4, 4, 4, 4, 4] is zero.

So it's the covariance over

  • the standard deviation of [4, 5, 5, 4, 4, 4] times
  • zero.

So it's the covariance over

  • zero.

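Division by zero is undefined, so the correlation is undefined no matter what the covariance is. Here the covariance is also zero (it must be whenever one input is constant), so the floating-point division is 0/0, which yields nan.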
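If the warning itself is unwanted, one option is to check for a zero denominator before dividing. The sketch below is only an illustration: `pearson_or_nan` is a hypothetical helper, not part of SciPy, and it computes the coefficient directly with NumPy instead of calling `pearsonr`.

```python
import numpy as np


def pearson_or_nan(x, y):
    """Pearson's r, or nan when either input is constant (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xm, ym = x - x.mean(), y - y.mean()
    # The denominator is zero exactly when one of the centred inputs is all zeros.
    denom = np.sqrt(np.sum(xm * xm) * np.sum(ym * ym))
    if denom == 0:
        return float("nan")  # correlation is undefined; skip the 0/0 division
    return float(np.sum(xm * ym) / denom)


print(pearson_or_nan([4, 4, 4, 4, 4, 4], [4, 5, 5, 4, 4, 4]))
print(pearson_or_nan([1, 2, 3, 4], [2, 4, 6, 8]))
```

Returning nan keeps the "undefined" meaning explicit. Substituting 0 would wrongly suggest that the two variables were measured to be uncorrelated.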
