scipy p-value returns 0.0


Problem description

Using a two-sample Kolmogorov-Smirnov test, I am getting a p-value of 0.0.

>>> import scipy.stats
>>> scipy.stats.ks_2samp(dataset1, dataset2)
(0.65296076312083573, 0.0)

Looking at the histograms of the 2 datasets, I am quite confident they represent two different datasets. But, really, p = 0.0? That doesn't seem to make sense. Shouldn't it be a very small but positive number?

I know the return value is of type numpy.float64. Does that have something to do with it?

EDIT: data here: https://www.dropbox.com/s/jpixhz0pcybyh1t/data4stack.csv

>>> scipy.version.full_version
'0.13.2'

Solution

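Yes, the probability is very small, so small that it underflows. A double-precision float (which is what numpy.float64 is) cannot represent positive values below roughly 4.9e-324, so a far smaller result comes back as 0.0. Computing the p-value against progressively longer slices of dataset2 shows it shrinking until that point: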

>>> import scipy.stats
>>> from pprint import pprint
>>> pprint([(i, scipy.stats.ks_2samp(dataset1, dataset2[:i])[1])
...         for i in range(200, len(dataset2), 200)])
[(200, 3.1281733251275881e-63),
 (400, 3.5780609056448825e-157),
 (600, 9.2884803664366062e-225),
 (800, 7.1429666685167604e-293),
 (1000, 0.0),
 (1200, 0.0),
 (1400, 0.0),
 (1600, 0.0),
 (1800, 0.0),
 (2000, 0.0),
 (2200, 0.0),
 (2400, 0.0)]
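
To make the underflow concrete, here is a minimal sketch using only the standard library (a Python float is the same IEEE 754 double as numpy.float64). It prints the double-precision limits, shows the 800-sample p-value from the listing above collapsing to 0.0 once it is scaled down further, and then estimates the order of magnitude of a p-value on a log scale. The helper `approx_log10_ks_pvalue` is a hypothetical name of my own, not a SciPy function. It uses only the leading term 2*exp(-2*lam**2) of the asymptotic Kolmogorov distribution, and the sample sizes passed to it are made up for illustration, not taken from the question's data.

```python
import math
import sys

# Limits of IEEE 754 double precision (the format behind numpy.float64).
smallest_normal = sys.float_info.min      # about 2.2e-308
smallest_subnormal = math.ulp(0.0)        # about 4.9e-324 (Python 3.9+)

# The 800-sample p-value from the listing above is still representable,
# but scaling it down by another 40 orders of magnitude is not.
p_800 = 7.1429666685167604e-293
underflowed = p_800 * 1e-40               # true value ~7e-333 rounds to 0.0


def approx_log10_ks_pvalue(d, n, m):
    """Hypothetical helper: log10 of the leading asymptotic term of the
    two-sample KS p-value, 2 * exp(-2 * lam**2), where
    lam = d * sqrt(n * m / (n + m)). Computed in log space so that it
    cannot underflow; only meant as an order-of-magnitude estimate."""
    lam = d * math.sqrt(n * m / (n + m))
    return (math.log(2.0) - 2.0 * lam * lam) / math.log(10.0)


print(smallest_normal, smallest_subnormal)
print(underflowed)                        # 0.0
# Illustrative inputs only: statistic 0.5 with 100 samples on each side.
print(approx_log10_ks_pvalue(0.5, 100, 100))
```

With those illustrative inputs the last line prints roughly -10.6, i.e. a p-value on the order of 1e-11. The point is only that the exponent stays meaningful on a log scale when the p-value itself is too small for a double. The exact figure SciPy would report may differ, since it may apply corrections or exact methods depending on the version.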
