Why is my p-value 0 and my statistic 1 when I use the KS test in Python?

Question

Thanks in advance to anyone who takes a look.

My code is:

import numpy as np
from scipy.stats import kstest
data=[31001, 38502, 40842, 40852, 43007, 47228, 48320, 50500, 54545, 57437, 60126, 65556, 71215, 78460, 81299, 96851, 106472, 108398, 118495, 130832, 141678, 155703, 180689, 218032, 222238, 239553, 250895, 274025, 298231, 330228, 330910, 352058, 362993, 369690, 382487, 397270, 414179, 454013, 504993, 518475, 531767, 551032, 782483, 913658, 1432195, 1712510, 2726323, 2777535, 3996759, 13608152]
x=np.array(data)
test_sta=kstest(x, 'norm')
print(test_sta)

The result of kstest is KstestResult(statistic=1.0, pvalue=0.0). Is there something wrong with the code, or is the data just not normal at all?

Recommended answer

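I've not used this before, but I think kstest(x, 'norm') tests whether your data is standard normal (i.e. mean=0, variance=1). Values in the tens of thousands and above are nowhere near that distribution, which is why the statistic comes out as 1 and the p-value as 0.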
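To see where the statistic of exactly 1 comes from, here is a minimal sketch using only a handful of the values from the question (the subset is my choice for brevity; the full list behaves the same way). The standard normal CDF has already reached 1.0 at the smallest observation, while the empirical CDF is still 0 just below it, so the largest gap between the two is 1.

```python
import numpy as np
from scipy.stats import kstest, norm

# A few of the values from the question, chosen only to keep the example short.
x = np.array([31001, 38502, 40842, 13608152], dtype=float)

# Standard normal CDF evaluated at the smallest observation.
# In double precision this is indistinguishable from 1.
cdf_at_min = norm.cdf(x.min())

# The empirical CDF is 0 just below the smallest observation,
# so the KS statistic (the largest gap between the two CDFs) is 1.
result = kstest(x, 'norm')

print(cdf_at_min, result.statistic)
```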

Plotting a histogram shows it to be much closer to a log-normal. I'd therefore do:

x = np.log(data)
x -= np.mean(x)
x /= np.std(x)
kstest(x, 'norm')

which gives me a test statistic of 0.095 and a p-value of 0.75, so we can't reject the hypothesis that the data is log-normal.
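As an alternative to standardizing by hand, kstest accepts an args tuple that is passed on to the named distribution, so the mean and standard deviation can be supplied directly. The sketch below uses a synthetic log-normal sample as a stand-in for the question's data (the generator parameters and seed are arbitrary assumptions). Note that when the parameters are estimated from the same sample, the reported p-value is only approximate.

```python
import numpy as np
from scipy.stats import kstest

# Synthetic stand-in for the question's data; parameters are arbitrary.
rng = np.random.default_rng(0)
sample = rng.lognormal(mean=12.0, sigma=1.3, size=50)

log_x = np.log(sample)
mu, sigma = log_x.mean(), log_x.std()

# Approach used in the answer: standardize, then test against N(0, 1).
standardized = kstest((log_x - mu) / sigma, 'norm')

# Equivalent: test the unstandardized logs against N(mu, sigma).
via_args = kstest(log_x, 'norm', args=(mu, sigma))

print(standardized.statistic, via_args.statistic)
```

Both calls compare the same empirical CDF against the same fitted normal, so they should agree up to floating-point rounding.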

A good way to check this sort of thing is to generate some random data (from a known distribution) and see what the test gives you back. For example:

kstest(np.random.normal(size=100), 'norm')

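usually gives me a large p-value (when the null hypothesis is true, p-values are spread uniformly between 0 and 1, so most runs land well away from 0), while: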

kstest(np.random.normal(loc=13, size=100), 'norm')

gives me p-values near 0.
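Because a single random draw can be misleading, the same check can be repeated many times. The sketch below is one way to do that; the helper name simulate_pvalues, the repetition count, and the seed are my own choices for illustration. When the data really is standard normal, roughly 5% of runs should fall below 0.05, whereas with loc=13 every run should be rejected.

```python
import numpy as np
from scipy.stats import kstest

def simulate_pvalues(loc, n_rep=500, size=100, seed=0):
    """Run kstest against the standard normal on n_rep normal samples centered at loc."""
    rng = np.random.default_rng(seed)
    return np.array([
        kstest(rng.normal(loc=loc, size=size), 'norm').pvalue
        for _ in range(n_rep)
    ])

p_null = simulate_pvalues(loc=0.0)      # data matches the tested distribution
p_shifted = simulate_pvalues(loc=13.0)  # data is far from the tested distribution

print("fraction of p < 0.05 when the null is true:", np.mean(p_null < 0.05))
print("largest p-value with loc=13:", p_shifted.max())
```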

A log-normal distribution just means that the data is normally distributed after a log transform. If you really want to test against a normal distribution, you'd just skip the log transform, e.g.:

x = np.array(data, dtype=float)
x -= np.mean(x)
x /= np.std(x)
kstest(x, 'norm')

which gives me a p-value of 7e-7, indicating that we can reliably reject the hypothesis that it's normally distributed.
