Two-sample Kolmogorov-Smirnov Test in Python Scipy


Question


I can't figure out how to do a two-sample KS test in SciPy.

After reading the documentation for scipy.stats.kstest, I can see how to test whether a sample is drawn from the standard normal distribution:

from scipy.stats import kstest
import numpy as np

x = np.random.normal(0,1,1000)
test_stat = kstest(x, 'norm')
#>>> test_stat
#(0.021080234718821145, 0.76584491300591395)

This means that, with a p-value of 0.76, we cannot reject the null hypothesis that the sample x comes from the standard normal distribution.

However, I want to compare two distributions and see if I can reject the null hypothesis that they are identical, something like:

from scipy.stats import kstest
import numpy as np

x = np.random.normal(0,1,1000)
z = np.random.normal(1.1,0.9, 1000)

and test whether x and z come from the same distribution.

I tried the naive approach:

test_stat = kstest(x, z)

and got the following error:

TypeError: 'numpy.ndarray' object is not callable

Is there a way to do a two-sample KS test in Python? If so, how should I do it?

Thank you in advance.

Solution

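You are using the one-sample KS test, kstest, which in the SciPy version shown here expects its second argument to be a distribution name or a callable CDF; passing the array z instead is what raises the TypeError. You probably want the two-sample test, ks_2samp: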

>>> from scipy.stats import ks_2samp
>>> import numpy as np
>>> 
>>> np.random.seed(12345678)
>>> x = np.random.normal(0, 1, 1000)
>>> y = np.random.normal(0, 1, 1000)
>>> z = np.random.normal(1.1, 0.9, 1000)
>>> 
>>> ks_2samp(x, y)
Ks_2sampResult(statistic=0.022999999999999909, pvalue=0.95189016804849647)
>>> ks_2samp(x, z)
Ks_2sampResult(statistic=0.41800000000000004, pvalue=3.7081494119242173e-77)
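
To make the statistic field more concrete: it is the largest vertical gap between the empirical CDFs of the two samples. The following is a minimal sketch of that idea, assuming NumPy and SciPy are installed. The helper name ks_statistic is hypothetical (it is not part of SciPy), and the seed is chosen only for reproducibility.

```python
import numpy as np
from scipy.stats import ks_2samp


def ks_statistic(a, b):
    """Largest absolute gap between the empirical CDFs of a and b.

    Hypothetical helper for illustration; not a SciPy function.
    """
    a = np.sort(np.asarray(a))
    b = np.sort(np.asarray(b))
    # The gap can only change at observed values, so evaluate both ECDFs there.
    points = np.concatenate([a, b])
    ecdf_a = np.searchsorted(a, points, side="right") / a.size
    ecdf_b = np.searchsorted(b, points, side="right") / b.size
    return float(np.max(np.abs(ecdf_a - ecdf_b)))


rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility
x = rng.normal(0, 1, 1000)
z = rng.normal(1.1, 0.9, 1000)

result = ks_2samp(x, z)
manual = ks_statistic(x, z)

# The result object exposes the two values by name.
print("scipy statistic: ", result.statistic)
print("manual statistic:", manual)
print("p-value:         ", result.pvalue)
```

The two statistics should agree up to floating-point rounding, which is a quick sanity check that ks_2samp is measuring what you expect.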

The results can be interpreted as follows:

  1. You can compare the statistic value returned by ks_2samp to the critical value from a KS-test critical value table for your sample sizes. When the statistic is higher than the critical value, you can reject the null hypothesis that the two samples come from the same distribution.

  2. Or you can compare the p-value to a significance level a, usually a=0.05 or 0.01 (your choice; the lower a is, the stronger the evidence required to reject). If the p-value is lower than a, you can reject the null hypothesis that the two samples come from the same distribution. In the output above, x and y (p ≈ 0.95) give no reason to reject it, while x and z (p ≈ 3.7e-77) clearly do.
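
Both decision rules above can be sketched in a few lines. This sketch replaces the lookup table with the common large-sample approximation for the two-sided critical value, D_crit ≈ sqrt(-ln(a/2)/2) * sqrt((n+m)/(n*m)), which is an assumption on my part and only reasonable for fairly large samples. The helper name ks_critical_value and the choice a=0.05 are illustrative and not part of SciPy.

```python
import math

import numpy as np
from scipy.stats import ks_2samp


def ks_critical_value(n, m, alpha=0.05):
    """Approximate two-sided critical value for sample sizes n and m.

    Hypothetical helper; uses the large-sample approximation, not a table.
    """
    c_alpha = math.sqrt(-0.5 * math.log(alpha / 2))
    return c_alpha * math.sqrt((n + m) / (n * m))


rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility
x = rng.normal(0, 1, 1000)
z = rng.normal(1.1, 0.9, 1000)

alpha = 0.05
result = ks_2samp(x, z)
d_crit = ks_critical_value(len(x), len(z), alpha)

# Rule 1: compare the statistic with the (approximate) critical value.
reject_by_statistic = bool(result.statistic > d_crit)
# Rule 2: compare the p-value with the significance level.
reject_by_pvalue = bool(result.pvalue < alpha)

print("critical value:", d_crit)
print("reject by statistic:", reject_by_statistic)
print("reject by p-value:  ", reject_by_pvalue)
```

For samples as different as x and z, both rules lead to the same conclusion; in practice the p-value comparison is usually simpler because it needs no table or approximation.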
