Two-sample Kolmogorov-Smirnov Test in Python Scipy
Question
I can't figure out how to do a two-sample KS test in Scipy.
After reading the documentation for scipy kstest, I can see how to test whether a distribution is identical to the standard normal distribution:
from scipy.stats import kstest
import numpy as np
x = np.random.normal(0,1,1000)
test_stat = kstest(x, 'norm')
#>>> test_stat
#(0.021080234718821145, 0.76584491300591395)
This means that with a p-value of 0.76 we cannot reject the null hypothesis that the sample's distribution and the standard normal distribution are identical.
However, I want to compare two distributions and see whether I can reject the null hypothesis that they are identical, something like:
from scipy.stats import kstest
import numpy as np
x = np.random.normal(0,1,1000)
z = np.random.normal(1.1,0.9, 1000)
and test whether x and z are identical.
I tried the naive approach:
test_stat = kstest(x, z)
and got the following error:
TypeError: 'numpy.ndarray' object is not callable
Is there a way to do a two-sample KS test in Python? If so, how should I do it?
Thank you in advance.
Recommended answer
You are using the one-sample KS test. You probably want the two-sample test ks_2samp:
>>> from scipy.stats import ks_2samp
>>> import numpy as np
>>>
>>> np.random.seed(12345678)
>>> x = np.random.normal(0, 1, 1000)
>>> y = np.random.normal(0, 1, 1000)
>>> z = np.random.normal(1.1, 0.9, 1000)
>>>
>>> ks_2samp(x, y)
Ks_2sampResult(statistic=0.022999999999999909, pvalue=0.95189016804849647)
>>> ks_2samp(x, z)
Ks_2sampResult(statistic=0.41800000000000004, pvalue=3.7081494119242173e-77)
The results can be interpreted as follows:
- You can compare the statistic value returned by Python to the KS-test critical value table for your sample sizes. When the statistic value is higher than the critical value, you reject the null hypothesis that the two samples come from the same distribution.
- Or you can compare the p-value to a significance level a, usually a=0.05 or 0.01 (you decide; the lower a is, the stricter the test). If the p-value is lower than a, you can reject the null hypothesis that the two samples come from the same distribution.
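The two ways of reading the result described above can be sketched as follows. This is only a minimal illustration, not part of the original answer: the helper name ks_critical_value is hypothetical, and instead of looking up a printed table it assumes the usual large-sample approximation for the two-sample critical value, c(a) * sqrt((n + m) / (n * m)) with c(a) = sqrt(-ln(a / 2) / 2). For small samples a table (or the p-value) is the safer choice.

```python
import numpy as np
from scipy.stats import ks_2samp


def ks_critical_value(n, m, alpha=0.05):
    """Hypothetical helper: large-sample approximation of the two-sample
    KS critical value for sample sizes n and m at significance level alpha."""
    c_alpha = np.sqrt(-0.5 * np.log(alpha / 2.0))
    return c_alpha * np.sqrt((n + m) / (n * m))


# Same distributions as in the answer, drawn with a seeded generator
rng = np.random.default_rng(12345678)
x = rng.normal(0, 1, 1000)
z = rng.normal(1.1, 0.9, 1000)

alpha = 0.05
result = ks_2samp(x, z)
critical = ks_critical_value(len(x), len(z), alpha)

# Route 1: compare the statistic with the (approximate) critical value
reject_by_statistic = result.statistic > critical
# Route 2: compare the p-value with the significance level
reject_by_pvalue = result.pvalue < alpha

print("statistic:", result.statistic, "critical value:", critical)
print("reject H0 by statistic:", reject_by_statistic)
print("reject H0 by p-value:", reject_by_pvalue)
```

For samples this large both routes should normally agree; the p-value route is usually more convenient because it needs no table or approximation on your side.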