Chi-Squared test in Python
Problem description
I've used the following code in R to determine how well observed values (20, 20, 0 and 0, for example) fit expected values/ratios (25% for each of the four cases, for example):
> chisq.test(c(20,20,0,0), p=c(0.25, 0.25, 0.25, 0.25))
Chi-squared test for given probabilities
data: c(20, 20, 0, 0)
X-squared = 40, df = 3, p-value = 1.066e-08
How can I replicate this in Python? I've tried using the chisquare function from scipy, but the results I obtained were very different; I'm not sure this is even the correct function to use. I've searched through the scipy documentation, but it's quite daunting, as it runs to 1000+ pages; the numpy documentation is almost 50% longer than that.
scipy.stats.chisquare expects observed and expected absolute frequencies, not ratios. You can obtain what you want with:
>>> import numpy as np
>>> from scipy.stats import chisquare
>>> observed = np.array([20., 20., 0., 0.])
>>> expected = np.array([.25, .25, .25, .25]) * np.sum(observed)
>>> chisquare(observed, expected)
(40.0, 1.065509033425585e-08)
If the expected values are uniformly distributed over the classes, however, you can leave out the computation of the expected values:
>>> chisquare(observed)
(40.0, 1.065509033425585e-08)
The first returned value is the χ² statistic; the second is the p-value of the test.
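As a rough sketch of the same idea, the conversion from ratios to expected counts can be wrapped in a small helper, and the result cross-checked by computing the statistic by hand. The helper name chisq_gof is hypothetical (it is not part of scipy), and the normalization of the ratios is an assumption added here so that the expected counts always sum to the observed total:

```python
import numpy as np
from scipy.stats import chi2, chisquare


def chisq_gof(observed, ratios):
    """Goodness-of-fit test from expected ratios (hypothetical helper)."""
    observed = np.asarray(observed, dtype=float)
    ratios = np.asarray(ratios, dtype=float)
    # Normalize so the expected counts sum to the observed total.
    expected = ratios / ratios.sum() * observed.sum()
    statistic, pvalue = chisquare(observed, f_exp=expected)
    return statistic, pvalue, expected


statistic, pvalue, expected = chisq_gof([20, 20, 0, 0], [0.25, 0.25, 0.25, 0.25])

# Cross-check by hand: sum((O - E)^2 / E) with k - 1 degrees of freedom.
observed = np.array([20.0, 20.0, 0.0, 0.0])
manual_stat = np.sum((observed - expected) ** 2 / expected)
manual_p = chi2.sf(manual_stat, df=len(observed) - 1)

print(statistic, pvalue)
print(manual_stat, manual_p)
```

Both printed pairs should agree with the R output above (X-squared = 40, df = 3). Passing ratios directly as the expected values, without scaling them by the total count, is the likely reason for the "very different" results mentioned in the question.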