What is the difference between numpy var() and statistics variance() in Python?
Question
I was working through a Dataquest exercise and found that the variance I get differs between the two packages.
For example, for [1,2,3,4]:
from statistics import variance
import numpy as np
print(np.var([1,2,3,4]))    # 1.25
print(variance([1,2,3,4]))  # 1.6666666666666667
The expected answer for the exercise is the one calculated with np.var().
Edit: I guess it has to do with the latter being the sample variance rather than the population variance. Could anyone explain the difference?
Recommended answer
Use this:
print(np.var([1,2,3,4],ddof=1))  # 1.6666666666666667
Delta Degrees of Freedom: the divisor used in the calculation is N - ddof, where N represents the number of elements. By default, ddof is zero.
The variance is the mean of the squared deviations from the mean, i.e. var = mean(x), where x = abs(a - a.mean())**2. The mean is normally calculated as x.sum() / N, where N = len(x). If, however, ddof is specified, the divisor N - ddof is used instead.
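As a minimal sketch of the divisor rule quoted above, the variance can be computed by hand for the list from the question. The names a, x, N, population_var and sample_var are chosen here for illustration only; x holds the squared deviations from the mean, and the only thing that changes between the two results is the divisor.

```python
import numpy as np

a = np.array([1, 2, 3, 4], dtype=float)

# Squared deviations from the mean: [2.25, 0.25, 0.25, 2.25]
x = np.abs(a - a.mean()) ** 2
N = len(x)

# ddof=0: divide by N (what np.var does by default)
population_var = x.sum() / (N - 0)

# ddof=1: divide by N - 1 (what statistics.variance does)
sample_var = x.sum() / (N - 1)

print(population_var)  # 1.25
print(sample_var)      # 1.6666666666666667
```

Both values should agree with np.var(a) and np.var(a, ddof=1) respectively.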
In standard statistical practice, ddof=1 provides an unbiased estimator of the variance of a hypothetical infinite population. ddof=0 provides a maximum likelihood estimate of the variance for normally distributed variables.
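One way to see what "unbiased" means here is a small exact check rather than a simulation. The sketch below assumes a toy setup that is not part of the original answer: the values 1 to 4 are treated as the whole population, and every equally likely sample of size 2 drawn with replacement is enumerated. Averaging the ddof=1 estimate over all samples should recover the population variance, while the ddof=0 estimate comes out too small on average.

```python
from itertools import product

import numpy as np

population = [1, 2, 3, 4]
true_var = np.var(population)  # population variance, divisor N

# All 16 ordered samples of size 2 drawn with replacement (equally likely).
samples = np.array(list(product(population, repeat=2)), dtype=float)

# Average each estimator over every possible sample.
mean_unbiased = np.var(samples, axis=1, ddof=1).mean()
mean_biased = np.var(samples, axis=1, ddof=0).mean()

print(true_var)       # 1.25
print(mean_unbiased)  # 1.25  (matches the population variance)
print(mean_biased)    # 0.625 (too small by the factor (n - 1) / n = 1/2)
```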
Libraries like numpy divide by n by default (the population variance) in what they call var and in the standard deviation std, whereas statistics.variance() divides by n - 1 (the sample variance).
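The correspondence between the two packages can be sketched as follows. The statistics module provides pvariance() and pstdev() for the population versions (divisor n) next to variance() and stdev() for the sample versions (divisor n - 1), so each numpy call has a counterpart once ddof is set accordingly.

```python
import statistics

import numpy as np

data = [1, 2, 3, 4]

# Population versions: divisor n (numpy's default, ddof=0)
print(np.var(data), statistics.pvariance(data))
print(np.std(data), statistics.pstdev(data))

# Sample versions: divisor n - 1 (ddof=1 in numpy)
print(np.var(data, ddof=1), statistics.variance(data))
print(np.std(data, ddof=1), statistics.stdev(data))
```

Each printed pair should show the same value up to floating-point rounding.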
For more information, refer to the numpy documentation for numpy.var.