Calculating variance with large numbers
Question
I haven't used variance calculations much, and I don't quite know what to expect. Actually, I'm not very good at math at all.
I have an array of 1000000 random numeric values in the range 0-10000.
The array could grow even larger, so I use a 64-bit int for the sum.
I have tried to find code showing how to calculate variance, but I don't know whether I get the correct output.
The mean is 4692 and the median is 4533. I get a variance of 1483780.469308 using the following code:
// size is the element count, in this case 1000000
// value_sum is __int64
double p2 = pow( (double)(value_sum - (value_sum/size)), (double)2.0 );
double variance = sqrt( (double)(p2 / (size-1)) );
Am I getting a reasonable value?
Is something wrong with the calculation?
Recommended answer
Note: It doesn't look like you're calculating the variance.
Variance is calculated by subtracting the mean from every element, squaring each of these differences, and calculating their weighted sum.
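Written out, and assuming every element gets the same weight 1/N, the definition looks roughly like this. The symbols x_i (an element), \bar{x} (the mean), and N (the element count) are notation introduced here for illustration, not taken from the original post:

```latex
\bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i,
\qquad
\sigma^2 = \frac{1}{N}\sum_{i=1}^{N} \left(x_i - \bar{x}\right)^2
```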
So what you need to do is:
// Get mean
double mean = static_cast<double>(value_sum)/size;
// Calculate variance
double variance = 0;
for(int i = 0; i < size; ++i)
{
    variance += (MyArray[i]-mean)*(MyArray[i]-mean)/size;
}
// Display
cout<<variance;
Note that this is the sample variance, which is used when the underlying distribution is unknown (so we assume a uniform distribution, that is, every element gets the same weight 1/size).
Also, after some digging around, I found that this is not an unbiased estimator. Wolfram Alpha has something to say about this, but as an example, when MATLAB computes the variance, it returns the "bias-corrected sample variance".
The bias-corrected variance can be obtained by dividing each squared difference by size-1 instead of size, or:
//Please check that size > 1
variance += (MyArray[i]-mean)*(MyArray[i]-mean)/(size-1);
Also note that the value of mean remains the same.
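Putting the two variants side by side, a minimal self-contained sketch might look like the following. The function names (mean_of, variance_with_divisor, variance_biased, variance_corrected) and the use of std::vector<int> are assumptions made for illustration; the original post only shows MyArray, size, and value_sum.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: arithmetic mean, using a 64-bit integer sum
// as the question does.
double mean_of(const std::vector<int>& values) {
    assert(!values.empty());
    long long sum = 0;
    for (int v : values) sum += v;
    return static_cast<double>(sum) / static_cast<double>(values.size());
}

// Sum of squared differences from the mean, divided by `divisor`.
double variance_with_divisor(const std::vector<int>& values, std::size_t divisor) {
    const double mean = mean_of(values);
    double sum_sq = 0.0;
    for (int v : values) {
        const double diff = v - mean;
        sum_sq += diff * diff;
    }
    return sum_sq / static_cast<double>(divisor);
}

// Divides by size: the estimator that is not unbiased.
double variance_biased(const std::vector<int>& values) {
    return variance_with_divisor(values, values.size());
}

// Divides by size-1: the bias-corrected variant. Requires size > 1.
double variance_corrected(const std::vector<int>& values) {
    assert(values.size() > 1);
    return variance_with_divisor(values, values.size() - 1);
}
```

For the values 2, 4, 4, 4, 5, 5, 7, 9 the mean is 5 and the squared differences sum to 32, so variance_biased returns 32/8 = 4 and variance_corrected returns 32/7. Unlike the loop above, this sketch divides once after summing rather than inside every iteration; the result is the same up to floating-point rounding.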