Efficient way to compute geometric mean of many numbers
Problem description
I need to compute the geometric mean of a large set of numbers whose values are not bounded a priori. The naive way would be
#include <cmath>
#include <vector>

double geometric_mean(std::vector<double> const& data) // failure
{
    auto product = 1.0;
    for (auto x : data) product *= x;
    return std::pow(product, 1.0 / data.size());
}
However, this may well fail because of underflow or overflow in the accumulated product (note: long double doesn't really avoid this problem). So the next option is to sum up the logarithms:
#include <cmath>
#include <vector>

double geometric_mean(std::vector<double> const& data)
{
    auto sum_log = 0.0;
    for (auto x : data) sum_log += std::log(x);
    return std::exp(sum_log / data.size());
}
This works, but it calls std::log() for every element, which is potentially slow. Can I avoid that? For example, by keeping track of (the equivalent of) the exponent and the mantissa of the accumulated product separately?
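To make the failure mode concrete, here is a minimal sketch that puts the two versions from the question side by side. The names geometric_mean_naive and geometric_mean_log are chosen here for illustration only, and the assert on empty input is an added precondition that is not in the original code.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Naive version from the question: accumulates the raw product,
// which can leave the range of double long before the loop ends.
double geometric_mean_naive(std::vector<double> const& data)
{
    assert(!data.empty()); // added precondition (assumption)
    double product = 1.0;
    for (double x : data) product *= x;
    return std::pow(product, 1.0 / data.size());
}

// Log-sum version from the question: one std::log call per element,
// but the running sum stays well inside the range of double.
double geometric_mean_log(std::vector<double> const& data)
{
    assert(!data.empty()); // added precondition (assumption)
    double sum_log = 0.0;
    for (double x : data) sum_log += std::log(x);
    return std::exp(sum_log / data.size());
}
```

With IEEE 754 doubles, feeding 1000 copies of 1e300 to the naive version overflows the product to infinity after the second multiplication, and 1000 copies of 1e-300 underflow it to zero. The log-sum version returns a value close to 1e300 for the first input.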
Recommended answer
The "split exponent and mantissa" solution:
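#include <cmath>
#include <limits>
#include <vector>

double geometric_mean(std::vector<double> const& data)
{
    double m = 1.0;      // running product of the mantissas
    long long ex = 0;    // running sum of the exponents
    double invN = 1.0 / data.size();
    for (double x : data)
    {
        int i;
        double f1 = std::frexp(x, &i); // x == f1 * 2^i
        m *= f1;
        ex += i;
    }
    return std::pow(std::numeric_limits<double>::radix, ex * invN) * std::pow(m, invN);
}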
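The identity behind this code can be written out as follows. The symbols f_i and e_i are notation introduced here for the mantissa and exponent that std::frexp returns for x_i, assuming positive inputs and radix 2:

```latex
x_i = f_i \cdot 2^{e_i}, \qquad f_i \in [0.5,\, 1)

\left( \prod_{i=1}^{N} x_i \right)^{1/N}
  = \left( \prod_{i=1}^{N} f_i \right)^{1/N} \cdot 2^{\frac{1}{N} \sum_{i=1}^{N} e_i}
```

In the code, m holds the product of the f_i and ex holds the sum of the e_i. Because every f_i is at least 0.5, m cannot drop below 2^{-N}, so it stays representable for moderately sized inputs but can still underflow once N reaches the order of a thousand.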
If you are concerned that ex might overflow, you can define it as a double instead of a long long and multiply by invN at every step, but you might lose a lot of precision with this approach.
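A minimal sketch of that variant might look like the following. The function name geometric_mean_scaled_exponent and the variable scaled_ex are hypothetical, the assert is an added precondition, and std::exp2 is used on the assumption that the radix is 2.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical variant: the exponent sum is kept as a double that is
// already scaled by 1/N, so it cannot overflow an integer type.
double geometric_mean_scaled_exponent(std::vector<double> const& data)
{
    assert(!data.empty()); // added precondition (assumption)
    double const invN = 1.0 / data.size();
    double m = 1.0;         // product of mantissas, each in [0.5, 1)
    double scaled_ex = 0.0; // running sum of exponent / N
    for (double x : data)
    {
        int e;
        m *= std::frexp(x, &e);
        scaled_ex += e * invN; // every addition may round
    }
    return std::exp2(scaled_ex) * std::pow(m, invN);
}
```

The rounding in each scaled_ex addition is where the precision loss comes from. Note that this sketch does nothing about the mantissa product m, which can still underflow for long inputs.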
EDIT: For large inputs, we can split the computation into several buckets:
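#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

double geometric_mean(std::vector<double> const& data)
{
    long long ex = 0; // sum of all exponents, shared by every bucket
    // Multiplies the mantissas of data[first, last) and adds their exponents to ex.
    auto do_bucket = [&data, &ex](std::size_t first, std::size_t last) -> double
    {
        double ans = 1.0;
        for (; first != last; ++first)
        {
            int i;
            ans *= std::frexp(data[first], &i);
            ex += i;
        }
        return ans;
    };
    // Each mantissa is at least 0.5, so a bucket of this size cannot underflow.
    const std::size_t bucket_size =
        static_cast<std::size_t>(-std::log2(std::numeric_limits<double>::min()));
    std::size_t buckets = data.size() / bucket_size;
    double invN = 1.0 / data.size();
    double m = 1.0;
    for (std::size_t i = 0; i < buckets; ++i)
        m *= std::pow(do_bucket(i * bucket_size, (i + 1) * bucket_size), invN);
    // Remaining elements that do not fill a whole bucket.
    m *= std::pow(do_bucket(buckets * bucket_size, data.size()), invN);
    return std::pow(std::numeric_limits<double>::radix, ex * invN) * m;
}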
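One possible alternative to fixed-size buckets, which is not part of the original answer, is to renormalize the running mantissa with std::frexp after every multiplication so that it never leaves [0.5, 1). The sketch below assumes positive inputs and radix 2. The name geometric_mean_renormalized and the assert are introduced here for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical variant: keep the mantissa product normalized at each step.
double geometric_mean_renormalized(std::vector<double> const& data)
{
    assert(!data.empty()); // added precondition (assumption)
    double m = 1.0;   // normalized running mantissa
    long long ex = 0; // running sum of exponents
    for (double x : data)
    {
        int e;
        m *= std::frexp(x, &e); // m is now in [0.25, 1)
        ex += e;
        m = std::frexp(m, &e);  // back into [0.5, 1); e is -1 or 0
        ex += e;
    }
    double const invN = 1.0 / data.size();
    return std::exp2(ex * invN) * std::pow(m, invN);
}
```

This trades the per-bucket std::pow calls for a second std::frexp per element, so whether it is faster than the bucketed version or than the plain log-sum would have to be measured on the target platform.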