Python中的日志计算 [英] Log-computations in Python
问题描述
我正在计算类似:
f(i)
是一个函数,它为{1,2,...,5000}
中的任何i
返回[-1,1]
中的实数.
Where f(i)
is a function that returns a real number in [-1,1]
for any i
in {1,2,...,5000}
.
很明显,总和的结果在[-1,1]
中的某个位置,但是当我似乎无法使用直接编码在Python中进行计算时,由于0.55000
变为0
并且comb(5000,2000)
inf
,导致计算出的总和变成NaN
.
Obviously, the result of the sum is somewhere in [-1,1]
, but when I can't seem to be able to compute it in Python using straight forward coding, as 0.55000
becomes 0
and comb(5000,2000)
becomes inf
, which result in the computed sum turning into NaN
.
所需的解决方案是在两面都使用日志.
The required solution is to use log on both sides.
使用身份a × b = 2log(a) + log(b)
,即使我可以计算log(a)
和log(b)
,我也可以计算总和,即使a
大而b
几乎是0
.
That is using the identity a × b = 2log(a) + log(b)
, if I could compute log(a)
and log(b)
I could compute the sum, even if a
is big and b
is almost 0
.
所以我想我要问的是是否有一种简单的计算方法
log2(scipy.misc.comb(5000,2000))
所以我可以简单地通过计算
So I could compute my sum simply by
sum([2**(log2comb(5000,i)-5000) * f(i) for i in range(1,5000) ])
@abarnert的解决方案在为5000图形工作时,通过提高计算梳齿的精度解决了该问题.此方法适用于此示例,但无法扩展,因为例如,如果不是5000,我们有1e7,则所需的内存将大大增加.
@abarnert's solution, while working for the 5000 figure addresses the problem by increasing the precision in which the comb is computed. This works for this example, but doesn't scale, as the memory required would significantly increase if instead of 5000 we had 1e7 for example.
当前,我使用的是一种丑陋的解决方法,但保持较低的内存消耗:
Currently, I'm using a workaround which is ugly, but keeps memory consumption low:
log2(comb(5000,2000)) = sum([log2 (x) for x in 1:5000])-sum([log2 (x) for x in 1:2000])-sum([log2 (x) for x in 1:3000])
在可读的表达式中有这样做的方法吗?
推荐答案
总和
是f
对具有p = 0.5
.
您可以使用 scipy.stats.binom.expect :
import scipy.stats as stats
def f(i):
return i
n, p = 5000, 0.5
print(stats.binom.expect(f, (n, p), lb=0, ub=n))
# 2499.99999997
还请注意,随着n
变为无穷大且固定了p
,二项式分布接近均值np
和方差np*(1-p)
的正态分布.因此,对于较大的n
,您可以改为计算:
Also note that as n
goes to infinity, with p
fixed, the binomial distribution approaches the normal distribution with mean np
and variance np*(1-p)
. Therefore, for large n
you can instead compute:
import math
print(stats.norm.expect(f, loc=n*p, scale=math.sqrt((n*p*(1-p))), lb=0, ub=n))
# 2500.0
这篇关于Python中的日志计算的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!