Python中的日志计算 [英] Log-computations in Python

查看：111 发布时间：2020/4/29 3:23:14 python scipy large-data

本文介绍了Python中的日志计算的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

我正在计算类似:

f(i)是一个函数，它为{1,2,...,5000}中的任何i返回[-1,1]中的实数.

Where f(i) is a function that returns a real number in [-1,1] for any i in {1,2,...,5000}.

很明显，总和的结果在[-1,1]中的某个位置，但是当我似乎无法使用直接编码在Python中进行计算时，由于0.5⁵⁰⁰⁰变为0并且comb(5000,2000) inf，导致计算出的总和变成NaN.

Obviously, the result of the sum is somewhere in [-1,1], but when I can't seem to be able to compute it in Python using straight forward coding, as 0.5⁵⁰⁰⁰ becomes 0 and comb(5000,2000) becomes inf, which result in the computed sum turning into NaN.

所需的解决方案是在两面都使用日志.

The required solution is to use log on both sides.

使用身份a × b = 2^{log(a) + log(b)}，即使我可以计算log(a)和log(b)，我也可以计算总和，即使a大而b几乎是0.

That is using the identity a × b = 2^{log(a) + log(b)}, if I could compute log(a) and log(b) I could compute the sum, even if a is big and b is almost 0.

所以我想我要问的是是否有一种简单的计算方法

log2(scipy.misc.comb(5000,2000))

所以我可以简单地通过计算

So I could compute my sum simply by

sum([2**(log2comb(5000,i)-5000) * f(i) for i in range(1,5000) ])

@abarnert的解决方案在为5000图形工作时，通过提高计算梳齿的精度解决了该问题.此方法适用于此示例，但无法扩展，因为例如，如果不是5000，我们有1e7，则所需的内存将大大增加.

@abarnert's solution, while working for the 5000 figure addresses the problem by increasing the precision in which the comb is computed. This works for this example, but doesn't scale, as the memory required would significantly increase if instead of 5000 we had 1e7 for example.

当前，我使用的是一种丑陋的解决方法，但保持较低的内存消耗:

Currently, I'm using a workaround which is ugly, but keeps memory consumption low:

log2(comb(5000,2000)) = sum([log2 (x) for x in 1:5000])-sum([log2 (x) for x in 1:2000])-sum([log2 (x) for x in 1:3000])

在可读的表达式中有这样做的方法吗?

推荐答案

总和

是f对具有二项式分布的期望>和p = 0.5.

您可以使用 scipy.stats.binom.expect :

import scipy.stats as stats

def f(i):
    return i
n, p = 5000, 0.5
print(stats.binom.expect(f, (n, p), lb=0, ub=n))
# 2499.99999997

还请注意，随着n变为无穷大且固定了p，二项式分布接近均值np和方差np*(1-p)的正态分布.因此，对于较大的n，您可以改为计算:

Also note that as n goes to infinity, with p fixed, the binomial distribution approaches the normal distribution with mean np and variance np*(1-p). Therefore, for large n you can instead compute:

import math
print(stats.norm.expect(f, loc=n*p, scale=math.sqrt((n*p*(1-p))), lb=0, ub=n))
# 2500.0

这篇关于Python中的日志计算的文章就介绍到这了，希望我们推荐的答案对大家有所帮助，也希望大家多多支持IT屋！

查看全文

Python中的日志计算 [英] Log-computations in Python

问题描述

推荐答案

相关文章

Python最新文章

热门教程

热门工具

登录关闭

Python中的日志计算 [英] Log-computations in Python

问题描述

推荐答案

相关文章

Python最新文章

热门教程

热门工具

登录 关闭

登录关闭