Python 2.7 memory leak with scipy.minimize
Problem description
During a fit procedure, my RAM usage slowly but steadily increases (about 2.8 MB every couple of seconds) until I get a memory error or terminate the program. This happens when I try to fit a model to some 80 measurements. The fitting is done by using scipy.optimize.minimize to minimize Chi_squared.
So far I've tried:
- Playing with the garbage collector, collecting every time Chi_squared calls my model; this didn't help.
- Looking at all variables using globals() and then using pympler.asizeof to find the total amount of space my variables take up; this first increases but then stays constant.
- The pympler.tracker.SummaryTracker also didn't show any increase in variable size.
From these tests, it seems that my RAM usage goes up while the total space my variables take up stays constant. I would really like to know where my memory goes.
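For reference, here is a minimal stdlib-only sketch (not part of my original measurements) of how to watch process memory from inside Python, which is how the per-iteration growth above can be quantified. Note that `ru_maxrss` is reported in kilobytes on Linux but in bytes on macOS:

```python
import resource

def peak_rss_mb():
    # Peak resident set size of this process, in MB.
    # Caveat: ru_maxrss is kilobytes on Linux, bytes on macOS.
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1024.0

before = peak_rss_mb()
junk = [list(range(1000)) for _ in range(1000)]  # simulate the allocations a fit step makes
after = peak_rss_mb()
print("peak RSS grew by %.1f MB" % (after - before))
```

Logging this value once per minimizer iteration makes the steady growth visible even when Python-level object counts stay flat.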
The code below reproduces the problem for me:
```python
import numpy as np
import scipy
import scipy.optimize as op
import scipy.stats
import scipy.integrate


def fit_model(model_pmt, x_list, y_list, PMT_parra, PMT_bounds=None,
              tolerance=10**-1, PMT_start_gues=None):
    result = op.minimize(chi_squared, PMT_start_gues,
                         args=(x_list, y_list, model_pmt,
                               PMT_parra[0], PMT_parra[1], PMT_parra[2]),
                         bounds=PMT_bounds, method='SLSQP',
                         options={"ftol": tolerance})
    print result


def chi_squared(fit_parm, x, y_val, model, *non_fit_parm):
    parm = np.concatenate((fit_parm, non_fit_parm))
    y_mod = model(x, *parm)
    X2 = sum(pow(y_val - y_mod, 2))
    return X2


def basic_model(cb_list, max_intesity, sigma_e, noise, N, centre1, centre2,
                sigma_eb, min_dist=10**-5):
    """Plateau function consisting of two Gaussian CDF functions."""
    def get_distance(x, r):
        dist = abs(x - r)
        if dist < min_dist:
            dist = min_dist
        return dist

    def amount_of_material(x):
        A = scipy.stats.norm.cdf((x - centre1) / sigma_e)
        B = (1 - scipy.stats.norm.cdf((x - centre2) / sigma_e))
        cube = A * B
        return cube

    def amount_of_field_INTEGRAL(x, cb):
        """Integral that is part of my sum"""
        result = scipy.integrate.quad(
            lambda r: scipy.stats.norm.pdf((r - cb) / sigma_b)
            / pow(get_distance(x, r), N),
            start, end, epsabs=10**-1)[0]
        return result

    # Set some constants, not important
    sigma_b = (sigma_eb**2 - sigma_e**2)**0.5
    start, end = centre1 - 3 * sigma_e, centre2 + 3 * sigma_e
    integration_range = np.linspace(start, end, int(end - start) / 20)
    intensity_list = []

    # Doing a Riemann sum, this is what takes the most time.
    for i, cb_point in enumerate(cb_list):
        intensity = sum([amount_of_material(x) * amount_of_field_INTEGRAL(x, cb_point)
                         for x in integration_range])
        intensity *= (integration_range[1] - integration_range[0])
        intensity_list.append(intensity)

    model_values = np.array(intensity_list) / max(intensity_list) * max_intesity + noise
    return model_values


def get_dummy_data():
    """Can be ignored, produces something resembling my data with noise"""
    # X is just a range
    x_list = np.linspace(0, 300, 300)
    # Y is some sort of step function with noise
    A = scipy.stats.norm.cdf((x_list - 100) / 15.8)
    B = (1 - scipy.stats.norm.cdf((x_list - 200) / 15.8))
    y_list = A * B * .8 + .1 + np.random.normal(0, 0.05, 300)
    return x_list, y_list


if __name__ == "__main__":
    # Set some variables
    start_pmt = [0.7, 8, 0.15, 0.6]
    pmt_bounds = [(.5, 1.3), (4, 15), (0.05, 0.3), (0.5, 3)]
    pmt_par = [110, 160, 15]
    x_list, y_list = get_dummy_data()
    fit_model(basic_model, x_list, y_list, pmt_par, PMT_start_gues=start_pmt,
              PMT_bounds=pmt_bounds, tolerance=0.1)
```
Thanks for your help!
Accepted answer
I narrowed down the problem by successively removing layer after layer of indirection. (@joris267: this is something you really should have done yourself before asking.) The minimal remaining code that reproduces the problem looks like this:
```python
import scipy.integrate

if __name__ == "__main__":
    while True:
        scipy.integrate.quad(lambda r: 0, 1, 100)
```
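This also explains why pympler and globals() showed nothing in the question: the leaked memory is allocated on the C side inside quad, which Python-level tools cannot see. Here is a small illustrative harness (my own sketch, not part of the original answer) that counts gc-tracked objects; a pure-Python leak shows up here, while a C-level leak leaves the count flat even as process RSS grows:

```python
import gc

def live_object_delta(fn, iterations=100):
    # Net number of gc-tracked Python objects kept alive by calling
    # fn repeatedly. A C-level leak (as in scipy 0.19.0's quad)
    # leaves this near zero while process memory still grows.
    gc.collect()
    before = len(gc.get_objects())
    for _ in range(iterations):
        fn()
    gc.collect()
    return len(gc.get_objects()) - before

leak = []  # deliberately leaky: keeps every appended list alive
print(live_object_delta(lambda: leak.append([])))  # clearly positive
print(live_object_delta(lambda: [].sort()))        # roughly zero
```

When this delta stays near zero but memory keeps climbing, suspect the extension-module layer rather than your own Python objects.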
Conclusions:

- Yes, there is a memory leak.
- No, the leak is not in scipy.minimize but in scipy.integrate.quad.
However, this is a known issue with scipy 0.19.0. Upgrading to 0.19.1 should fix the problem, but I don't know for sure because I'm still on 0.19.0 myself :)
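A quick way to confirm which scipy is actually installed before and after the upgrade (my own sketch; the plain tuple comparison assumes simple release strings like "0.19.1", not pre-release tags):

```python
def version_tuple(v):
    # "0.19.1" -> (0, 19, 1); enough for comparing plain release numbers
    return tuple(int(p) for p in v.split(".")[:3])

# With scipy installed you would check, e.g.:
# import scipy
# if version_tuple(scipy.__version__) < (0, 19, 1):
#     print("still on a version with the quad leak -- upgrade")
print(version_tuple("0.19.0") < (0, 19, 1))  # the leaking version compares lower
```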
Update:
After upgrading scipy to 0.19.1 (and numpy to 1.13.3 for compatibility), the leak disappeared on my system.