Numpy array memory issue

Question

I believe I am having a memory issue using numpy arrays. The following code is being run for hours on end:

    new_data = npy.array([new_x, new_y1, new_y2, new_y3])
    private.data = npy.row_stack([private.data, new_data])

where new_x, new_y1, new_y2, new_y3 are floats.

After about 5 hours of recording this data every second (more than 72000 floats), the program becomes unresponsive. What I think is happening is some kind of realloc and copy operation that is swamping the process. Does anyone know if this is what is happening?

I need a way to record this data without encountering this slowdown issue. There is no way to know even approximately the size of this array beforehand. It does not necessarily need to use a numpy array, but it needs to be something similar. Does anyone know of a good method?

Recommended answer

Update: I incorporated @EOL's excellent indexing suggestion into the answer.

The problem might be the way row_stack grows the destination: every call builds a new array and copies all existing rows into it. You might be better off handling the reallocation yourself. The following code allocates one big zero-filled array, fills it row by row, and grows it by an hour's worth of rows whenever it fills up:

    import numpy as npy

    numcols = 4
    growsize = 60*60  # 60 samples/min * 60 min/hour
    numrows = 3*growsize  # 3 hours, to start with
    private.data = npy.zeros([numrows, numcols])  # alloc one big memory block
    rowctr = 0
    while recording:
        # new_x, new_y1, new_y2, new_y3 are the floats sampled this second
        private.data[rowctr] = npy.array([new_x, new_y1, new_y2, new_y3])
        rowctr += 1
        if rowctr == numrows:  # full, grow by another hour's worth of data
            private.data = npy.row_stack([private.data, npy.zeros([growsize, numcols])])
            numrows += growsize
    # When recording stops, private.data[:rowctr] holds the recorded rows;
    # the remaining rows are unused zero padding.

This should keep the memory manager from thrashing around too much. I tried this versus row_stack on each iteration and it ran a couple of orders of magnitude faster.
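
As a variation on the same idea, here is a minimal sketch that wraps the preallocated buffer in a small class. The class name `GrowableRows`, its methods, and the choice to double the capacity (rather than grow by a fixed hour's worth of rows) are assumptions for illustration and not part of the original answer. It also uses the conventional `np` alias instead of `npy`.

```python
import numpy as np


class GrowableRows:
    """Hypothetical helper: a 2-D float buffer that doubles in capacity when full."""

    def __init__(self, numcols, capacity=3600):
        self._buf = np.zeros((capacity, numcols))  # one block allocated up front
        self._count = 0  # number of rows actually written

    def append(self, row):
        capacity = self._buf.shape[0]
        if self._count == capacity:
            # Full: allocate a buffer twice as large and copy the old rows once.
            bigger = np.zeros((max(1, 2 * capacity), self._buf.shape[1]))
            bigger[:self._count] = self._buf
            self._buf = bigger
        self._buf[self._count] = row
        self._count += 1

    @property
    def data(self):
        # View of the filled rows only; the unused tail is excluded.
        return self._buf[:self._count]


# Tiny starting capacity so the buffer has to regrow a couple of times.
log = GrowableRows(numcols=4, capacity=4)
for i in range(10):
    log.append([float(i), 1.0, 2.0, 3.0])
```

Reading `log.data` at any point gives only the rows recorded so far, so the zero padding never leaks into later processing. Doubling means the number of reallocations grows with the logarithm of the row count, at the cost of up to half the buffer being unused.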
