Python randomly drops to 0% CPU usage, causing the code to "hang up", when handling large numpy arrays?
Question
I have been running some code, a part of which loads in a large 1D numpy array from a binary file, and then alters the array using the numpy.where() method.
Here is an example of the operations performed in the code:
import numpy as np

num = 2048
threshold = 0.5

with open(file, 'rb') as f:  # 'file' holds the path to the 32 GB binary file
    arr = np.fromfile(f, dtype=np.float32, count=num**3)

arr *= threshold
arr = np.where(arr >= 1.0, 1.0, arr)  # clip values at 1.0
vol_avg = np.sum(arr)/(num**3)
# both arr and vol_avg needed later
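As a side note (not part of the original question), the `np.where` clipping step can equivalently be written with `np.minimum`, which clips in a single call and can operate in place via its `out=` argument. A minimal sketch on a small stand-in array (the real code uses `num**3` float32 values):

```python
import numpy as np

# Small stand-in for the question's huge array.
arr = np.array([0.2, 1.6, 3.0, 0.9], dtype=np.float32)
arr *= 0.5  # threshold = 0.5

# Equivalent to np.where(arr >= 1.0, 1.0, arr), but in place,
# so no extra boolean mask or output array is allocated:
np.minimum(arr, 1.0, out=arr)

vol_avg = np.sum(arr) / arr.size
print(arr)      # [0.1  0.8  1.   0.45]
print(vol_avg)  # 0.5875
```

For a 32 GB array the in-place form also halves the peak memory of this step, since `np.where` would build a second full-size array.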
I have run this many times (on a free machine, i.e. with no other processes competing for CPU or memory) with no issue. But recently I have noticed that the code sometimes hangs for an extended period, making the runtime an order of magnitude longer. On these occasions I have been monitoring %CPU and memory usage (using GNOME System Monitor), and found that Python's CPU usage drops to 0%.
Using basic prints in between the above operations to debug, it seems to be arbitrary as to which operation causes the pausing (i.e. open(), np.fromfile(), np.where() have each separately caused a hang on a random run). It is as if I am being throttled randomly, because on other runs there are no hangs.
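The bare prints can be made more informative by timing each step, so the log records which operation stalled and for how long. A minimal sketch, with a hypothetical `timed` helper and a small temp file standing in for the 32 GB binary:

```python
import os
import tempfile
import time

import numpy as np

def timed(label, fn, *args, **kwargs):
    """Run fn, printing its duration; flush so output appears even if a later step hangs."""
    t0 = time.perf_counter()
    result = fn(*args, **kwargs)
    print(f"{label}: {time.perf_counter() - t0:.3f}s", flush=True)
    return result

# Stand-in for the real 32 GB file: 16 float32 values in a temp file.
path = os.path.join(tempfile.gettempdir(), "demo_arr.bin")
np.arange(16, dtype=np.float32).tofile(path)

with open(path, 'rb') as f:
    arr = timed("np.fromfile", np.fromfile, f, dtype=np.float32, count=16)
arr = timed("np.where", np.where, arr >= 8.0, 8.0, arr)

os.remove(path)
```

With `flush=True`, the last line printed before a hang reliably identifies the step that is blocked, which ordinary buffered prints may not.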
I have considered things like garbage collection or this question, but I cannot see any obvious relation to my problem (for example keystrokes have no effect).
Further notes: the binary file is 32GB, the machine (running Linux) has 256GB memory. I am running this code remotely, via an ssh session.
This may be incidental, but I have noticed that there are no hang ups if I run the code after the machine has just been rebooted. It seems they begin to happen after a couple of runs, or at least other usage of the system.
Answer
The drops in CPU usage were unrelated to Python or NumPy, but were indeed the result of reading from a shared disk; network I/O was the real culprit. For arrays this large, reading into memory can be the dominant bottleneck.
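One way to confirm this diagnosis (an illustrative sketch, not from the original answer) is to compare the throughput of the raw read against the in-memory operations; on a slow network filesystem the read rate will be orders of magnitude lower. A small local file stands in for the shared-disk array here:

```python
import os
import tempfile
import time

import numpy as np

# Stand-in file; in the question this would be the 32 GB array on the shared disk.
path = os.path.join(tempfile.gettempdir(), "throughput_demo.bin")
n = 1_000_000
np.random.default_rng(0).random(n, dtype=np.float32).tofile(path)

t0 = time.perf_counter()
with open(path, 'rb') as f:
    arr = np.fromfile(f, dtype=np.float32, count=n)
read_s = time.perf_counter() - t0

t0 = time.perf_counter()
arr = np.where(arr >= 0.5, 0.5, arr)
compute_s = time.perf_counter() - t0

mb = arr.nbytes / 1e6
print(f"read: {mb/read_s:.0f} MB/s, compute: {mb/compute_s:.0f} MB/s", flush=True)
os.remove(path)
```

If the read rate is the outlier, copying the file to local scratch storage first (e.g. with `shutil.copy`) and reading it from there moves the network cost to a single predictable transfer instead of stalling mid-computation.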