RAM usage after importing numpy in python 3.7.2


Problem description

I run conda 4.6.3 with python 3.7.2 win32. In python, when I import numpy, I see the RAM usage increase by 80 MB. Since I am using multiprocessing, I wonder if this is normal and if there is any way to avoid this RAM overhead? Please see below all the versions from relevant packages (from conda list):

python..........3.7.2 h8c8aaf0_2
mkl_fft.........1.0.10 py37h14836fe_0
mkl_random......1.0.2 py37h343c172_0
numpy...........1.15.4 py37h19fb1c0_0
numpy-base......1.15.4 py37hc3f5095_0

Thanks!
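For context, a minimal way to reproduce the measurement is a before/after check of the process's resident set size around the import. This is only a sketch and assumes the psutil package, which is not among the packages listed above:

```python
# Sketch: measure the RSS increase caused by importing numpy.
# Assumes psutil is installed; it is not part of the question's package list.
import psutil

proc = psutil.Process()
before = proc.memory_info().rss  # resident set size before the import

import numpy  # the import whose cost we want to observe

after = proc.memory_info().rss
print(f"RSS increase from importing numpy: {(after - before) / 2**20:.1f} MiB")
```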

Recommended answer

You can't avoid this cost, but it's likely not as bad as it seems. The numpy libraries (a copy of the C-only libopenblasp, plus all of the Python numpy extension modules) occupy over 60 MB on disk, and all of them are memory-mapped into your Python process on import; add in all the Python modules and the dynamically allocated memory involved in loading and initializing everything, and an 80 MB increase in reported RAM usage is pretty normal.
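To see what actually got mapped, one can list the numpy/MKL/OpenBLAS files in the process's memory maps and how much of each is currently resident. This is only an illustrative sketch (it again assumes psutil, and memory map details differ between Windows and Linux):

```python
# Sketch: show which numpy-related libraries are mapped into the process
# and how much of each mapping is actually resident right now.
import psutil

import numpy  # maps the numpy extension modules and their C dependencies

proc = psutil.Process()
for m in proc.memory_maps(grouped=True):
    name = m.path.lower()
    if any(key in name for key in ("numpy", "openblas", "mkl")):
        print(f"{m.path}: resident {m.rss / 2**20:.1f} MiB")
```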

That said:

  1. The C libraries and Python extension modules are memory-mapped in, but that doesn't actually mean they occupy "real" RAM; if the code paths in a given page aren't exercised, the page will either never be loaded or will be dropped under memory pressure (not even written to the page file, since the OS can always reload it from the original DLL).
  2. On UNIX-like systems, when you fork (which multiprocessing does by default everywhere except Windows), that memory is shared between the parent and worker processes in copy-on-write mode. Since the code itself is generally not written to, the only cost is the page tables themselves (a tiny fraction of the memory they reference), and parent and child will share that RAM; see the sketch after this list.
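A minimal sketch of point 2 (POSIX only; the "fork" context is not available on Windows): import numpy once in the parent, then fork the workers so the already-initialized numpy pages are inherited copy-on-write instead of being paid for again per process.

```python
# Sketch: fork-based workers share the parent's already-loaded numpy pages.
import multiprocessing as mp

import numpy as np  # imported in the parent, before the workers are forked


def worker(x):
    # numpy is already initialized here without a fresh import, because the
    # forked child inherits the parent's address space copy-on-write.
    return np.sqrt(x)


if __name__ == "__main__":
    ctx = mp.get_context("fork")  # raises ValueError on Windows
    with ctx.Pool(2) as pool:
        print(pool.map(worker, [1.0, 4.0, 9.0]))
```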

Sadly, on Windows, fork isn't an option (unless you're running Ubuntu bash on Windows, in which case it's only barely Windows and effectively Linux), so you'll likely pay more of the memory cost in each process. But even there, libopenblasp, the C library backing large parts of numpy, will be remapped in each process, and the OS should properly share that read-only memory across processes (and large parts, if not all, of the Python extension modules as well).
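To observe the per-process cost under the "spawn" start method (the default on Windows), each worker can report its own resident set size after its fresh interpreter has re-imported numpy. Again just a sketch that assumes psutil:

```python
# Sketch: with "spawn", every worker is a new interpreter that re-imports
# numpy and reports its own private memory footprint.
import multiprocessing as mp

import numpy as np  # re-imported independently in every spawned worker
import psutil


def report(_):
    proc = psutil.Process()
    rss = proc.memory_info().rss
    return f"pid {proc.pid}: rss {rss / 2**20:.1f} MiB"


if __name__ == "__main__":
    ctx = mp.get_context("spawn")  # the default start method on Windows
    with ctx.Pool(2) as pool:
        for line in pool.map(report, range(2)):
            print(line)
```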

Basically, until this actually causes a problem (and it's unlikely to do so), don't worry about it.

