Will os.fork() use copy on write or do a full copy of the parent-process in Python?


Question

I would like to load a rather large data structure into a process and then fork, in the hope of reducing total memory consumption. Will os.fork work that way, or will it copy all of the parent process on Linux (RHEL)?

Recommended answer

Even if COW is employed, CPython uses reference counting and stores the reference count in each object's header. So unless you do nothing at all with that data, you'll quickly have spurious writes to the memory in question, which will force the system to copy the affected pages. Pass it to a function? That's another reference, an INCREF, a write to the COW'd memory. Store it in a variable or object attribute? Same. Even just look up a method on it? Ditto.

Some builtin data structures (e.g. most collections) allocate the bulk of their data separately from the object header for various reasons. If that data ends up on a different page (or whatever granularity COW works on), you may get lucky with it. However, an object referenced from such a collection is not exempt: using it manipulates its refcount just the same.
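To make the refcount point concrete, here is a minimal CPython-only sketch. The names `data`, `alias` and `holder` are hypothetical and stand in for an object inside the large structure. It relies only on `sys.getrefcount`, and the absolute numbers it reports vary between interpreter versions; only the differences matter.

```python
import sys

data = object()  # hypothetical stand-in for one object inside a large structure

before = sys.getrefcount(data)

alias = data                      # storing it in another variable adds a reference
after_alias = sys.getrefcount(data)

holder = [data]                   # so does placing it in a container
after_holder = sys.getrefcount(data)

# Each step above changed the count stored in the object's own header,
# i.e. it wrote to the memory page that holds the object.
print(after_alias - before, after_holder - before)
```

Neither step modifies the object from the program's point of view, yet each one should raise the stored count by one. In a forked child, that write is what would trigger a copy of the page.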

In addition, a bit of data will be shared because it is never written to by design (e.g., the native CPython code), and some objects your fork'd process does not touch may stay shared (I'm honestly not sure; I think the cycle GC does not write to the object). But Python objects used by Python code are virtually guaranteed to get written to. Similar reasoning applies to PyPy, Jython, IronPython, etc. (except that they fiddle with bits in the object header instead of doing reference counting), though I can't vouch for all possible configurations.
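For reference, the load-then-fork pattern from the question can be sketched as below. This is an illustration under assumptions, not a measurement: `sum_in_child` and `big` are hypothetical names, the code is Unix-only, and it shows only that the child sees the parent's data without reloading it. Following the reasoning above, iterating over the list in the child touches each element's refcount in CPython, so the pages holding those elements are likely to be copied anyway.

```python
import os

def sum_in_child(data):
    """Fork, let the child read `data`, and return the child's result to the parent."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()  # Unix only
    if pid == 0:
        # Child: `data` is visible here without being reloaded or pickled.
        os.close(read_fd)
        try:
            total = sum(data)  # reading the elements still updates their refcounts
            os.write(write_fd, str(total).encode())
        finally:
            os._exit(0)  # leave immediately, skipping the parent's cleanup handlers
    # Parent: collect the child's answer and reap the child.
    os.close(write_fd)
    with os.fdopen(read_fd, "rb") as reader:
        result = int(reader.read().decode())
    os.waitpid(pid, 0)
    return result

big = list(range(1_000_000))  # hypothetical stand-in for the large data structure
print(sum_in_child(big))
```

Comparing the child's shared and private memory before and after the loop (for example via `/proc/<pid>/smaps` on Linux) would be one way to check how much actually stays shared on a given system.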
