How does copy-on-write work in fork()?


Problem description

I want to know how copy-on-write happens in fork().

Assume we have a process A that has a dynamically allocated int array:

int *array = malloc(1000000*sizeof(int));

The elements of array are initialized to some meaningful values. Then we use fork() to create a child process, B. B iterates over the array and does some calculations:

for(a in array){
    a = a+1;
}

  1. I know B will not copy the entire array immediately, but when does the child B allocate memory for the array? During fork()?
  2. Does it allocate the entire array at once, or only a single integer for a = a+1?
  3. How does a = a+1 happen? Does B read the data from A and write the new data to its own array?

I wrote some code to explore how COW works. My environment: Ubuntu 14.04, gcc 4.8.2

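#include <stdlib.h>
#include <stdio.h>
#include <sys/sysinfo.h>
#include <unistd.h>    /* fork(), sleep() */

/* Print system-wide total and free RAM as reported by sysinfo(2). */
void printMemStat(void){
    struct sysinfo si;
    sysinfo(&si);
    printf("===\n");
    /* totalram and freeram are unsigned long, counted in units of si.mem_unit bytes */
    printf("Total: %lu\n", si.totalram);
    printf("Free: %lu\n", si.freeram);
}

int main(void){
    long len = 200000000;
    long *array = malloc(len*sizeof(long));
    if(array == NULL){
        perror("malloc");
        return 1;
    }
    long i = 0;
    /* write every element so that all pages are backed by physical memory */
    for(; i<len; i++){
        array[i] = i;
    }

    printMemStat();
    if(fork()==0){
        /* child */
        printMemStat();

        /* phase 1: modify the first half of the array */
        i = 0;
        for(; i<len/2; i++){
            array[i] = i+1;
        }

        printMemStat();

        /* phase 2: modify the whole array */
        i = 0;
        for(; i<len; i++){
            array[i] = i+1;
        }

        printMemStat();

    }else{
        /* parent: stay alive while the child runs */
        int times=10;
        while(times-- > 0){
            sleep(1);
        }
    }
    return 0;
}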

After fork(), the child process modifies half of the numbers in the array, and then modifies the entire array. The output is:

===
Total: 16694571008
Free: 2129162240
===
Total: 16694571008
Free: 2126106624
===
Total: 16694571008
Free: 1325101056
===
Total: 16694571008
Free: 533794816

It seems that the array is not copied as a whole. If I slightly change the first modification phase to:

i = 0;
for(; i<len/2; i++){
    array[i*2] = i+1;
}

The output will be:

===
Total: 16694571008
Free: 2129924096
===
Total: 16694571008
Free: 2126868480
===
Total: 16694571008
Free: 526987264
===
Total: 16694571008
Free: 526987264

Solution

It depends on the operating system, the hardware architecture and the libc. But yes, on a recent Linux with an MMU, fork(2) works with copy-on-write. It only (allocates and) copies a few system structures and the page table; the heap pages of the child still point to those of the parent until they are written to. The copy then happens page by page: the first write to a shared page faults, and the kernel copies only that page.

More control over this can be exercised with the clone(2) call. vfork(2) is a special variant that does not expect the pages to be used; it is typically called right before exec().

As for the allocation: malloc() keeps meta-information about the requested memory blocks (address and size), and the C variable is a pointer (both live in process memory, on the heap and the stack respectively). Both look the same in the child: they hold the same values because both processes see the same underlying memory pages in their address spaces. So from the C program's point of view, the array is already allocated and the variable already initialized when the child comes into existence. The underlying memory pages, however, still point to the original physical pages of the parent process, so no extra memory is needed until they are modified.

This also means that physical memory can run out long after malloc() has succeeded. (Which is bad, because the program cannot check an error return code for "an operation in a random line of code".) Some operating systems do not allow this form of overcommit: when you fork a process, they do not allocate the pages, but they require them to be available at that moment (they kind of reserve them), just in case. In Linux this is configurable and is called overcommit accounting.
