Unwanted child processes being created while reading a file

Problem description

I am creating a multi-process program. When I called fork() in a for loop with if(f == 0) break;, I got the desired number of child processes.

Now, however, I am reading an input file, and the desired number of processes is not known in advance. Here is the smallest possible example of my code.

FILE* file = fopen("sample_input.txt", "r");
while(fscanf(file, "%d", &order) == 1){      
    f = fork();
    if(f == 0){
        break;
    } 
}

Sample sample_input.txt:

5 2 8 1 4 2

Now thousands of child processes are being created (I want 6, the number of integers in the file). What could be the reason? Does it have something to do with the file pointer?

I did some debugging with console output, and the child processes are indeed breaking out of the loop. However, the parent keeps reading the small file over and over. If I remove fork(), the loop executes 6 times as intended.

Edit2: I have a theory that I can't prove; maybe you can help me. Perhaps the file pointer is shared between the processes: when a child exits, it closes the file, and when the parent tries to read again, it just starts from the beginning (or shows some other weird behavior). Could that be the case?

Recommended answer

When the first process reads the first number, the standard I/O library actually reads the whole line (as much of the file as fits in its buffer) into memory. The process then forks.
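The buffering described above can be observed directly. The following is a minimal sketch, not part of the original answer: the helper name offsets_after_first_read is hypothetical, and it assumes a POSIX system where a regular file opened with fopen() is fully buffered. After a single fscanf(), it compares the stream's logical position (ftell) with the offset of the underlying file descriptor (lseek), which every forked child would share with the parent.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not from the original post): read one integer,
 * then report the stream position and the descriptor offset.
 * Returns 0 on success, -1 on error. */
int offsets_after_first_read(const char *path, long *stream_pos, off_t *fd_off)
{
    FILE *file = fopen(path, "r");
    if (file == NULL)
        return -1;

    int order;
    if (fscanf(file, "%d", &order) != 1) {
        fclose(file);
        return -1;
    }

    /* Logical position: how much the program has consumed so far. */
    *stream_pos = ftell(file);
    /* Descriptor offset: how much stdio has already pulled into its buffer. */
    *fd_off = lseek(fileno(file), 0, SEEK_CUR);

    fclose(file);
    return (*stream_pos < 0 || *fd_off < 0) ? -1 : 0;
}
```

With a small file such as the sample input, the descriptor offset is typically already at the end of the file after the first read, while the stream position is just past the first number. That gap is what allows a child's exit-time cleanup to move the shared offset backwards.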

The child process breaks out of the loop; what happens next is not shown, but it probably exits. The parent process now reads the second number and forks again. Again, the child exits, and the parent reads the third number, forks, and so on.

After the sixth number has been read and the sixth child has exited, the parent goes to read another buffer from the file. On Linux (or, more precisely, with the GNU C Library), you then get some weird effects. See the discussion in "Why does forking my process cause the file to be read infinitely?" for the details. In short, the exiting children move the read position of the shared file descriptor back toward the start of the file, so the parent finds more data to read.

My answer to the other question shows that if the child processes close the file before exiting, this behaviour does not occur. (It shouldn't occur anyway, but empirically it does.)
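The following is a minimal sketch of a related workaround, not the code from the answer. Instead of closing the stream in the child, each child leaves through _exit(), which on glibc does not run the stdio cleanup that repositions the shared file offset. The helper name spawn_per_integer is hypothetical, and the children do no real work here.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not from the original post): fork one child per
 * integer found in `path`. Returns the number of children created,
 * or -1 on error. */
int spawn_per_integer(const char *path)
{
    FILE *file = fopen(path, "r");
    if (file == NULL)
        return -1;

    int order;
    int children = 0;
    while (fscanf(file, "%d", &order) == 1) {
        fflush(stdout);          /* keep pending output out of the child's copy */
        pid_t f = fork();
        if (f < 0) {             /* fork failed: stop and report the error */
            children = -1;
            break;
        }
        if (f == 0) {
            /* Child: work with `order` here, then leave via _exit() so
             * the inherited stream is never flushed or repositioned
             * (assumption: glibc behaviour). */
            _exit(EXIT_SUCCESS);
        }
        children++;
    }
    fclose(file);

    /* Parent: reap every child that was started. */
    while (wait(NULL) > 0)
        ;
    return children;
}
```

Calling spawn_per_integer("sample_input.txt") on the sample file should create one child per integer. Another option that avoids the shared-offset problem altogether is to read all of the numbers into an array first and only then start forking.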

GLIBC Bug 23151 - A forked process with unclosed file does lseek before exit and can cause infinite loop in parent I/O.

The bug was created 2018-05-08 US/Pacific, and was closed as INVALID by 2018-05-09. The reason given was:

Please read http://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_05_01, especially this paragraph:

Note that after a fork(), two handles exist where one existed before. […]

Please see "Why does forking my process cause the file to be read infinitely?" for an extensive discussion of this.
