Changing offset from child processes

Question

Let's say that I have a parent process that creates some number of child processes in order to read from the same file.

  1. When each process reads from the file descriptor, does the offset change for all of its sibling processes?

So, is it possible that each process will read a unique line, or, without synchronization in the app, will each process read the same lines as its siblings?

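/* filename and readFromFile() are defined elsewhere in the program. */
int fd[2];
pid_t id;

/* Create the pipe before forking so that parent and child share it. */
if (pipe(fd) == -1)
    exit(EXIT_FAILURE);

id = fork();
if (id < 0)
    exit(EXIT_FAILURE);

switch (id) {
case 0:
    /* Child process: read from the file, then exit. */
    readFromFile(filename);
    exit(EXIT_SUCCESS);
default:
    /* Parent process doing something... */
    break;
}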

Recommended answer

On a POSIX system, file descriptors inherited by a child process through a fork call refer to the same open file description in a system-wide table. Here's the relevant quotation from the Linux manual page for open(2):

The term open file description is the one used by POSIX to refer to the entries in the system-wide table of open files... When a file descriptor is duplicated (using dup(2) or similar), the duplicate refers to the same open file description as the original file descriptor, and the two file descriptors consequently share the file offset and file status flags. Such sharing can also occur between processes: a child process created via fork(2) inherits duplicates of its parent's file descriptors, and those duplicates refer to the same open file descriptions.

This means that the parent and child share the same file offset, so a read in one process advances the offset seen by all the others. If the processes read in parallel without calling lseek between reads, no two of them will read the same data.

You can see this in action in the following test program, which prints the first 20 bytes of the file given on the command line, as two 10-byte chunks whose order depends on scheduling. (If the position information weren't shared, it would print the first 10 bytes twice.)

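#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>

char buffer[256];

int
main(int argc, char ** argv)
{
    ssize_t n;
    int fd;

    if (argc < 2)
        return EXIT_FAILURE;

    /* Open before forking so both processes share one open file description. */
    fd = open(argv[1], O_RDONLY);
    if (fd == -1)
        return EXIT_FAILURE;

    if (fork() == -1)
        return EXIT_FAILURE;

    /* Parent and child each read 10 bytes, advancing the shared offset. */
    n = read(fd, buffer, 10);
    if (n > 0)
        write(STDOUT_FILENO, buffer, (size_t)n);
    return 0;
}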

HOWEVER, and this is a huge "however", this applies only to the low-level system call interface for reading files: open(2), read(2), etc. If you are using a higher-level buffered interface, like fgets and the other functions in stdio.h, things get complicated. When a process forks, the child inherits copies of file descriptors that point to a single system-wide, shared structure of file information in the kernel, but it also inherits a separate copy of the user-space buffering state used by the stdio.h calls. That state includes its own buffer and its own offset, and neither is synchronized between processes.
