Strange behavior performing library functions on STDOUT and STDIN's file descriptors

Question

Throughout my years as a C programmer, I've always been confused about the standard stream file descriptors. Some places, like Wikipedia[1], say:

In the C programming language, the standard input, output, and error streams are attached to the existing Unix file descriptors 0, 1 and 2 respectively.

This is backed up by unistd.h:

/* Standard file descriptors.  */
#define STDIN_FILENO    0       /* Standard input.  */
#define STDOUT_FILENO   1       /* Standard output.  */
#define STDERR_FILENO   2       /* Standard error output.  */

However, this code (on any system):

write(0, "Hello, World!\n", 14);

Will print Hello, World! (and a newline) to STDOUT. This is odd because STDOUT's file descriptor is supposed to be 1. write-ing to file descriptor 1 also prints to STDOUT.

Performing an ioctl on file descriptor 0 changes standard input[2], and on file descriptor 1 changes standard output. However, performing termios functions on either 0 or 1 changes standard input[3][4].

I'm very confused about the behavior of file descriptors 1 and 0. Does anyone know why:

  • Writing to 1 or 0 writes to standard output?
  • Performing ioctl on 1 modifies standard output and on 0 modifies standard input, but performing tcsetattr/tcgetattr on either 1 or 0 works for standard input?

Recommended answer

Let's start by reviewing some of the key concepts involved:

  • File description

In the operating system kernel, each file, pipe endpoint, socket endpoint, open device node, and so on, has a file description. The kernel uses these to keep track of the position in the file, the flags (read, write, append, close-on-exec), record locks, and so on.

The file descriptions are internal to the kernel, and do not belong to any process in particular (in typical implementations).

  • File descriptor

From the process viewpoint, file descriptors are integers that identify open files, pipes, sockets, FIFOs, or devices.

The operating system kernel keeps a table of descriptors for each process. The file descriptor used by the process is simply an index to this table.

The entries in the file descriptor table refer to kernel file descriptions.

Whenever a process uses dup() or dup2() to duplicate a file descriptor, the kernel only duplicates the entry in the file descriptor table for that process; it does not duplicate the file description it keeps to itself.
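A minimal sketch of that sharing, assuming a writable /tmp: the file offset is stored in the file description, so a write through the original descriptor should move the offset seen through its duplicate. The function name and the scratch file name are made up for this illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>

/* Writes 5 bytes through one descriptor, then returns the file offset
   reported through a dup()ed descriptor. Returns -1 on error. */
long offset_seen_through_dup(void)
{
    char  path[] = "/tmp/dup-demo-XXXXXX";  /* hypothetical scratch file */
    int   fd, copy;
    off_t pos = -1;

    fd = mkstemp(path);
    if (fd == -1)
        return -1;
    unlink(path);  /* the name goes away; the open file stays usable */

    copy = dup(fd);
    if (copy != -1) {
        assert(copy != fd);  /* two descriptors, one file description */
        if (write(fd, "hello", 5) == 5)
            pos = lseek(copy, 0, SEEK_CUR);  /* offset as seen via the copy */
        close(copy);
    }
    close(fd);
    return (long)pos;
}
```

If both descriptors had their own file description, the offset seen through the copy would still be 0; with a shared description it is 5.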

When a process forks, the child process gets its own file descriptor table, but the entries still point to the exact same kernel file descriptions. (This is essentially a shallow copy, with all file descriptor table entries being references to file descriptions. The references are copied; the referred-to targets remain the same.)
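The same effect can be sketched across fork(), again assuming a writable /tmp and using a made-up function name: the child writes through its inherited descriptor, and the parent then reads the offset through its own descriptor.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

/* The child writes 3 bytes; the parent returns the offset it sees
   afterwards through its own descriptor. Returns -1 on error. */
long parent_offset_after_child_write(void)
{
    char  path[] = "/tmp/fork-demo-XXXXXX";  /* hypothetical scratch file */
    int   fd, status;
    pid_t child;
    off_t pos;

    fd = mkstemp(path);
    if (fd == -1)
        return -1;
    unlink(path);

    child = fork();
    if (child == -1) {
        close(fd);
        return -1;
    }
    if (child == 0) {
        /* Child: own descriptor table, same file description. */
        _exit(write(fd, "abc", 3) == 3 ? 0 : 1);
    }

    assert(child > 0);  /* only the parent gets here */
    if (waitpid(child, &status, 0) != child ||
        !WIFEXITED(status) || WEXITSTATUS(status) != 0) {
        close(fd);
        return -1;
    }

    pos = lseek(fd, 0, SEEK_CUR);  /* moved by the child's write */
    close(fd);
    return (long)pos;
}
```

The parent never writes anything itself, yet the offset it reports is 3, because parent and child share one file description.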

When a process sends a file descriptor to another process via a Unix domain socket ancillary message, the kernel allocates a new descriptor in the receiver and makes it refer to the same file description as the transferred descriptor; the file description itself is not copied.
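Descriptor passing can be sketched inside a single process by sending a descriptor over a socketpair() with an SCM_RIGHTS ancillary message. The helper names send_fd(), recv_fd(), and offset_seen_through_passed_fd() are made up for this illustration, and a writable /tmp is assumed. The feature test macro is left out here because, as far as I know, CMSG_SPACE() and CMSG_LEN() are widely available but were not part of POSIX.1-2008.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>

/* Control buffer aligned for struct cmsghdr, with room for one descriptor. */
union fd_control {
    struct cmsghdr hdr;
    char           buf[CMSG_SPACE(sizeof (int))];
};

/* Sends fd over the Unix domain socket sock. Returns 0 on success. */
static int send_fd(int sock, int fd)
{
    char             data = 'F';  /* one ordinary byte carries the message */
    struct iovec     iov = { .iov_base = &data, .iov_len = 1 };
    union fd_control ctl;
    struct msghdr    msg;
    struct cmsghdr  *cmsg;

    memset(&ctl, 0, sizeof ctl);
    memset(&msg, 0, sizeof msg);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof ctl.buf;

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof (int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof (int));

    return (sendmsg(sock, &msg, 0) == 1) ? 0 : -1;
}

/* Receives one descriptor from sock. Returns it, or -1 on error. */
static int recv_fd(int sock)
{
    char             data;
    struct iovec     iov = { .iov_base = &data, .iov_len = 1 };
    union fd_control ctl;
    struct msghdr    msg;
    struct cmsghdr  *cmsg;
    int              fd;

    memset(&ctl, 0, sizeof ctl);
    memset(&msg, 0, sizeof msg);
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof ctl.buf;

    if (recvmsg(sock, &msg, 0) != 1)
        return -1;
    cmsg = CMSG_FIRSTHDR(&msg);
    if (!cmsg || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS ||
        cmsg->cmsg_len != CMSG_LEN(sizeof (int)))
        return -1;
    memcpy(&fd, CMSG_DATA(cmsg), sizeof (int));
    return fd;
}

/* Passes a descriptor through a socket pair, writes 4 bytes through the
   original, and returns the offset seen through the received descriptor. */
long offset_seen_through_passed_fd(void)
{
    char  path[] = "/tmp/scm-demo-XXXXXX";  /* hypothetical scratch file */
    int   sv[2], fd, received;
    off_t pos = -1;

    fd = mkstemp(path);
    if (fd == -1)
        return -1;
    unlink(path);
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1) {
        close(fd);
        return -1;
    }
    if (send_fd(sv[0], fd) == 0 && (received = recv_fd(sv[1])) != -1) {
        assert(received != fd);  /* new descriptor, same file description */
        if (write(fd, "data", 4) == 4)
            pos = lseek(received, 0, SEEK_CUR);
        close(received);
    }
    close(sv[0]);
    close(sv[1]);
    close(fd);
    return (long)pos;
}
```

In real use the two socket ends would live in different processes; a single process is used here only to keep the sketch short.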

It all works very well, although it is a bit confusing that "file descriptor" and "file description" are so similar.

What does all that have to do with the effects the OP is seeing?

Whenever new processes are created, it is common to open the target device, pipe, or socket, and dup2() the descriptor to standard input, standard output, and standard error. This leads to all three standard descriptors referring to the same file description, and thus whatever operation is valid using one file descriptor, is valid using the other file descriptors, too.
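A rough way to check this on a given system is to compare what the descriptors refer to and how they were opened. In the sketch below, same_file() and access_mode() are made-up helper names. Note the limitation: matching device and inode numbers show that two descriptors refer to the same file or device, not necessarily to the same file description, and I am not aware of a portable way to test the latter directly.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Returns 1 if descriptors a and b refer to the same file or device,
   0 if they do not, and -1 if either cannot be examined. */
int same_file(int a, int b)
{
    struct stat sa, sb;

    assert(a >= 0 && b >= 0);  /* descriptors are never negative */
    if (fstat(a, &sa) == -1 || fstat(b, &sb) == -1)
        return -1;
    return sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino;
}

/* Returns O_RDONLY, O_WRONLY, or O_RDWR for an open descriptor,
   or -1 if fd is not an open descriptor. */
int access_mode(int fd)
{
    int flags = fcntl(fd, F_GETFL);

    if (flags == -1)
        return -1;
    return flags & O_ACCMODE;
}
```

In an interactive terminal session without redirections, I would expect same_file(STDIN_FILENO, STDOUT_FILENO) to return 1 and access_mode(STDIN_FILENO) to return O_RDWR, which is why write(0, ...) succeeds there.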

This is most common when running programs in a terminal without any redirections, as then the three descriptors typically all refer to the same file description, and that file description describes the slave end of a pseudoterminal character device.
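This also bears on the termios part of the question. Terminal settings belong to the terminal device rather than to a particular descriptor, so tcgetattr() and tcsetattr() act on the same terminal whichever descriptor referring to it is used. A minimal sketch, with a made-up helper name, that compares the mode flags reported through two descriptors:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <termios.h>
#include <unistd.h>

/* Returns 1 if a and b are both terminals reporting identical mode flags,
   0 if the flags differ, and -1 if either descriptor is not a terminal. */
int same_terminal_settings(int a, int b)
{
    struct termios ta, tb;

    assert(a >= 0 && b >= 0);  /* descriptors are never negative */
    if (tcgetattr(a, &ta) == -1 || tcgetattr(b, &tb) == -1)
        return -1;
    return ta.c_iflag == tb.c_iflag && ta.c_oflag == tb.c_oflag &&
           ta.c_cflag == tb.c_cflag && ta.c_lflag == tb.c_lflag;
}
```

When standard input and standard output are the same terminal, same_terminal_settings(STDIN_FILENO, STDOUT_FILENO) should return 1. If either one is redirected to a file or pipe, it returns -1.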

Consider the following program, run.c:

#define  _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <string.h>
#include <errno.h>

static void wrerrp(const char *p, const char *q)
{
    while (p < q) {
        ssize_t  n = write(STDERR_FILENO, p, (size_t)(q - p));
        if (n > 0)
            p += n;
        else
            return;
    }
}

static inline void wrerr(const char *s)
{
    if (s)
        wrerrp(s, s + strlen(s));
}

int main(int argc, char *argv[])
{
    int fd;

    if (argc < 3) {
        wrerr("\nUsage: ");
        wrerr(argv[0]);
        wrerr(" FILE-OR-DEVICE COMMAND [ ARGS ... ]\n\n");
        return 127;
    }

    fd = open(argv[1], O_RDWR | O_CREAT, 0666);
    if (fd == -1) {
        const char *msg = strerror(errno);
        wrerr(argv[1]);
        wrerr(": Cannot open file: ");
        wrerr(msg);
        wrerr(".\n");
        return 127;
    }

    if (dup2(fd, STDIN_FILENO) != STDIN_FILENO ||
        dup2(fd, STDOUT_FILENO) != STDOUT_FILENO) {
        const char *msg = strerror(errno);
        wrerr("Cannot duplicate file descriptors: ");
        wrerr(msg);
        wrerr(".\n");
        return 126;
    }
    if (dup2(fd, STDERR_FILENO) != STDERR_FILENO) {
        /* We might not have standard error anymore.. */
        return 126;
    }

    /* Close fd, since it is no longer needed. */
    if (fd != STDIN_FILENO && fd != STDOUT_FILENO && fd != STDERR_FILENO)
        close(fd);

    /* Execute the command. */
    if (strchr(argv[2], '/'))
        execv(argv[2], argv + 2);  /* Command has /, so it is a path */
    else
        execvp(argv[2], argv + 2); /* command has no /, so it is a filename */

    /* Whoops; failed. But we have no stderr left.. */
    return 125;
}

It takes two or more parameters. The first parameter is a file or device, and the second is the command, with the rest of the parameters supplied to the command. The command is run, with all three standard descriptors redirected to the file or device named in the first parameter. You can compile the above with gcc using e.g.

gcc -Wall -O2 run.c -o run

Let's write a small tester utility, report.c:

#define  _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <string.h>
#include <stdio.h>
#include <errno.h>

int main(int argc, char *argv[])
{
    char    buffer[16] = { "\n" };
    ssize_t result;
    FILE   *out;

    if (argc != 2) {
        fprintf(stderr, "\nUsage: %s FILENAME\n\n", argv[0]);
        return EXIT_FAILURE;
    }

    out = fopen(argv[1], "w");
    if (!out)
        return EXIT_FAILURE;

    result = write(STDIN_FILENO, buffer, 1);
    if (result == -1) {
        const int err = errno;
        fprintf(out, "write(STDIN_FILENO, buffer, 1) = -1, errno = %d (%s).\n", err, strerror(err));
    } else {
        fprintf(out, "write(STDIN_FILENO, buffer, 1) = %zd%s\n", result, (result == 1) ? ", success" : "");
    }

    result = read(STDOUT_FILENO, buffer, 1);
    if (result == -1) {
        const int err = errno;
        fprintf(out, "read(STDOUT_FILENO, buffer, 1) = -1, errno = %d (%s).\n", err, strerror(err));
    } else {
        fprintf(out, "read(STDOUT_FILENO, buffer, 1) = %zd%s\n", result, (result == 1) ? ", success" : "");
    }

    result = read(STDERR_FILENO, buffer, 1);
    if (result == -1) {
        const int err = errno;
        fprintf(out, "read(STDERR_FILENO, buffer, 1) = -1, errno = %d (%s).\n", err, strerror(err));
    } else {
        fprintf(out, "read(STDERR_FILENO, buffer, 1) = %zd%s\n", result, (result == 1) ? ", success" : "");
    }

    if (ferror(out))
        return EXIT_FAILURE;
    if (fclose(out))
        return EXIT_FAILURE;

    return EXIT_SUCCESS;
}

It takes exactly one parameter, a file or device to write to, to report whether writing to standard input, and reading from standard output and error work. (We can normally use $(tty) in Bash and POSIX shells, to refer to the actual terminal device, so that the report is visible on the terminal.) Compile this one using e.g.

gcc -Wall -O2 report.c -o report

Now, we can check some devices:

./run /dev/null    ./report $(tty)
./run /dev/zero    ./report $(tty)
./run /dev/urandom ./report $(tty)

or on whatever we wish. On my machine, when I run this on a file, say

./run some-file ./report $(tty)

writing to standard input, and reading from standard output and standard error all work -- which is as expected, as the file descriptors refer to the same, readable and writable, file description.

The conclusion, after playing with the above, is that there is no strange behaviour here at all. It all behaves exactly as one would expect, if file descriptors as used by processes are simply references to operating system internal file descriptions, and standard input, output, and error descriptors are duplicates of each other.
