Where's the code in the Linux kernel that implements open("/proc/self/fd/NUM")?

Question

I always assumed that doing open("/proc/self/fd/NUM", flags) was equivalent to dup(NUM), but apparently this is not the case! For example, if you dup a file descriptor and then set the new fd to non-blocking, this also affects the original file descriptor (because non-blocking state is a property of the file description, and the two file descriptors both point to the same file description). However, if you open /proc/self/fd/NUM, you seem to get a new, independent file description, and can set the non-blocking state of your old and new fds independently. You can even use this to get two file descriptions referring to the same anonymous pipe, which is otherwise impossible. On the other hand, while you can dup a socket fd, open("/proc/self/fd/NUM", flags) fails if NUM refers to a socket.
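The difference described above can be checked from user space. The following is a minimal sketch, assuming a Linux system with /proc mounted; the function name compare_dup_and_reopen is made up for illustration. It sets non-blocking mode on a dup of a pipe's read end and on a reopened copy, and reports whether the original fd sees each change.

```python
import os


def compare_dup_and_reopen():
    """Return (dup_shares_flag, reopen_shares_flag) for a pipe's read end.

    Illustrative helper, not part of any library. Linux with /proc only.
    """
    read_fd, write_fd = os.pipe()
    dup_fd = os.dup(read_fd)
    # Reopen the same pipe through procfs instead of duplicating the fd.
    reopened_fd = os.open(f"/proc/self/fd/{read_fd}", os.O_RDONLY)
    try:
        # Make the reopened fd non-blocking, then look at the original.
        os.set_blocking(reopened_fd, False)
        reopen_shares_flag = not os.get_blocking(read_fd)

        # Make the dup non-blocking, then look at the original again.
        os.set_blocking(dup_fd, False)
        dup_shares_flag = not os.get_blocking(read_fd)

        return dup_shares_flag, reopen_shares_flag
    finally:
        for fd in (read_fd, write_fd, dup_fd, reopened_fd):
            os.close(fd)


if __name__ == "__main__":
    print(compare_dup_and_reopen())
```

If the behaviour is as the question describes, the first element is True (the dup shares the file description) and the second is False (the reopened fd has its own).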

Now I'd like to be able to see how this works for other types of special file, and answer questions like "what permission checking is done when re-opening a file this way?", so I was trying to find the code in Linux that actually implements this path, but when I started reading fs/proc/fd.c I quickly got lost in a maze of twisty operations structs, all different.

So my question is: can anyone explain the code path followed by doing open("/proc/self/fd/NUM", flags)? For concreteness let's say that NUM refers to a pipe and we're talking about the latest kernel release.

Recommended answer

A comment suggests a look at proc_fd_link and that's a good idea. If you have trouble following how the code can get there, you can help yourself with systemtap. Here is a magic script:

probe kernel.function("proc_fd_link") {
    print_backtrace();
}

Running it while opening a file from under fd/ gives:

 0xffffffffbb2cad70 : proc_fd_link+0x0/0xd0 [kernel]
 0xffffffffbb2c4c3b : proc_pid_get_link+0x6b/0x90 [kernel] (inexact)
 0xffffffffbb36341a : security_inode_follow_link+0x4a/0x70 [kernel] (inexact)
 0xffffffffbb25bf13 : trailing_symlink+0x1e3/0x220 [kernel] (inexact)
 0xffffffffbb25f559 : path_openat+0xe9/0x1380 [kernel] (inexact)
 0xffffffffbb261af1 : do_filp_open+0x91/0x100 [kernel] (inexact)
 0xffffffffbb26fd8f : __alloc_fd+0x3f/0x170 [kernel] (inexact)
 0xffffffffbb24f280 : do_sys_open+0x130/0x220 [kernel] (inexact)
 0xffffffffbb24f38e : sys_open+0x1e/0x20 [kernel] (inexact)
 0xffffffffbb003c57 : do_syscall_64+0x67/0x160 [kernel] (inexact)
 0xffffffffbb8039e1 : return_from_SYSCALL_64+0x0/0x6a [kernel] (inexact)
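The trailing_symlink frame in the trace reflects the fact that each entry under fd/ is a symbolic link whose target is supplied by the kernel rather than stored as a path. A small sketch, assuming Linux with /proc mounted (describe_fd_link and pipe_link_info are hypothetical names), shows what such a link looks like for a pipe:

```python
import os


def describe_fd_link(fd):
    """Return (is_symlink, link_text, inode_number) for /proc/self/fd/<fd>.

    Illustrative helper; assumes Linux with /proc mounted.
    """
    path = f"/proc/self/fd/{fd}"
    return os.path.islink(path), os.readlink(path), os.fstat(fd).st_ino


def pipe_link_info():
    """Create a pipe and describe the procfs link for its read end."""
    read_fd, write_fd = os.pipe()
    try:
        return describe_fd_link(read_fd)
    finally:
        os.close(read_fd)
        os.close(write_fd)


if __name__ == "__main__":
    print(pipe_link_info())
```

For a pipe the link text has the form pipe:[INODE], which is not a path that exists anywhere in the filesystem. Opening the link therefore cannot work by re-resolving that text, which is consistent with the kernel resolving it through proc_pid_get_link and proc_fd_link instead.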

In proc_pid_get_link we see:

/* Are we allowed to snoop on the tasks file descriptors? */
if (!proc_fd_access_allowed(inode))
        goto out;

and

/* permission checks */
static int proc_fd_access_allowed(struct inode *inode)
{
        struct task_struct *task;
        int allowed = 0;
        /* Allow access to a task's file descriptors if it is us or we
         * may use ptrace attach to the process and find out that
         * information.
         */
        task = get_proc_task(inode);
        if (task) {
                allowed = ptrace_may_access(task, PTRACE_MODE_READ_FSCREDS);
                put_task_struct(task);
        }
        return allowed;
}

Clearly, you need to pass the same kind of access check as ptrace: ptrace_may_access() with PTRACE_MODE_READ_FSCREDS against the target task.
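One way to observe this check from user space is to try the open against a chosen task and report the resulting errno. This is only a sketch, assuming Linux with /proc mounted; open_fd_of_task is a hypothetical helper name. For the calling process itself the open should succeed, matching the "if it is us" case in the kernel comment; for a task you may not access, expect an error such as EACCES, though the exact errno can depend on where in the path walk the check fails.

```python
import errno
import os


def open_fd_of_task(pid, fd, flags=os.O_RDONLY):
    """Try to open /proc/<pid>/fd/<fd>.

    Returns None on success, or the symbolic errno name on failure.
    Illustrative helper; assumes Linux with /proc mounted.
    """
    try:
        new_fd = os.open(f"/proc/{pid}/fd/{fd}", flags)
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))
    os.close(new_fd)
    return None


if __name__ == "__main__":
    read_fd, write_fd = os.pipe()
    # Our own task: the access check is expected to pass.
    print("self:", open_fd_of_task(os.getpid(), read_fd))
    # PID 1 usually belongs to another user unless we are root.
    print("pid 1:", open_fd_of_task(1, 0))
    os.close(read_fd)
    os.close(write_fd)
```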

Finally, why does opening a socket fail? strace shows ENXIO being returned. A quick git grep ENXIO fs/*.c reveals:

static int no_open(struct inode *inode, struct file *file)
{
        return -ENXIO;
}

Checking how the code ends up using no_open is left as an exercise for the reader. Also note that systemtap can be used for printf-like debugging without modifying the source code. A probe can also be placed on the return from a function to report the error code.
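The ENXIO result can be reproduced without strace. The sketch below assumes Linux with /proc mounted; reopen_errno and the two wrappers are made-up names. It reopens a socket and, for contrast, a pipe through /proc/self/fd and returns the errno of each attempt.

```python
import errno
import os
import socket


def reopen_errno(fd, flags=os.O_RDONLY):
    """Reopen fd via /proc/self/fd; return None on success, else the errno.

    Illustrative helper; assumes Linux with /proc mounted.
    """
    try:
        new_fd = os.open(f"/proc/self/fd/{fd}", flags)
    except OSError as exc:
        return exc.errno
    os.close(new_fd)
    return None


def socket_reopen_errno():
    """Errno from reopening one end of a socketpair."""
    a, b = socket.socketpair()
    with a, b:
        return reopen_errno(a.fileno())


def pipe_reopen_errno():
    """Errno (or None) from reopening the read end of a pipe."""
    read_fd, write_fd = os.pipe()
    try:
        return reopen_errno(read_fd)
    finally:
        os.close(read_fd)
        os.close(write_fd)


if __name__ == "__main__":
    sock_err = socket_reopen_errno()
    print("socket:", errno.errorcode.get(sock_err, sock_err))
    print("pipe:", pipe_reopen_errno())
```

On a kernel that behaves as described in the answer, the socket case reports ENXIO while the pipe case succeeds.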
