What happens internally when deleting an opened file in Linux


Question

I came across this and this question on deleting opened files in Linux.

However, I'm still confused about what happens in RAM when a process (call it A) deletes a file that another process B has open.

What baffles me is this (my analysis could be wrong; please correct me if so):

  • When a process opens a file, a new entry for that file is created in the UFDT.
  • When a process deletes a file, all the links to the file are gone. In particular, we no longer have a reference to its inode, so it gets removed from the GFDT.
  • However, when the file is modified (say, by writing to it), it must be updated on disk, since its pages become dirty. But it has no reference in the GFDT because of the earlier delete, so we don't know its inode.

The question is: why is the "deleted" file still accessible to the process that opened it? And how does the operating system make that work?

EDIT: By UFDT I mean the file descriptor table of the process, which holds the file descriptors of the files opened by that process (each process has its own UFDT). The GFDT is the global file descriptor table; there is only one GFDT in the system (in RAM, in our case).

Recommended answer

I never really heard about those UFDT and GFDT acronyms, but your view of the system sounds mostly right. I think your description of how the kernel manages open files lacks some detail, and perhaps this is where your confusion comes from. I'll try to give a more detailed description.

First, there are three data structures used to keep track of and manage open files:

  • Each process has a table of file descriptors. Each entry in this table stores a file descriptor and the file descriptor flags (as of now, the only such flag is FD_CLOEXEC, which is what O_CLOEXEC sets at open time). The file descriptor is just a pointer to an entry in the file table, which I cover next. The integer returned by open(2) and family is usually an index into this file descriptor table. Each process has its own table, which is why open(2) and family may return the same value for different processes opening different files.
  • There is one open files table in the entire system. Each file descriptor table entry of each process references one of the entries in the open files table. There is one entry in this table for each open of a file: if two processes open the same file, two entries are created in this global table, even though it's the same file. Each entry in the file table stores the file status flags (opened for reading, writing, appending, etc.) and the current file offset. This is why different processes can read from and write to different offsets in the same file concurrently, as long as each of them opens the file itself.
  • Each entry in the file table also references an entry in the vnode table. The vnode table is a global table that has one entry for each unique file. If processes A, B, and C open file D, there will be only one vnode table entry, referenced by all 3 of the file table entries (in Linux, there is really no vnode, rather there is an inode, but let's keep this description generic and conceptual). The vnode entry contains pretty much the same information as the traditional inode (file size, other attributes, etc.), but it also contains other information useful for opened files, such as which file locks are active, who owns them, and which portions of the file they lock. This vnode entry also stores pointers to the file's data blocks on disk.
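The split between per-process descriptors and shared file table entries can be observed from user space. The following is only a sketch, not kernel code: it assumes a POSIX system, uses Python's `os` wrappers around the underlying system calls, and the function name `offsets_demo` is made up for illustration. Two separate opens of the same file get independent offsets, while a descriptor made with dup shares the offset of the descriptor it was copied from.

```python
import os
import tempfile


def offsets_demo():
    """Compare offsets across two independent opens and one dup'ed descriptor."""
    fd_tmp, path = tempfile.mkstemp()
    try:
        os.write(fd_tmp, b"0123456789")
        os.close(fd_tmp)

        fd_a = os.open(path, os.O_RDONLY)  # one file table entry
        fd_b = os.open(path, os.O_RDONLY)  # a second, independent entry
        fd_c = os.dup(fd_a)                # new descriptor, same entry as fd_a

        os.read(fd_a, 4)                   # advances the offset of fd_a's entry only

        result = {
            "a": os.lseek(fd_a, 0, os.SEEK_CUR),
            "b": os.lseek(fd_b, 0, os.SEEK_CUR),
            "c": os.lseek(fd_c, 0, os.SEEK_CUR),
            # Both opens resolve to the same underlying file (same inode).
            "same_inode": os.fstat(fd_a).st_ino == os.fstat(fd_b).st_ino,
        }
        for fd in (fd_a, fd_b, fd_c):
            os.close(fd)
        return result
    finally:
        os.unlink(path)


if __name__ == "__main__":
    print(offsets_demo())
```

Here `fd_a` and `fd_c` report the same offset because they point at one file table entry, `fd_b` stays at the start because it has its own entry, and all three lead to the same inode.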

Deleting a file consists of calling unlink(2). This function unlinks a file from a directory. Each file's inode on disk has a count of the number of links pointing to it; the file is only really removed once the link count reaches 0 and no process has it open (for directories the baseline count is 2 rather than 0, since a directory references itself and is also referenced by its parent). In fact, the manpage for unlink(2) is very specific about this behavior:

unlink - delete a name and possibly the file it refers to

So, instead of looking at unlinking as deleting a file, look at it as deleting a file name, and maybe the file it refers to.
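The "name versus file" distinction is easiest to see with hard links. This is a small sketch under the assumption that the temporary directory lives on a filesystem that supports hard links; the names `first_name`, `second_name`, and `link_count_demo` are hypothetical. It gives one file two names, removes one of them, and checks that the link count drops while the data stays reachable through the other name.

```python
import os
import tempfile


def link_count_demo():
    """Return (links with two names, links after one unlink, data read afterwards)."""
    with tempfile.TemporaryDirectory() as d:
        first = os.path.join(d, "first_name")
        second = os.path.join(d, "second_name")

        with open(first, "wb") as f:
            f.write(b"payload")

        os.link(first, second)                 # second directory entry, same inode
        links_with_two_names = os.stat(first).st_nlink

        os.unlink(first)                       # removes a name, not the file
        links_after_unlink = os.stat(second).st_nlink

        with open(second, "rb") as f:
            data = f.read()

        return links_with_two_names, links_after_unlink, data


if __name__ == "__main__":
    print(link_count_demo())
```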

When unlink(2) detects that there is an active vnode table entry referring to this file, it doesn't delete the file from the filesystem. Nothing happens to the file's contents. Yes, you can't find the file on your filesystem anymore. find(1) won't find it. You can't open it by name in new processes.

But the file still exists. It just doesn't appear in any directory entry anymore.

For example, if it's a huge file and you run df, you will see that the filesystem's space usage is the same (du, which walks the directory tree, no longer counts the file). The file is still there, on disk; you just can't reach it by name.

So, any reads or writes take place as usual: the file's data blocks are accessible through the vnode table entry. You can still find out the file size. And the owner. And the permissions. All of it. Everything's there.
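This behavior can be checked from a single process. The sketch below assumes a POSIX system where fstat reports a link count of 0 for a file whose last name has been removed (this is what Linux does); `unlinked_io_demo` is a made-up name. It unlinks a file while holding a descriptor to it, then keeps writing, reading, and querying metadata through that descriptor.

```python
import os
import tempfile


def unlinked_io_demo():
    """Unlink an open file, then keep using it through the open descriptor."""
    fd, path = tempfile.mkstemp()       # opened read/write
    try:
        os.write(fd, b"before unlink")
        os.unlink(path)                 # the name is gone from the directory

        name_exists = os.path.exists(path)

        os.write(fd, b", after unlink") # writes still reach the file
        info = os.fstat(fd)             # metadata is still available
        data = os.pread(fd, info.st_size, 0)

        return name_exists, info.st_nlink, info.st_size, data
    finally:
        os.close(fd)                    # last reference: storage can be reclaimed


if __name__ == "__main__":
    print(unlinked_io_demo())
```

Owner and permission bits are available the same way, through `info.st_uid` and `info.st_mode`.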

When the process terminates or explicitly closes the file, the operating system checks the inode. If the number of links pointing to the inode is 0 and this was the last open reference to the file (the vnode table entry keeps a reference count for this purpose), then the file is purged.
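The "last close" rule can be illustrated with two independent opens of the same file. This is a sketch with assumptions: the function name `last_close_demo` is hypothetical, and the `/proc/self/fd` lookup is Linux-specific, so the code skips it when that path is not available. The actual release of disk blocks happens inside the kernel and is not directly observable here; the sketch only shows that the data outlives the first close.

```python
import os
import tempfile


def last_close_demo():
    """Show that an unlinked file survives until its last descriptor is closed."""
    fd_first, path = tempfile.mkstemp()
    os.write(fd_first, b"still here")
    fd_second = os.open(path, os.O_RDONLY)  # second, independent open

    os.unlink(path)        # link count drops to 0, two opens remain
    os.close(fd_first)     # one open reference remains, so nothing is purged

    data = os.read(fd_second, 100)

    # Linux-only: the kernel marks the descriptor's target as deleted.
    proc_link = f"/proc/self/fd/{fd_second}"
    target = os.readlink(proc_link) if os.path.islink(proc_link) else None

    os.close(fd_second)    # last reference gone: the file can now be purged
    name_exists = os.path.exists(path)
    return data, target, name_exists


if __name__ == "__main__":
    print(last_close_demo())
```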
