Why does Linux use getdents() on directories instead of read()?

Question

I was skimming through K&R C and I noticed that to read the entries in a directory, they used:

while (read(dp->fd, (char *) &dirbuf, sizeof(dirbuf)) == sizeof(dirbuf))
    /* code */

Where dirbuf was a system-specific directory structure, and dp->fd a valid file descriptor. On my system, dirbuf would have been a struct linux_dirent. Note that a struct linux_dirent has a flexible array member for the entry name, but let us assume, for the sake of simplicity, that it doesn't. (Dealing with the flexible array member in this scenario would only require a little extra boilerplate code).

Linux, however, doesn't support this construct. When using read() to try reading directory entries as above, read() returns -1 and errno is set to EISDIR.
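A minimal sketch of how to observe this, assuming a Linux system. The helper name read_dir_errno is made up for illustration; open(2), read(2) and EISDIR are the standard interfaces.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: opens the directory at `path`, attempts a plain
 * read(2) on it, and returns the errno that read left behind.
 * Returns 0 if read unexpectedly succeeded, or -1 if open itself failed. */
int read_dir_errno(const char *path)
{
    char buf[256];
    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;

    errno = 0;
    ssize_t n = read(fd, buf, sizeof buf);
    int err = (n == -1) ? errno : 0;

    close(fd);
    return err;
}
```

Calling read_dir_errno(".") on Linux should hand back EISDIR, which matches the behavior described above.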

Instead, Linux dedicates a system call specifically for reading directories, namely the getdents() system call. However, I've noticed that it works in pretty much the same way as above.

/* getdents returns the number of bytes read, 0 at end of directory, -1 on error */
while (syscall(SYS_getdents, fd, &dirbuf, sizeof(dirbuf)) > 0)
    /* code */

What was the rationale behind this? There seems to be little/no benefit compared to using read() as done in K&R.

Recommended answer

getdents will return struct linux_dirent. It will do this for any underlying type of filesystem. The "on disk" format could be completely different, known only to the given filesystem driver, so a simple userspace read call could not work. That is, getdents may convert from the native format to fill the linux_dirent.
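To make the record format concrete, here is a rough sketch of walking the buffer that the kernel fills. It uses the 64-bit variant, SYS_getdents64, and declares struct linux_dirent64 by hand following the layout documented in getdents(2). The function name count_dir_entries and the 4096-byte buffer are illustrative choices, not anything the kernel requires.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Record layout written by getdents64, as documented in getdents(2). */
struct linux_dirent64 {
    uint64_t       d_ino;     /* inode number */
    int64_t        d_off;     /* opaque offset to the next record */
    unsigned short d_reclen;  /* total size of this record in bytes */
    unsigned char  d_type;    /* file type */
    char           d_name[];  /* NUL-terminated name, variable length */
};

/* Hypothetical helper: counts the entries the kernel reports for `path`
 * (including "." and ".." where the filesystem reports them) and prints
 * each name when `verbose` is nonzero. Returns -1 on error. */
long count_dir_entries(const char *path, int verbose)
{
    char buf[4096];
    long total = 0;
    int fd = open(path, O_RDONLY | O_DIRECTORY);
    if (fd == -1)
        return -1;

    for (;;) {
        long nread = syscall(SYS_getdents64, fd, buf, sizeof buf);
        if (nread == -1) {
            close(fd);
            return -1;
        }
        if (nread == 0)          /* end of directory */
            break;

        /* Records are variable length: step by each record's d_reclen. */
        for (long pos = 0; pos < nread; ) {
            unsigned short reclen;
            memcpy(&reclen,
                   buf + pos + offsetof(struct linux_dirent64, d_reclen),
                   sizeof reclen);
            if (verbose)
                puts(buf + pos + offsetof(struct linux_dirent64, d_name));
            total++;
            pos += reclen;
        }
    }
    close(fd);
    return total;
}
```

The same loop works on any filesystem because the kernel has already converted the native format into these records. In ordinary programs, readdir(3) is the portable way to get the same information.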

Couldn't the same thing be said about reading bytes from a file with read()? The on disk format of the data within a file isn't necessarily uniform across filesystems or even contiguous on disk - thus, reading a series of bytes from disk would again be something I expect to be delegated to the file system driver.

The discontiguous file data is handled by the VFS ["virtual filesystem"] layer and the FS driver beneath it, regardless of how an FS chooses to organize the block list for a file. For example, ext4 uses "inodes" ("index" or "information" nodes), which use an "ISAM" ("index sequential access method") organization, whereas an MS/DOS FS can have a completely different organization.

Each FS driver registers a table of VFS function callbacks when it's started. For a given operation (e.g. open/close/read/write/seek), there is a corresponding entry in the table.

The VFS layer (i.e. from the userspace syscall) will "call down" into the FS driver and the FS driver will perform the operation, doing whatever it deems necessary to fulfill the request.
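The callback-table idea can be sketched as a toy in userspace C. Everything named toy_* below is hypothetical and heavily simplified. It is not the kernel's real structure, only an illustration of dispatching through a table of function pointers. The error codes mirror what Linux returns: EISDIR for read on a directory and ENOTDIR for getdents on a non-directory.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified callback table: one slot per operation. */
struct toy_file_ops {
    long (*read)(char *dst, long len);
    long (*getdents)(char *dst, long len);
};

/* "Driver" callback for a regular file holding the bytes "hello". */
static long toy_file_read(char *dst, long len)
{
    static const char data[] = "hello";
    long n = (long)sizeof data - 1;
    if (n > len)
        n = len;
    memcpy(dst, data, (size_t)n);
    return n;
}

/* "Driver" callback for a directory: emits two NUL-terminated names. */
static long toy_dir_getdents(char *dst, long len)
{
    static const char names[] = "a.txt\0b.txt";
    long n = (long)sizeof names;
    if (n > len)
        return -EINVAL;          /* result buffer too small */
    memcpy(dst, names, (size_t)n);
    return n;
}

const struct toy_file_ops toy_regular_ops   = { toy_file_read, NULL };
const struct toy_file_ops toy_directory_ops = { NULL, toy_dir_getdents };

/* The "VFS" side: call down into whatever the driver registered. */
long toy_vfs_read(const struct toy_file_ops *ops, char *dst, long len)
{
    return ops->read ? ops->read(dst, len) : -EISDIR;
}

long toy_vfs_getdents(const struct toy_file_ops *ops, char *dst, long len)
{
    return ops->getdents ? ops->getdents(dst, len) : -ENOTDIR;
}
```

In this toy, calling toy_vfs_read with toy_directory_ops yields -EISDIR simply because the directory never registered a read callback, which is one way to picture why the two operations stay separate.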

I assume that the FS driver would know about the location of the data inside a regular file on disk - even if the data was fragmented.

Yes. For example, if the read request is to read the first three blocks from the file (e.g. 0,1,2), the FS will look up the indexing information for the file and get a list of physical blocks to read (e.g. 1000000,200,37) from the disk surface. This is all handled transparently in the FS driver.

The userspace program will simply see its buffer filled up with the correct data, without regard to how complex the FS indexing and block fetch had to be.
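The logical-to-physical lookup can be pictured with a small simulation. This is only a sketch: the block map reuses the example numbers from the text (1000000, 200, 37), while the 4-byte block size, the fake disk function, and all toy_* names are assumptions made up for illustration.

```c
#include <stddef.h>
#include <string.h>

enum { TOY_BLOCK_SIZE = 4, TOY_FILE_BLOCKS = 3 };

/* Index information for one file: logical block i is stored in
 * physical block toy_block_map[i]. */
static const unsigned long toy_block_map[TOY_FILE_BLOCKS] = {
    1000000UL, 200UL, 37UL
};

/* Stand-in for the disk: fills a block with a tag byte that identifies
 * which physical block was fetched. */
static void toy_disk_read_block(unsigned long phys, char *dst)
{
    char tag = phys == 1000000UL ? 'A'
             : phys == 200UL     ? 'B'
             : phys == 37UL      ? 'C' : '?';
    memset(dst, tag, TOY_BLOCK_SIZE);
}

/* Copies up to `len` bytes starting at file offset `off` into `dst`,
 * translating each logical block through the map. Returns bytes copied. */
size_t toy_read_at(size_t off, char *dst, size_t len)
{
    const size_t file_size = (size_t)TOY_BLOCK_SIZE * TOY_FILE_BLOCKS;
    size_t done = 0;

    while (done < len && off < file_size) {
        char block[TOY_BLOCK_SIZE];
        size_t in_block = off % TOY_BLOCK_SIZE;
        size_t chunk = TOY_BLOCK_SIZE - in_block;
        if (chunk > len - done)
            chunk = len - done;

        toy_disk_read_block(toy_block_map[off / TOY_BLOCK_SIZE], block);
        memcpy(dst + done, block + in_block, chunk);
        done += chunk;
        off += chunk;
    }
    return done;
}
```

The caller of toy_read_at only ever sees a flat run of bytes. The scattered physical blocks never show through, which is the "flat view" of a file that the discussion refers to.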

Perhaps it is [loosely] more proper to refer to this as transferring inode data as there are inodes for files (i.e. an inode has the indexing information to "scatter/gather" the FS blocks for the file). But, the FS driver also uses this internally to read from a directory. That is, each directory has an inode to keep track of the indexing information for that directory.

So, to an FS driver, a directory is much like a flat file that has specially formatted information. These are the directory "entries". This is what getdents returns. This "sits on top of" the inode indexing layer.

Directory entries can be variable length [based on the length of the filename]. So, the on disk format would be (call it "Type A"):

static part|variable length name
static part|variable length name
...

But ... some FSes organize themselves differently (call it "Type B"):

<static1>,<static2>...
<variable1>,<variable2>,...

So, while the type A organization might be read atomically by a userspace read(2) call, the type B organization would have difficulty. The getdents VFS call handles this.
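As a rough illustration of the "Type B" case, the sketch below keeps the fixed-size parts and the names in two separate areas and has a getdents-like function stitch them into uniform records. The structures, the sample data, and every toy_* name are invented for this example. No real filesystem is being described.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical "Type B" directory: fixed-size parts live in one table,
 * the variable-length names in a separate pool. */
struct toy_static_part {
    unsigned long ino;       /* inode number */
    size_t        name_off;  /* where this entry's name starts in the pool */
};

static const struct toy_static_part toy_statics[] = { { 11, 0 }, { 12, 6 } };
static const char toy_names[] = "a.txt\0notes.md";

/* Uniform record handed to the caller, whatever the on-disk layout is. */
struct toy_dirent {
    unsigned long ino;
    char          name[16];
};

/* Fills `out` with at most `max_entries` records by joining each static
 * part with its name. Returns the number of records written. */
size_t toy_getdents(struct toy_dirent *out, size_t max_entries)
{
    size_t n = sizeof toy_statics / sizeof toy_statics[0];
    if (n > max_entries)
        n = max_entries;

    for (size_t i = 0; i < n; i++) {
        out[i].ino = toy_statics[i].ino;
        strncpy(out[i].name, toy_names + toy_statics[i].name_off,
                sizeof out[i].name - 1);
        out[i].name[sizeof out[i].name - 1] = '\0';
    }
    return n;
}
```

A byte-oriented read over either area alone would return only half of each entry. The join has to happen somewhere, and in this picture it happens inside the getdents-style call.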

Couldn't the VFS also present a "linux_dirent" view of a directory like the VFS presents a "flat view" of a file?

That is what getdents does.

Then again, I'm assuming that a FS driver knows the type of each file and thus could return a linux_dirent when read() is called on a directory rather than a series of bytes.

getdents did not always exist. When dirents were fixed size and there was only one FS format, the readdir(3) call probably did read(2) underneath and got a series of bytes [which is only what read(2) provides]. Actually, IIRC, in the beginning there was only readdir(2) and getdents and readdir(3) did not exist.

But, what do you do if the read(2) is "short" (e.g. two bytes too small)? How do you communicate that to the app?

My question is more like this: since the FS driver can determine whether a file is a directory or a regular file (and I'm assuming it can), and since it has to intercept all read() calls eventually, why isn't read() on a directory implemented as reading the linux_dirent?

read on a dir isn't intercepted and converted to getdents because the OS is minimalist. It expects you to know the difference and make the appropriate syscall.

You do open(2) for files or dirs [opendir(3) is a wrapper and does open(2) underneath]. You can read/write/seek for files and seek/getdents for dirs.

But ... doing read on a directory returns EISDIR. [Side note: I had forgotten this in my original comments]. In the simple "flat data" model that read provides, there isn't a way to convey/control all that getdents can/does.

So, rather than allow an inferior way to get partial/wrong info, it's simpler for the kernel and an app developer to go through the getdents interface.

Further, getdents does things atomically. If you're reading directory entries in a given program, there may be other programs that are creating and deleting files in that directory or renaming them--right in the middle of your getdents sequence.

getdents will present an atomic view. Either a file exists or it doesn't. It's been renamed or it hasn't. So, you don't get a "half modified" view, regardless of how much "turmoil" is happening around you. When you give getdents room for 20 entries, you'll get them [or 10 if there are only that many].

Side note: A useful trick is to "overspecify" the count. That is, give getdents a buffer big enough for, say, 50,000 entries [the count argument is the buffer size in bytes, and you must provide the space]. You'll usually get back something like 100 or so. But, now, what you've got is an atomic snapshot in time for the full directory. I sometimes do this instead of looping with a small buffer--YMMV. You still have to protect against immediate disappearance but at least you can see it (i.e. a subsequent file open fails).
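The mechanics of that trick might look like the sketch below, which makes one getdents64 call with a deliberately large buffer and then checks whether anything was left over. The helper name dir_fits_in_one_call is made up. The sketch only shows how to tell that a single call covered the whole directory. Whether that single call amounts to a consistent snapshot depends on the kernel and filesystem, and the code does not verify it.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdlib.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: tries to fetch every entry of `path` with a single
 * getdents64 call into a buffer of `buf_size` bytes, storing the byte
 * count of that call in *bytes_out.
 * Returns 1 if a follow-up call reported end of directory (everything
 * fit), 0 if more entries remained, -1 on error. */
int dir_fits_in_one_call(const char *path, size_t buf_size, long *bytes_out)
{
    int result = -1;
    char *buf = malloc(buf_size);
    int fd = open(path, O_RDONLY | O_DIRECTORY);

    if (buf != NULL && fd != -1) {
        long first = syscall(SYS_getdents64, fd, buf, buf_size);
        if (first >= 0) {
            /* A second call returning 0 means the first reached the end. */
            long second = (first == 0)
                        ? 0
                        : syscall(SYS_getdents64, fd, buf, buf_size);
            if (second >= 0) {
                *bytes_out = first;
                result = (second == 0);
            }
        }
    }
    if (fd != -1)
        close(fd);
    free(buf);
    return result;
}
```

For example, dir_fits_in_one_call("/", 1 << 20, &n) asks for up to 1 MiB of records. A return of 0 means the directory was larger than the buffer and a loop is still needed.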

So, you always get "whole" entries and no entry for a just deleted file. That is not to say that the file is still there, merely that it was there at the time of the getdents. Another process may instantly erase it, but not in the middle of the getdents.

If read(2) were allowed, you'd have to guess at how much data to read and wouldn't know which entries were fully formed or in a partial state. If the FS had the type B organization above, a single read could not atomically get the static portion and variable portion in a single step.

It would be philosophically incorrect to slow down read(2) to do what getdents does.

getdents, unlink, creat, rmdir, and rename (etc.) operations are interlocked and serialized to prevent any inconsistencies [not to mention FS corruption or leaked/lost FS blocks]. In other words, these syscalls all "know about each other".

If pgmA renames "x" to "z" and pgmB renames "y" to "z", they don't collide. One goes first and another second but no FS blocks are ever lost/leaked. getdents gets the whole view (be it "x y", "y z", "x z" or "z"), but it will never see "x y z" simultaneously.
