How to close a file?

Question

I felt at peace with Posix after many years of experience.

Then I read this message from Linus Torvalds, circa 2002:

int ret;
do {
    ret = close(fd);
} while(ret == -1 && errno != EBADF);

NO.

The above is

(a) not portable

(b) not current practice

The "not portable" part comes from the fact that (as somebody pointed out), a threaded environment in which the kernel does close the FD on errors, the FD may have been validly re-used (by the kernel) for some other thread, and closing the FD a second time is a BUG.

Not only is looping until EBADF unportable, but any loop is, due to a race condition that I probably would have noticed if I hadn't "made peace" by taking such things for granted.
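The race depends on descriptor numbers being recycled: POSIX requires open() to return the lowest-numbered descriptor that is not currently open. The following is a minimal single-threaded sketch of that reuse. The function name fd_number_is_reused and the choice of "/dev/null" are illustrative assumptions, not part of the original message.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Returns 1 if the descriptor number freed by close() is handed out again
 * by the very next open(), 0 if it is not, and -1 if a call failed.
 * "/dev/null" is only a convenient file that exists on POSIX systems.
 */
static int fd_number_is_reused(void)
{
    int first, second, same;

    first = open("/dev/null", O_RDONLY);
    if (first == -1)
        return -1;

    if (close(first) == -1)
        return -1;

    /* In a threaded program, another thread could be the one making this call. */
    second = open("/dev/null", O_RDONLY);
    if (second == -1)
        return -1;

    same = (second == first);

    /* A second close(first) at this point would close 'second',
     * a descriptor that now belongs to someone else. */
    close(second);
    return same;
}
```

In a retry loop, the "second close" is exactly the call that can hit the recycled number.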

However, in the GCC C++ standard library implementation, basic_file_stdio.cc, we have

    do
      __err = fclose(_M_cfile);
    while (__err && errno == EINTR);

The primary target for this library is Linux, but it seems not to be heeding Linus.

As far as I've come to understand, EINTR happens only after a system call blocks, which implies that the kernel received the request to free the descriptor before commencing whatever work got interrupted. So there's no need to loop. Indeed, the SA_RESTART signal behavior does not apply to close and does not generate such a loop by default, precisely because it is unsafe.

This is a standard library bug then, right? On every file ever closed by a C++ application.

Edit: To avoid causing too much alarm before some guru comes along with an answer, I should note that close only seems to be allowed to block under specific circumstances, perhaps none of which ever apply to regular files. I'm not clear on all the details, but you should not see EINTR from close without opting into something by fcntl or setsockopt. Nevertheless the possibility makes generic library code more dangerous.

Recommended answer

With respect to POSIX, R..'s answer to a related question is very clear and concise: close() is a non-restartable special case, and no loop should be used.

This was surprising to me, so I decided to describe my findings, followed by my conclusions and chosen solution at the end.

This is not really an answer. Consider this more like the opinion of a fellow programmer, including the reasoning behind that opinion.

POSIX.1-2001 and POSIX.1-2008 describe three possible errno values that may occur: EBADF, EINTR, and EIO. The descriptor state after EINTR and EIO is "unspecified", which means it may or may not have been closed. EBADF indicates fd is not a valid descriptor. In other words, POSIX.1 clearly recommends using

    if (close(fd) == -1) {
        /* An error occurred, see 'errno'. */
    }

without any retry looping to close file descriptors.

(Even Austin Group defect #519, which R.. mentioned, does not help with recovering from close() errors: it leaves it unspecified whether any I/O is possible after an EINTR error, even if the descriptor itself is left open.)

For Linux, the close() syscall is defined in fs/open.c, with __do_close() in fs/file.c managing the descriptor table locking, and filp_close() back in fs/open.c taking care of the details.

In summary, the descriptor entry is removed from the table unconditionally first, followed by filesystem-specific flushing (f_op->flush()), followed by notification (dnotify/fsnotify hook), and finally by removing any record or file locks. (Most local filesystems like ext2, ext3, ext4, xfs, bfs, tmpfs, and so on, do not have ->flush(), so given a valid descriptor, close() cannot fail. Only ecryptfs, exofs, fuse, cifs, and nfs have ->flush() handlers in Linux-3.13.6, as far as I can tell.)

This does mean that in Linux, if a write error occurs in the filesystem-specific ->flush() handler during close(), there is no way to retry; the file descriptor is always closed, just like Torvalds said.

The FreeBSD close() man page describes the exact same behaviour.

Neither the OpenBSD nor the Mac OS X close() man page describes whether the descriptor is closed in case of errors, but I believe they share the FreeBSD behaviour.

It seems clear to me that no loop is needed to close a file descriptor safely. However, close() may still return an error.

errno == EBADF indicates the file descriptor was already closed. If my code encounters this unexpectedly, to me it indicates there is a significant fault in the code logic, and the process should gracefully exit; I'd rather my processes die than produce garbage.

Any other errno values indicate an error in finalizing the file state. In Linux, it is definitely an error related to flushing any remaining data to the actual storage. In particular, I can imagine ENOMEM in case there is no room to buffer the data, EIO if the data could not be sent or written to the actual device or media, EPIPE if the connection to the storage was lost, ENOSPC if the storage is already full with no reservation for the unflushed data, and so on. If the file is a log file, I'd have the process report the failure and exit gracefully. If the file contents are still available in memory, I would remove (unlink) the entire file, and retry. Otherwise I'd report the failure to the user.

(Remember that in Linux and FreeBSD, you do not "leak" file descriptors in the error case; they are guaranteed to be closed even if an error occurs. I am assuming all other operating systems I might use behave the same way.)

The helper function I'll use from now on will be something like

#include <unistd.h>
#include <errno.h>

/**
 * closefd - close file descriptor and return error (errno) code
 *
 * @descriptor: file descriptor to close
 *
 * Actual errno will stay unmodified.
*/
static int closefd(const int descriptor)
{
    int saved_errno, result;

    if (descriptor == -1)
        return EBADF;

    saved_errno = errno;

    result = close(descriptor);
    if (result == -1)
        result = errno;

    errno = saved_errno;
    return result;
}

I know the above is safe on Linux and FreeBSD, and I assume it is safe on all other POSIX-y systems. If I encounter one that is not, I can simply replace the above with a custom version, wrapping it in a suitable #ifdef for that OS. The reason this maintains errno unchanged is just a quirk of my coding style; it makes short-circuiting error paths shorter (less repeated code).
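As a usage sketch, a caller might wrap the helper like this. The closefd() body is repeated only so the block compiles on its own; close_and_report and its name parameter are hypothetical additions for illustration.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Same helper as above, repeated so this sketch is self-contained. */
static int closefd(const int descriptor)
{
    int saved_errno, result;

    if (descriptor == -1)
        return EBADF;

    saved_errno = errno;

    result = close(descriptor);
    if (result == -1)
        result = errno;

    errno = saved_errno;
    return result;
}

/*
 * Hypothetical caller: close 'fd' exactly once and report any failure.
 * 'name' is only used in the message. Returns 0 or an errno code.
 */
static int close_and_report(const int fd, const char *name)
{
    const int err = closefd(fd);

    if (err)
        fprintf(stderr, "%s: close failed: %s\n", name, strerror(err));

    return err;
}
```

Because the error comes back as the return value, the caller can decide what to do with it without having to save and restore errno itself.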

If I am closing a file that contains important user information, I will do a fsync() or fdatasync() on it prior to closing. This ensures the data hits the storage, but also causes a delay compared to normal operation; therefore I won't do it for ordinary data files.
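A minimal sketch of that sequence, assuming the data is already in memory: write everything, fsync(), then call close() exactly once. The function name save_important, the file mode 0600, and the choice to treat a zero-length write as EIO are my assumptions for illustration.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: write 'len' bytes to 'path', force them to storage,
 * then close. Returns 0 on success, or an errno code on failure.
 */
static int save_important(const char *path, const void *data, size_t len)
{
    const char *p = data;
    int fd, err = 0;

    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd == -1)
        return errno;

    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n == -1) {
            if (errno == EINTR)
                continue;   /* write(), unlike close(), can be retried */
            err = errno;
            break;
        }
        if (n == 0) {
            err = EIO;      /* assumption: no progress is treated as an error */
            break;
        }
        p += n;
        len -= (size_t)n;
    }

    /* fsync() reports write-back errors while the descriptor is still open. */
    if (!err && fsync(fd) == -1)
        err = errno;

    /* Exactly one close(), whatever happened above; keep the first error. */
    if (close(fd) == -1 && !err)
        err = errno;

    return err;
}
```

With fsync() done first, most storage errors surface before close(), while the caller still holds a descriptor it can act on.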

Unless I will be unlink()ing the closed file, I will check the closefd() return value, and act accordingly. If I can easily retry, I will, but at most once or twice. For log files and generated/streamed files, I only warn the user.

I want to remind anyone reading this far that we cannot make anything completely reliable; it is just not possible. What we can do, and in my opinion should do, is to detect when an error occurs, as reliably as we can. If we can retry easily and with negligible resource use, we should. In all cases, we should make sure the notification (about the error) is propagated to the actual human user. Let the human worry about whether some other action, possibly complex, needs to be done before the operation is retried. After all, a lot of tools are used only as a part of a larger task, and the best course of action usually depends on that larger task.
