Why is “while ( !feof (file) )” always wrong?

Question

What is wrong with using feof() to control a read loop? For example:

#include <stdio.h>
#include <stdlib.h>

int
main(int argc, char **argv)
{
    char *path = "stdin";
    FILE *fp = argc > 1 ? fopen(path=argv[1], "r") : stdin;

    if( fp == NULL ){
        perror(path);
        return EXIT_FAILURE;
    }

    while( !feof(fp) ){  /* THIS IS WRONG */
        /* Read and process data from file… */
    }
    if( fclose(fp) != 0 ){
        perror(path);
        return EXIT_FAILURE;
    }
    return EXIT_SUCCESS;
}

What is wrong with this loop?

Recommended answer

TL;DR

while(!feof) is wrong because it tests for something that is irrelevant and fails to test for something that you need to know. The result is that you are erroneously executing code that assumes that it is accessing data that was read successfully, when in fact this never happened.
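To make the failure concrete, here is a minimal sketch that counts lines two ways. The helper names `count_lines_feof` and `count_lines_checked` are hypothetical, and the sketch assumes every line fits in the 256-byte buffer. The first function repeats the mistake from the question; the second checks the read itself.

```c
#include <stdio.h>

/* Hypothetical helper: counts lines the WRONG way, by asking feof()
 * before each read instead of checking the read itself. */
int count_lines_feof(FILE *fp)
{
    char line[256];
    int count = 0;

    while (!feof(fp)) {                           /* only reports a PAST failed read */
        char *got = fgets(line, sizeof line, fp); /* may be NULL at end of input */
        (void)got;                                /* the bug: result never checked */
        count++;                                  /* counted even if nothing was read */
    }
    return count;
}

/* Hypothetical helper: counts lines by attempting the read and
 * using its result as the loop condition. */
int count_lines_checked(FILE *fp)
{
    char line[256];
    int count = 0;

    while (fgets(line, sizeof line, fp) != NULL) {
        count++;                                  /* reached only after a successful read */
    }
    return count;
}
```

For a stream containing the two lines "a\nb\n", `count_lines_feof` returns 3 and `count_lines_checked` returns 2: the end-of-file indicator is only set by the third, failed `fgets`, by which point the loop body has already run once more.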

I'd like to provide an abstract, high-level perspective. So continue reading if you're interested in what while(!feof) actually does.

I/O operations interact with the environment. The environment is not part of your program, and not under your control. The environment truly exists "concurrently" with your program. As with all things concurrent, questions about the "current state" don't make sense: There is no concept of "simultaneity" across concurrent events. Many properties of state simply don't exist concurrently.

Let me make this more precise: Suppose you want to ask, "do you have more data". You could ask this of a concurrent container, or of your I/O system. But the answer is generally unactionable, and thus meaningless. So what if the container says "yes" – by the time you try reading, it may no longer have data. Similarly, if the answer is "no", by the time you try reading, data may have arrived. The conclusion is that there simply is no property like "I have data", since you cannot act meaningfully in response to any possible answer. (The situation is slightly better with buffered input, where you might conceivably get a "yes, I have data" that constitutes some kind of guarantee, but you would still have to be able to deal with the opposite case. And with output the situation is certainly just as bad as I described: you never know if that disk or that network buffer is full.)

So we conclude that it is impossible, and in fact unreasonable, to ask an I/O system whether it will be able to perform an I/O operation. The only possible way we can interact with it (just as with a concurrent container) is to attempt the operation and check whether it succeeded or failed. At that moment where you interact with the environment, then and only then can you know whether the interaction was actually possible, and at that point you must commit to performing the interaction. (This is a "synchronisation point", if you will.)

Now we get to EOF. EOF is the response you get from an attempted I/O operation. It means that you were trying to read or write something, but when doing so you failed to read or write any data, and instead the end of the input or output was encountered. This is true for essentially all the I/O APIs, whether it be the C standard library, C++ iostreams, or other libraries. As long as the I/O operations succeed, you simply cannot know whether further, future operations will succeed. You must always first try the operation and then respond to success or failure.

In each of the examples, note carefully that we first attempt the I/O operation and then consume the result if it is valid. Note further that we always must use the result of the I/O operation, though the result takes different shapes and forms in each example.

  • C stdio, read from a file:

  for (;;) {
      size_t n = fread(buf, 1, bufsize, infile);
      consume(buf, n);
      if (n == 0) { break; }
  }

The result we must use is n, the number of elements that were read (which may be as little as zero).

  • C stdio, scanf:

  for (int a, b, c; scanf("%d %d %d", &a, &b, &c) == 3; ) {
      consume(a, b, c);
  }

The result we must use is the return value of scanf, the number of elements converted.

  • C++, iostreams formatted extraction:

  for (int n; std::cin >> n; ) {
      consume(n);
  }

The result we must use is std::cin itself, which can be evaluated in a boolean context and tells us whether the extraction succeeded, that is, whether the stream has not entered a failed state (!fail()).

  • C++, iostreams getline:

  for (std::string line; std::getline(std::cin, line); ) {
      consume(line);
  }

The result we must use is again std::cin, just as before.

  • POSIX, write(2) to flush a buffer:

  char const * p = buf;
  ssize_t n = bufsize;
  for (ssize_t k = bufsize; (k = write(fd, p, n)) > 0; p += k, n -= k) {}
  if (n != 0) { /* error, failed to write complete buffer */ }

The result we use here is k, the number of bytes written. The point here is that we can only know how many bytes were written after the write operation.

  • POSIX getline():

  char *buffer = NULL;
  size_t bufsiz = 0;
  ssize_t nbytes;
  while ((nbytes = getline(&buffer, &bufsiz, fp)) != -1)
  {
      /* Use nbytes of data in buffer */
  }
  free(buffer);

The result we must use is nbytes, the number of bytes up to and including the newline (or EOF if the file did not end with a newline).

Note that the function explicitly returns -1 (and not EOF!) when an error occurs or it reaches EOF.

You may notice that we very rarely spell out the actual word "EOF". We usually detect the error condition in some other way that is more immediately interesting to us (e.g. failure to perform as much I/O as we had desired). In every example there is some API feature that could tell us explicitly that the EOF state has been encountered, but this is in fact not a terribly useful piece of information. It is much more of a detail than we often care about. What matters is whether the I/O succeeded, more so than how it failed.
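When the reason for the failure does matter, the place to ask is after a read has already failed, not before attempting one. The following is a sketch only: the names `stop_reason` and `sum_ints` are invented for illustration, and the classification is deliberately coarse.

```c
#include <stdio.h>

/* Hypothetical classification of why the read loop stopped. */
enum stop_reason { STOP_EOF, STOP_BAD_INPUT, STOP_READ_ERROR };

/* Hypothetical helper: sums whitespace-separated integers from fp.
 * The loop is driven by the result of fscanf(); feof() and ferror()
 * are consulted only once a read has failed. */
enum stop_reason sum_ints(FILE *fp, long *sum)
{
    int x;

    *sum = 0;
    while (fscanf(fp, "%d", &x) == 1) {   /* attempt first, then use the result */
        *sum += x;
    }

    if (ferror(fp)) {
        return STOP_READ_ERROR;           /* the stream reported an I/O error */
    }
    if (feof(fp)) {
        return STOP_EOF;                  /* the input ran out */
    }
    return STOP_BAD_INPUT;                /* data remains, but it is not an integer */
}
```

Here `feof()` is used the way it was meant to be used: to interpret a failure that has already happened.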

  • A final example that actually queries the EOF state: Suppose you have a string and want to test that it represents an integer in its entirety, with no extra bits at the end except whitespace. Using C++ iostreams, it goes like this:

  std::string input = "   123   ";   // example

  std::istringstream iss(input);
  int value;
  if (iss >> value >> std::ws && iss.get() == EOF) {
      consume(value);
  } else {
      // error, "input" is not parsable as an integer
  }

We use two results here. The first is iss, the stream object itself, to check that the formatted extraction to value succeeded. But then, after also consuming whitespace, we perform another I/O operation, iss.get(), and expect it to fail as EOF, which is the case if the entire string has already been consumed by the formatted extraction.

In the C standard library you can achieve something similar with the strto*l functions by checking that the end pointer has reached the end of the input string.
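A rough C counterpart of the iostreams example might look like this. The helper name `parse_whole_long` is hypothetical, and rejecting out-of-range values is an extra check assumed here rather than something the iostreams version above spells out.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: returns 1 and stores the value in *out if s holds
 * an integer followed by nothing but whitespace; returns 0 otherwise. */
int parse_whole_long(const char *s, long *out)
{
    char *end;
    long value;

    errno = 0;
    value = strtol(s, &end, 10);      /* skips leading whitespace itself */

    if (end == s) {
        return 0;                     /* no digits were converted */
    }
    if (errno == ERANGE) {
        return 0;                     /* value does not fit in a long */
    }
    while (isspace((unsigned char)*end)) {
        end++;                        /* allow trailing whitespace */
    }
    if (*end != '\0') {
        return 0;                     /* extra characters after the number */
    }

    *out = value;
    return 1;
}
```

As with the stream version, the conversion is attempted first, and the end pointer is inspected only afterwards to decide whether the whole string was consumed.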
