Why does string extraction from a stream set the eof bit?


Problem description


    Let's say we have a stream containing simply:

    hello
    

    Note that there's no extra \n at the end like there often is in a text file. Now, the following simple code shows that the eof bit is set on the stream after extracting a single std::string.

    #include <iostream>
    #include <sstream>
    #include <string>

    int main(int argc, const char* argv[])
    {
      std::stringstream ss("hello");
      std::string result;
      ss >> result;
      std::cout << ss.eof() << std::endl; // Outputs 1
      return 0;
    }
    

    However, I can't see why this would happen according to the standard (I'm reading C++11 - ISO/IEC 14882:2011(E)). operator>>(basic_istream<...>&, basic_string<...>&) is defined as behaving like a formatted input function. This means it constructs a sentry object, which proceeds to eat away whitespace characters. In this example there are none, so the sentry construction completes with no problems. When converted to a bool, the sentry object gives true, so the extractor gets on with the actual extraction of the string.

    The extraction is then defined as:

    Characters are extracted and appended until any of the following occurs:

    • n characters are stored;
    • end-of-file occurs on the input sequence;
    • isspace(c,is.getloc()) is true for the next available input character c.

    After the last character (if any) is extracted, is.width(0) is called and the sentry object k is destroyed. If the function extracts no characters, it calls is.setstate(ios::failbit), which may throw ios_base::failure (27.5.5.4).

    Nothing here actually causes the eof bit to be set. Yes, extraction stops if it hits the end-of-file, but it doesn't set the bit. In fact, the eof bit should only be set if we do another ss >> result;, because when the sentry attempts to gobble up whitespace, the following situation will occur:

    If is.rdbuf()->sbumpc() or is.rdbuf()->sgetc() returns traits::eof(), the function calls setstate(failbit | eofbit)

    However, this is definitely not happening yet because the failbit isn't being set.

    If the eof bit really is set at this point, then the infamous while (!stream.eof()) idiom fails when reading files only because of the extra \n at the end, and not because the eof bit is set too late. My compiler happily sets the eof bit when the extraction stops at the end of the file.

    So should this be happening? Or did the standard mean to say that setstate(eofbit) should occur?


    To make it easier, the relevant sections of the standard are:

    • 21.4.8.9 Inserters and extractors [string.io]
    • 27.7.2.2 Formatted input functions [istream.formatted]
    • 27.7.2.1.3 Class basic_istream::sentry [istream::sentry]

    Solution

    std::stringstream is a basic_istream and the operator>> of std::string "extracts" characters from it (as you found out).

    27.7.2.1 Class template basic_istream

    2 If rdbuf()->sbumpc() or rdbuf()->sgetc() returns traits::eof(), then the input function, except as explicitly noted otherwise, completes its actions and does setstate(eofbit), which may throw ios_base::failure (27.5.5.4), before returning.

    Also, "extracting" means calling these two functions.

    3 Two groups of member function signatures share common properties: the formatted input functions (or extractors) and the unformatted input functions. Both groups of input functions are described as if they obtain (or extract) input characters by calling rdbuf()->sbumpc() or rdbuf()->sgetc(). They may use other public members of istream.

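    So eofbit must be set: to decide whether the word has ended, the extractor has to look at the character after the final 'o', that call returns traits::eof(), and paragraph 2 then requires setstate(eofbit).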
