Why does string extraction from a stream set the eof bit?
Question
Let's say we have a stream containing simply:
    hello
Note that there's no extra \n at the end like there often is in a text file. Now, the following simple code shows that the eof bit is set on the stream after extracting a single std::string.
    #include <iostream>
    #include <sstream>
    #include <string>
    int main(int argc, const char* argv[])
    {
        std::stringstream ss("hello");      // no trailing newline
        std::string result;
        ss >> result;                       // extraction runs into the end of the sequence
        std::cout << ss.eof() << std::endl; // Outputs 1
        return 0;
    }
However, I can't see why this would happen according to the standard (I'm reading C++11 - ISO/IEC 14882:2011(E)).
operator>>(basic_istream<...>&, basic_string<...>&) is defined as behaving like a formatted input function. This means it constructs a sentry object which proceeds to eat away whitespace characters. In this example, there are none, so the sentry construction completes with no problems. When converted to a bool, the sentry object gives true, so the extractor continues to get on with the actual extraction of the string.
The extraction is then defined as:
Characters are extracted and appended until any of the following occurs:
- n characters are stored;
- end-of-file occurs on the input sequence;
- isspace(c,is.getloc()) is true for the next available input character c.
After the last character (if any) is extracted, is.width(0) is called and the sentry object k is destroyed. If the function extracts no characters, it calls is.setstate(ios::failbit), which may throw ios_base::failure (27.5.5.4).
Nothing here actually causes the eof bit to be set. Yes, extraction stops if it hits the end-of-file, but it doesn't set the bit. In fact, the eof bit should only be set if we do another ss >> result;, because when the sentry attempts to gobble up whitespace, the following situation will occur:
If is.rdbuf()->sbumpc() or is.rdbuf()->sgetc() returns traits::eof(), the function calls setstate(failbit | eofbit)
However, this is definitely not happening yet because the failbit isn't being set.
The consequence of the eof bit being set is that the only reason the evil idiom while (!stream.eof()) doesn't work when reading files is because of the extra \n at the end and not because the eof bit isn't yet set. My compiler is happily setting the eof bit when the extraction stops at the end of file.
So should this be happening? Or did the standard mean to say that setstate(eofbit) should occur?
To make it easier, the relevant sections of the standard are:
- 21.4.8.9 Inserters and extractors [string.io]
- 27.7.2.2 Formatted input functions [istream.formatted]
- 27.7.2.1.3 Class basic_istream::sentry [istream::sentry]
Answer
std::stringstream is a basic_istream and the operator>> of std::string "extracts" characters from it (as you found out).
27.7.2.1 Class template basic_istream
2 If rdbuf()->sbumpc() or rdbuf()->sgetc() returns traits::eof(), then the input function, except as explicitly noted otherwise, completes its actions and does setstate(eofbit), which may throw ios_base::failure (27.5.5.4), before returning.
Also, "extracting" means calling these two functions.
3 Two groups of member function signatures share common properties: the formatted input functions (or extractors) and the unformatted input functions. Both groups of input functions are described as if they obtain (or extract) input characters by calling rdbuf()->sbumpc() or rdbuf()->sgetc(). They may use other public members of istream.
So eof must be set.