Why is the value returned by fstream::tellg() increased by the number of newlines in the input text file when the file is formatted for Windows (\r\n)?


Problem description

The program opens an input file and prints the current read/write position several times.

If the file uses '\n' for newlines, the values are as expected: 0, 1, 2, 3.

On the other hand, if the newline is '\r\n', it appears that after some reading the positions returned by all tellg() calls are offset by the number of newlines in the file; the output is: 0, 5, 6, 7.

Every value returned after the first read is increased by 4, which is the number of newlines in the example input file.

#include <fstream>
#include <iostream>
#include <iomanip>
using std::cout;
using std::setw;
using std::endl;

int main()
{
    std::fstream ioff("su9.txt");
    if(!ioff) return -1;
    int c = 0;

    cout << setw(30) << std::left << " Before any operation " << ioff.tellg() << endl;

    c = ioff.get();
    cout << setw(30) << std::left << " After first 'get' " << ioff.tellg() << " Character read: " << (char)c << endl;

    c = ioff.get();
    cout << setw(30) << std::left << " After second 'get' " << ioff.tellg() << " Character read: " << (char)c << endl;

    c = ioff.get();
    cout << setw(30) << std::left << " Third 'get' " << ioff.tellg() << "\t\tCharacter read: " << (char)c << endl;

    return 0;
}

The input file is 5 lines long (it has 4 newlines), with the following content:

-------------------------------------------
abcd
efgh
ijkl


--------------------------------------------

output (\n):

Before any operation         0
After first 'get'            1      Character read: a
After second 'get'           2      Character read: b
Third 'get'                  3      Character read: c

output (\r\n):

Before any operation         0
After first 'get'            5      Character read: a
After second 'get'           6      Character read: b
Third 'get'                  7      Character read: c

Notice that the character values are read correctly.

Solution

The first, and most obvious, question is why you expect any particular values when the results of tellg are converted to an integral type. The only defined use of the result of tellg is as a later argument to seekg; it has no defined numerical significance whatsoever.
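To illustrate the one use that is defined, here is a minimal sketch that treats the result of tellg as an opaque token and only ever hands it back to seekg. The helper name `reread_matches` is hypothetical and not part of the original question; it is an illustration under those assumptions, not the poster's code.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper (not from the original post): remembers a position with
// tellg(), reads a character, seeks back with seekg(), and reads again.
// Returns true when both reads produce the same character.
bool reread_matches(const std::string& path) {
    std::ifstream in(path);  // text mode, the default
    if (!in) return false;

    in.get();  // advance past the first character
    const std::ifstream::pos_type mark = in.tellg();  // opaque position token
    if (mark == std::ifstream::pos_type(-1)) return false;  // tellg() failed

    const int first = in.get();
    in.seekg(mark);  // hand the token back: the defined use of tellg's result
    const int second = in.get();

    return in.good() && first == second;
}
```

The token is never printed or compared against an expected number here, which is the point: the code only relies on seekg returning to the place tellg reported.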

Having said that: in Unix and Windows implementations, the values will practically always correspond to the byte offset of the physical position in the file. This means that they will have some meaning if the file is opened in binary mode. Under Windows, for example, text mode (the default) maps the two-character sequence 0x0D, 0x0A in the file to the single character '\n', and treats the single character 0x1A as if it had encountered end of file. (Binary and text mode are identical under Unix, so things often seem to work there even when they aren't guaranteed.)
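If the numeric offsets matter, opening the file in binary mode is the usual way to make them meaningful on these platforms. The sketch below is an assumption-laden illustration rather than the original program: `binary_positions` is a hypothetical helper that records the position before any read and after each successful `get()`.

```cpp
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: opens `path` in binary mode and records the stream
// position before any read and after each of up to `reads` calls to get().
// In binary mode no "\r\n" -> '\n' translation takes place, so on Unix and
// Windows implementations the values are, in practice, plain byte offsets.
std::vector<std::streamoff> binary_positions(const std::string& path, int reads) {
    std::vector<std::streamoff> positions;
    std::ifstream in(path, std::ios::in | std::ios::binary);
    if (!in) return positions;  // empty result if the file cannot be opened

    positions.push_back(static_cast<std::streamoff>(in.tellg()));
    for (int i = 0; i < reads; ++i) {
        if (in.get() == std::char_traits<char>::eof()) break;
        positions.push_back(static_cast<std::streamoff>(in.tellg()));
    }
    return positions;
}
```

Note that in binary mode the program sees 0x0D and 0x0A as two separate characters, so any code that depends on a single '\n' per line has to handle the carriage return itself.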

I might add that I cannot reproduce your results with MSC++. Not that that means anything; as I said, the only requirement for tellg is that the returned value can be used in a seekg to return to the same place. (Another issue might be how you created the files. Might one of them start with a UTF-8 encoding of a BOM, for example, and the other not?)
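One way to check the BOM hypothesis is to look at the first three bytes of each file in binary mode. This is only a sketch: `starts_with_utf8_bom` is a hypothetical helper name, and it tests for the UTF-8 byte order mark bytes 0xEF, 0xBB, 0xBF.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: returns true if the file begins with the three bytes
// of a UTF-8 byte order mark (0xEF 0xBB 0xBF). The file is opened in binary
// mode so that the bytes are read exactly as stored.
bool starts_with_utf8_bom(const std::string& path) {
    std::ifstream in(path, std::ios::in | std::ios::binary);
    if (!in) return false;

    unsigned char head[3] = {0, 0, 0};
    in.read(reinterpret_cast<char*>(head), 3);

    return in.gcount() == 3 &&
           head[0] == 0xEF && head[1] == 0xBB && head[2] == 0xBF;
}
```

Running it on both versions of su9.txt would show whether the two files differ in anything besides their line endings.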
