Using seekg() in text mode

Problem description

While trying to read a simple ANSI-encoded text file in text mode (Windows), I came across some strange behaviour with seekg() and tellg(). Any time I called tellg(), saved its value (as pos_type), and then sought back to it later, I would always wind up further ahead in the stream than where I left off.

Eventually I did a sanity check; even if I just do this...

#include <fstream>
#include <string>

int main()
{
   std::ifstream dataFile("myfile.txt",
         std::ifstream::in);
   if (dataFile.is_open() && !dataFile.fail())
   {
      while (dataFile.good())
      {
         std::string line;
         // Seeking to the position just reported should be a no-op.
         dataFile.seekg(dataFile.tellg());
         std::getline(dataFile, line);
      }
   }
}

...then eventually, further into the file, lines are half cut-off. Why exactly is this happening?

Solution

This issue is caused by libstdc++ computing the current offset as the difference between the file position reported by lseek64 and the number of bytes remaining in its buffer.

The buffer size is set from the return value of read, which for a text-mode file on Windows is the number of bytes placed in the buffer after end-of-line conversion (i.e. each 2-byte \r\n line ending is converted to \n; Windows also seems to append a spurious newline to the end of the file).

lseek64, however (which with mingw results in a call to _lseeki64), returns the current absolute file position. Once the two values are subtracted, you end up with an offset that is off by 1 for each remaining newline in the text file (+1 for the extra newline).
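The arithmetic described above can be sketched with a small model. This is not libstdc++ code: translate_text_mode and estimated_offset are hypothetical helpers written for illustration. The sketch assumes the whole file was read into the buffer in a single read, and it leaves out the spurious trailing newline mentioned above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: mimic text-mode translation by collapsing
// every "\r\n" pair into a single "\n".
std::string translate_text_mode(const std::string& raw) {
    std::string out;
    for (std::size_t i = 0; i < raw.size(); ++i) {
        if (raw[i] == '\r' && i + 1 < raw.size() && raw[i + 1] == '\n')
            continue;  // drop the '\r' of a "\r\n" pair
        out.push_back(raw[i]);
    }
    return out;
}

// Hypothetical model of the offset computation: the absolute
// (untranslated) file position minus the number of translated bytes
// still unread in the buffer. `consumed` counts translated characters
// already handed to the caller.
long long estimated_offset(const std::string& raw, std::size_t consumed) {
    const std::string buffered = translate_text_mode(raw);
    assert(consumed <= buffered.size());
    // Assumption: the whole file was read at once, so the OS-level
    // position sits at the end of the raw bytes.
    const long long os_position = static_cast<long long>(raw.size());
    const long long remaining =
        static_cast<long long>(buffered.size() - consumed);
    return os_position - remaining;
}
```

For the 8-byte input "ab\r\ncd\r\n", estimated_offset(raw, 1) gives 3 although the true position after 'a' is 1: the result is off by 2, one for each newline still ahead. After three characters the estimate is 5 against a true position of 4, and once everything is consumed the estimate (8) is correct again.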

The following code should reproduce the issue. You can even use a file with a single character and no newlines, because of the extra newline inserted by Windows.

#include <iostream>
#include <fstream>

int main()
{
  std::ifstream f("myfile.txt");

  for (char c; f.get(c);)
    std::cout << f.tellg() << ' ';
}

For a file containing a single 'a' character, I get the following output:

2 3

This is clearly off by 1 for the first call to tellg. After the second call the file position is correct, as the end has been reached after taking the extra newline into account.

Aside from opening the file in binary mode, you can circumvent the issue by disabling buffering:

#include <iostream>
#include <fstream>

int main()
{
  std::ifstream f;
  f.rdbuf()->pubsetbuf(nullptr, 0);
  f.open("myfile.txt");

  for (char c; f.get(c);)
    std::cout << f.tellg() << ' ';
}

However, this is far from ideal.
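The other workaround mentioned above, opening the file in binary mode, might look roughly like the sketch below. The helper names line_offsets and line_at are made up for this example. Since binary mode performs no end-of-line translation, the code strips the trailing '\r' itself when returning a line.

```cpp
#include <cassert>
#include <fstream>
#include <ios>
#include <string>
#include <vector>

// Hypothetical helper: record the byte offset at which each line starts.
// In binary mode no end-of-line translation happens, so the values
// reported by tellg() are plain byte offsets into the file.
std::vector<std::streamoff> line_offsets(const std::string& path) {
    std::ifstream in(path, std::ios::in | std::ios::binary);
    std::vector<std::streamoff> offsets;
    std::string line;
    for (;;) {
        const std::streamoff pos = in.tellg();
        if (!std::getline(in, line))
            break;  // end of file, or the file could not be opened
        offsets.push_back(pos);
    }
    return offsets;
}

// Hypothetical helper: seek to a saved offset and read one line,
// dropping the '\r' that binary mode leaves at the end of CRLF lines.
std::string line_at(const std::string& path, std::streamoff offset) {
    assert(offset >= 0);
    std::ifstream in(path, std::ios::in | std::ios::binary);
    in.seekg(offset);
    std::string line;
    std::getline(in, line);
    if (!line.empty() && line.back() == '\r')
        line.pop_back();
    return line;
}
```

The trade-off is that every consumer of the stream now has to handle "\r\n" line endings by hand, which is presumably why binary mode is presented as a workaround rather than a fix.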

Hopefully mingw / mingw-w64 or gcc can fix this, but first we'll need to determine who would be responsible for fixing it. I suppose the base issue is with Microsoft's implementation of lseek, which should return appropriate values according to how the file has been opened.
