Text mode fseek/ftell

Problem description

I recently ran into an "issue" related to text files and ftell/fseek,
and I'd like to know if it's a bug, or simply an annoying, but still
conforming, implementation.

The platform is Windows, where text files use CR+LF (0x0d, 0x0a) to
mark end-of-line. The file in question, however, was in Unix format,
with only LF (0x0a) at the end of each line.

First, does the above situation already invoke "implementation defined"
or "undefined" behavior? Or is it still "defined"?

The problem comes in how ftell() reports the current position. (And,
subsequently fseek()ing back to the same position is wrong.)

Suppose that you have fread() the following 12 characters, starting at
the beginning of the file:

'1' '2' '3' '4' '5' 0x0a '1' '2' '3' '4' '5' 0x0a

(Remember, this file is in Unix format, with a single 0x0a for end-of-
line.)

While you are now at offset 12 within the file, ftell() will return 14,
because it assumes that those '\n' newlines are really CR+LF, and that
the CR was stripped off when read. (Had this file been in Windows format,
you really would be at offset 14 after reading those 12 characters.) For
each 0x0a returned by fread(), ftell() will assume you have advanced two
characters in the file.

The net result here is that a subsequent fseek() to the same position
will be wrong.

So, have I invoked undefined behavior by reading a Unix text file in a
Windows environment? Or is the compiler allowed to return the "wrong"
value as part of an "implementation defined" restriction? Or is this
a bug in the compiler's runtime library?

--
+-------------------------+--------------------+-----------------------------+
| Kenneth J. Brody | www.hvcomputer.com | |
| kenbrody/at\spamcop.net | www.fptech.com | #include <std_disclaimer.h> |
+-------------------------+--------------------+-----------------------------+
Don't e-mail me at: <mailto:Th*************@gmail.com>

Recommended answer

On Fri, 31 Mar 2006 10:21:47 -0500, Kenneth Brody wrote:
> I recently ran into an "issue" related to text files and ftell/fseek,
> and I'd like to know if it's a bug, or simply an annoying, but still
> conforming, implementation.
>
> The platform is Windows, where text files use CR+LF (0x0d, 0x0a) to
> mark end-of-line. The file in question, however, was in Unix format,
> with only LF (0x0a) at the end of each line.
>
> First, does the above situation already invoke "implementation defined"
> or "undefined" behavior? Or is it still "defined"?

No, you should be OK. You will be able to do more things with fseek if you
open the file as a binary file, but opening it as text should also work if
you keep to the restrictions imposed by the standard (see later).

> The problem comes in how ftell() reports the current position. (And,
> subsequently fseek()ing back to the same position is wrong.)
>
> Suppose that you have fread() the following 12 characters, starting at
> the beginning of the file:
>
> '1' '2' '3' '4' '5' 0x0a '1' '2' '3' '4' '5' 0x0a
>
> (Remember, this file is in Unix format, with a single 0x0a for end-of-
> line.)
>
> While you are now at offset 12 within the file, ftell() will return 14,
> because it assumes that those '\n' newlines are really CR+LF, and that
> the CR was stripped off when read. (Had this file been in Windows
> format, you really would be at offset 14 after reading those 12
> characters.) For each 0x0a returned by fread(), ftell() will assume you
> have advanced two characters in the file.

Actually, you can't say anything about the numbers. For a text file,
ftell does not give you the offset. It returns a code that can only be
used by fseek. You may be right about how your implementation is encoding
the data, but you get a clearer understanding of the restrictions imposed
by the standard if you take it at face value -- ftell returns something
you can do nothing with except pass it to fseek.

> The net result here is that a subsequent fseek() to the same position
> will be wrong.

The standard allows one to fseek using *only* SEEK_SET and the result of a
previous call to ftell (or an offset of 0). If that is all you have done,
and you did not get back to where you expected, then it would seem that
you have a non-compliant library.

If you used your own idea of the stream position (not the result from
ftell), or you used SEEK_END or SEEK_CUR with a non-zero offset, then all
bets are off.

> So, have I invoked undefined behavior by reading a Unix text file in a
> Windows environment? Or is the compiler allowed to return the "wrong"
> value as part of an "implementation defined" restriction? Or is this
> a bug in the compiler's runtime library?




An example program with what you expect and what happens might make
everything clearer.

--
Ben.
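
Along the lines of the example program Ben asks for, here is a minimal C sketch of the one round trip the standard does promise for a text stream: save the value from ftell(), read on, and return with fseek(fp, pos, SEEK_SET). The helper names write_unix_file() and text_roundtrip_ok() and the sample data are made up for illustration and do not come from the thread. The test file is written in binary mode so that it contains bare LF on every platform.

```c
#include <stdio.h>
#include <string.h>

/* Write a small LF-only ("Unix format") file. Binary mode is used so that
   no newline translation happens on any platform.
   Returns 0 on success, -1 on failure. */
static int write_unix_file(const char *path)
{
    static const char data[] = "12345\n12345\nabcde\n";
    size_t n;
    FILE *fp = fopen(path, "wb");

    if (fp == NULL)
        return -1;
    n = fwrite(data, 1, sizeof data - 1, fp);
    if (fclose(fp) != 0 || n != sizeof data - 1)
        return -1;
    return 0;
}

/* Open `path` in text mode, read 12 characters, remember the position with
   ftell(), read the next 5 characters, seek back with SEEK_SET, and read
   those 5 characters again.
   Returns 1 if both reads match, 0 if they differ, -1 on I/O error. */
static int text_roundtrip_ok(const char *path)
{
    char skip[12], first[5], second[5];
    int result = -1;
    FILE *fp = fopen(path, "r");              /* text mode */

    if (fp == NULL)
        return -1;
    if (fread(skip, 1, sizeof skip, fp) == sizeof skip) {
        long pos = ftell(fp);                 /* opaque token, not an offset */
        if (pos != -1L
            && fread(first, 1, sizeof first, fp) == sizeof first
            && fseek(fp, pos, SEEK_SET) == 0
            && fread(second, 1, sizeof second, fp) == sizeof second)
            result = (memcmp(first, second, sizeof first) == 0);
    }
    fclose(fp);
    return result;
}
```

Calling write_unix_file("unix.txt") and then text_roundtrip_ok("unix.txt") should return 1 on a library that keeps the ftell/fseek promise. A result of 0 on the Windows runtime in question would match the behavior described in the original post. Note that the code never inspects the numeric value of pos; it only hands it back to fseek().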


On Fri, 31 Mar 2006 10:21:47 -0500, Kenneth Brody
<ke******@spamcop.net> wrote in comp.lang.c:
> I recently ran into an "issue" related to text files and ftell/fseek,
> and I'd like to know if it's a bug, or simply an annoying, but still
> conforming, implementation.
>
> The platform is Windows, where text files use CR+LF (0x0d, 0x0a) to
> mark end-of-line. The file in question, however, was in Unix format,
> with only LF (0x0a) at the end of each line.
>
> First, does the above situation already invoke "implementation defined"
> or "undefined" behavior? Or is it still "defined"?
>
> The problem comes in how ftell() reports the current position. (And,
> subsequently fseek()ing back to the same position is wrong.)
>
> Suppose that you have fread() the following 12 characters, starting at
> the beginning of the file:
>
> '1' '2' '3' '4' '5' 0x0a '1' '2' '3' '4' '5' 0x0a
>
> (Remember, this file is in Unix format, with a single 0x0a for end-of-
> line.)
>
> While you are now at offset 12 within the file, ftell() will return 14,
> because it assumes that those '\n' newlines are really CR+LF, and that
> the CR was stripped off when read. (Had this file been in Windows format,
> you really would be at offset 14 after reading those 12 characters.) For
> each 0x0a returned by fread(), ftell() will assume you have advanced two
> characters in the file.
>
> The net result here is that a subsequent fseek() to the same position
> will be wrong.
>
> So, have I invoked undefined behavior by reading a Unix text file in a
> Windows environment? Or is the compiler allowed to return the "wrong"
> value as part of an "implementation defined" restriction? Or is this
> a bug in the compiler's runtime library?




In addition to Ben's pointing out correctly issues about fseek() and
ftell() limitations, you left out one piece of important information,
namely did you open the file in text or binary mode?

If you open a file in text mode, and it does not actually contain the
format for text files on your platform, you are lying to your compiler
and its library functions. If you lie to your compiler, it will get
its revenge.

--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://c-faq.com/
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++
http://www.contrib.andrew.cmu.edu/~a...FAQ-acllc.html
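
If plain byte offsets are what you need, opening the file in binary mode is the usual way out, which is where Jack's question is heading. For a binary stream the standard defines the ftell() value as the number of characters from the beginning of the file, so no newline guessing is involved. A small sketch, with hypothetical helper names write_bytes() and binary_offset_after() that are not from the thread:

```c
#include <stdio.h>

/* Write `len` bytes to `path` in binary mode, so nothing is translated.
   Returns 0 on success, -1 on failure. */
static int write_bytes(const char *path, const char *data, size_t len)
{
    size_t n;
    FILE *fp = fopen(path, "wb");

    if (fp == NULL)
        return -1;
    n = fwrite(data, 1, len, fp);
    if (fclose(fp) != 0 || n != len)
        return -1;
    return 0;
}

/* Read `count` bytes (at most 64) from the start of `path` in binary mode
   and return the position reported by ftell(), or -1L on error. */
static long binary_offset_after(const char *path, size_t count)
{
    char buf[64];
    long pos = -1L;
    FILE *fp;

    if (count > sizeof buf)
        return -1L;
    fp = fopen(path, "rb");                   /* binary mode */
    if (fp == NULL)
        return -1L;
    if (fread(buf, 1, count, fp) == count)
        pos = ftell(fp);                      /* a real byte offset here */
    fclose(fp);
    return pos;
}
```

With the 12-byte sample from the original post written by write_bytes(), binary_offset_after(path, 12) returns 12 whether the file uses LF or the platform's native line ending. The price is that the program now sees any CR characters itself.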


On Fri, 31 Mar 2006 13:03:01 -0600, Jack Klein wrote:
> On Fri, 31 Mar 2006 10:21:47 -0500, Kenneth Brody
> <ke******@spamcop.net> wrote in comp.lang.c:
>
>> I recently ran into an "issue" related to text files and ftell/fseek,
>> and I'd like to know if it's a bug, or simply an annoying, but still
>> conforming, implementation.

<snip>

> In addition to Ben's pointing out correctly issues about fseek() and
> ftell() limitations, you left out one piece of important information,
> namely did you open the file in text or binary mode?
>
> If you open a file in text mode, and it does not actually contain the
> format for text files on your platform, you are lying to your compiler
> and its library functions. If you lie to your compiler, it will get
> its revenge.




Ah. Had I known this I would have simplified my answer to "open it in
binary mode" since from the translations that the OP reports one can tell
that the file is opened as text. I had always assumed that the library
had to keep track of what it had been doing with line endings (both
native and foreign) in order to keep its ftell/fseek promise.

It seems to me that since you can only seek to somewhere you have been
before (by reading) that it would have been possible for the standard to
give the stronger guarantee: that foreign-format text files can be
manipulated just like native ones. If this was not done, why? Is it more
complicated than I imagine?

--
Ben.
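
One way to handle foreign-format and native text files alike, in the spirit of Ben's closing question, is to open the file in binary mode and deal with the line endings yourself. The sketch below is not from the thread: read_line_any_eol() is a hypothetical helper that accepts either LF or CR+LF as the line terminator, assuming the stream was opened with "rb" so that ftell() keeps returning plain byte offsets.

```c
#include <stdio.h>

/* Read one line from a stream opened in binary mode, accepting either LF
   or CR+LF as the terminator. The terminator is not stored. Lines longer
   than size-1 characters are truncated; the excess is read and discarded.
   Returns the number of characters stored in `buf`, or -1 if end of file
   (or an error) is hit before anything is read. */
static long read_line_any_eol(FILE *fp, char *buf, size_t size)
{
    size_t len = 0;
    int c, prev = EOF, prev_stored = 0, got_any = 0;

    if (size == 0)
        return -1;
    while ((c = getc(fp)) != EOF) {
        got_any = 1;
        if (c == '\n') {
            /* Drop the CR of a CR+LF pair if it made it into the buffer. */
            if (prev == '\r' && prev_stored)
                len--;
            break;
        }
        prev_stored = (len + 1 < size);       /* keep room for the '\0' */
        if (prev_stored)
            buf[len++] = (char)c;
        prev = c;
    }
    if (!got_any)
        return -1;
    buf[len] = '\0';
    return (long)len;
}
```

A lone CR that is not followed by LF is kept as ordinary data, which is a design choice of this sketch rather than anything the standard requires.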

