How EOF is defined for binary and ASCII files
Question
I'm programming in C on Windows (the system language is Japanese), and I have a question about EOF in binary and ASCII files.
I asked this question last week and a kind person helped me, but I still don't really understand how the program behaves when reading a binary or an ASCII file.
I ran the following tests:
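Test 1:
FILE *iFile;
int oneChar;  /* int, not char, so that EOF can be told apart from any byte */
iFile = fopen("myFile.tar.gz", "rb");  /* binary mode */
while ((oneChar = fgetc(iFile)) != EOF) {
    printf("%d ", oneChar);
}
fclose(iFile);
Test 2:
FILE *iFile;
int oneChar;
iFile = fopen("myFile.tar.gz", "r");  /* text mode */
while ((oneChar = fgetc(iFile)) != EOF) {
    printf("%d ", oneChar);
}
fclose(iFile);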
In Test 1, everything worked perfectly for both binary and ASCII files. In Test 2, however, the program stopped reading when it encountered 0x1A in a binary file. (Does this mean that 1A == EOF?) The ASCII table tells me that 1A is a control character called substitute (whatever that means...), yet printf("%d", EOF) gave me -1...
I also found this question, which tells me that the OS knows exactly where a file ends, so I don't really need to look for EOF inside the file, because EOF is outside the range of a byte (but then what about 1A?).
Can someone clear things up a little for me? Thanks in advance.
Recommended answer
This is a Windows-specific convention for text files: when a stream is opened in text mode, the SUB character (0x1A, typed as Ctrl+Z) is interpreted as an end-of-file marker, so fgetc returns EOF when it reaches that byte. You do not need a 1A byte in your text file to get EOF back from fgetc, though: once you reach the actual end of the file, EOF is returned anyway. In binary mode ("rb"), 1A is read as an ordinary byte, which is why Test 1 reads the whole file.
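The difference between the two open modes can be sketched as follows. The helper names write_sample and count_bytes and the file name eof_demo.bin are made up for this illustration. The sketch assumes a Windows C runtime for the text-mode effect; on POSIX systems "r" and "rb" behave the same, so both modes would read every byte there.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: writes the three bytes 'A', 0x1A, 'B' to path.
   Returns 0 on success, -1 on failure. */
static int write_sample(const char *path)
{
    static const unsigned char bytes[] = { 'A', 0x1A, 'B' };
    size_t written;
    FILE *f;

    assert(path != NULL);
    f = fopen(path, "wb");  /* binary mode: bytes are stored unchanged */
    if (f == NULL)
        return -1;
    written = fwrite(bytes, 1, sizeof bytes, f);
    if (fclose(f) != 0 || written != sizeof bytes)
        return -1;
    return 0;
}

/* Hypothetical helper: counts how many bytes fgetc delivers before it
   returns EOF, using the given fopen mode ("r" or "rb").
   Returns -1 if the file cannot be opened. */
static long count_bytes(const char *path, const char *mode)
{
    long n = 0;
    FILE *f;

    assert(path != NULL && mode != NULL);
    f = fopen(path, mode);
    if (f == NULL)
        return -1;
    while (fgetc(f) != EOF)
        n++;
    fclose(f);
    return n;
}
```

After write_sample("eof_demo.bin"), count_bytes("eof_demo.bin", "rb") reports 3 on any platform, while count_bytes("eof_demo.bin", "r") is expected to report 1 on Windows, because text mode stops at the 0x1A byte.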
The standard does not define 1A as the char value that represents EOF. The EOF constant is of type int, with a negative value outside the range of unsigned char. In fact, the reason fgetc returns an int rather than a char is so that it can return this special EOF value.
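To illustrate why the return type matters, here is a minimal sketch. The helper count_until_eof is hypothetical, and it assumes the usual case where EOF is -1 and converting 255 to signed char yields -1. It reads a stream twice: once keeping the fgetc result in an int, and once narrowing it to signed char first, which imitates the common "char c = fgetc(f)" mistake.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: counts the bytes read from f before the loop
   believes it has seen EOF. When narrow is nonzero, each result is
   first squeezed into a signed char, imitating "char c = fgetc(f)". */
static long count_until_eof(FILE *f, int narrow)
{
    long n = 0;

    assert(f != NULL);
    for (;;) {
        int c = fgetc(f);  /* 0..255 for a byte, or EOF (negative) */
        int seen = narrow ? (signed char)c : c;
        if (seen == EOF)
            break;         /* real EOF, or a 0xFF byte mistaken for it */
        n++;
    }
    return n;
}
```

For a stream that holds the bytes 0xFF and 'A', the int version counts 2 bytes, while the narrowed version stops at once with 0, because 0xFF becomes -1 and compares equal to EOF.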