fread is signalling EOF prematurely with a binary file
Problem description
I'm a newcomer to C. I'm attempting to make my own version of base64; the program takes input from stdin and writes its base64 equivalent to stdout. While testing my program against a binary file, I noticed that fread-ing from stdin seemed to return a short count before actually reaching EOF.
Here is the relevant portion of my main method:
int main(void)
{
    unsigned char buffer[BUFFER_SIZE];
    unsigned char base64_buffer[BASE64_BUFFER];
    while (1)
    {
        TRACE_PUTS("Reading in data from stdin...");
        size_t read = fread(buffer, 1, sizeof(buffer), stdin); /* Read the data in using fread(3) */
        /* Process the buffer */
        TRACE_PRINTF("Amount read: %zu\n", read);
        TRACE_PUTS("Beginning base64 encode of buffer");
        size_t encoded = base64_encode(buffer, read, base64_buffer, sizeof(base64_buffer));
        /* Write the data to stdout */
        TRACE_PUTS("Writing data to standard output");
        ...
        if (read < sizeof(buffer))
        {
            break; /* We reached EOF or had an error during the read */
        }
    }
    if (ferror(stdin))
    {
        /* Handle errors */
        fprintf(stderr, "%s\n", "There was a problem reading from the file.");
        exit(1);
    }
    puts(""); /* Output a newline before finishing */
    return 0;
}
As you can see, the main loop calls fread on stdin every iteration to fill a buffer, then at the end checks whether the amount read is less than the size of the buffer. If it is, we assume there was either an error (in which case 0 was returned) or EOF was reached, and exit from the loop.
I am assuming that it is OK to check read for < sizeof(buffer), rather than just != 0, based on this quote from fread's manpage:
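On success, fread() and fwrite() return the number of items read or written. This number equals the number of bytes transferred only when size is 1. If an error occurs, or the end of the file is reached, the return value is a short item count (or zero).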
This implies that if the full buffer is not filled, then EOF has been reached.
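A short count by itself does not say whether the stream hit end-of-file or a read error; feof() and ferror() are the standard way to tell the two apart. The following is a minimal sketch of that check, not the asker's code: drain_stream is a hypothetical helper name, and the 600-byte chunk is an assumption taken from the full reads shown in the trace.

```c
#include <stdio.h>

/* Hypothetical helper: reads fp to the end in fixed-size chunks and
 * returns the total number of bytes seen. Sets *had_error to 1 if the
 * loop stopped because of a read error rather than end-of-file. */
size_t drain_stream(FILE *fp, int *had_error)
{
    unsigned char chunk[600];   /* assumed size, matching the trace */
    size_t total = 0;

    *had_error = 0;
    for (;;) {
        size_t got = fread(chunk, 1, sizeof chunk, fp);
        total += got;           /* a short read can still carry data */

        if (got < sizeof chunk) {
            /* Short count: ask the stream which condition caused it. */
            if (ferror(fp))
                *had_error = 1;
            break;              /* otherwise the EOF indicator is set */
        }
    }
    return total;
}
```

On a stream opened in binary mode, the returned total should equal the file size, which makes it easy to compare against the output of wc -c.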
With that established, this is the trace I get when I run my app against cat /bin/echo:
$ cat /bin/echo | bin/base64 >/dev/null # only view the trace output
TRACE: C:/Users/James/Code/c/base64/main.c:23: Reading in data from stdin...
TRACE: C:/Users/James/Code/c/base64/main.c:28: Amount read: 600
TRACE: C:/Users/James/Code/c/base64/main.c:29: Beginning base64 encode of buffer
TRACE: C:/Users/James/Code/c/base64/main.c:43: Writing data to standard output
TRACE: C:/Users/James/Code/c/base64/main.c:23: Reading in data from stdin...
TRACE: C:/Users/James/Code/c/base64/main.c:28: Amount read: 600
TRACE: C:/Users/James/Code/c/base64/main.c:29: Beginning base64 encode of buffer
TRACE: C:/Users/James/Code/c/base64/main.c:43: Writing data to standard output
TRACE: C:/Users/James/Code/c/base64/main.c:23: Reading in data from stdin...
TRACE: C:/Users/James/Code/c/base64/main.c:28: Amount read: 600
TRACE: C:/Users/James/Code/c/base64/main.c:29: Beginning base64 encode of buffer
TRACE: C:/Users/James/Code/c/base64/main.c:43: Writing data to standard output
TRACE: C:/Users/James/Code/c/base64/main.c:23: Reading in data from stdin...
TRACE: C:/Users/James/Code/c/base64/main.c:28: Amount read: 569
TRACE: C:/Users/James/Code/c/base64/main.c:29: Beginning base64 encode of buffer
TRACE: C:/Users/James/Code/c/base64/main.c:43: Writing data to standard output
$
Here is how large /bin/echo actually is:
$ cat /bin/echo | wc -c
28352
So as you can see, the whole file is 28352 bytes long, but my app only reads about 2400 of them before it stops. Any idea why? Does fread handle null terminators specially?
I am using MinGW-w64 with GCC if that helps; thanks.
Recommended answer
Are you on Windows? Yes, the pathname starts with C:, so you are. You've probably got a Control-Z ('\x1A' or '\32') character in the file. The Windows C run-time (and hence your program) won't treat standard input as a binary file unless you tweak it somehow, so the Control-Z marks the end of the input.
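One way to check this diagnosis: the four reads in the trace add up to 600 + 600 + 600 + 569 = 2369 bytes, so if a Control-Z is the cause, a 0x1A byte should sit at about that offset in the file (possibly a little later, since text mode also collapses CR LF pairs into LF). The sketch below searches a buffer for the first such byte; find_ctrl_z is a hypothetical helper name, not part of any library.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns the offset of the first Control-Z (0x1A)
 * byte in data, or -1 if there is none. */
long find_ctrl_z(const unsigned char *data, size_t len)
{
    const unsigned char *hit = memchr(data, 0x1A, len);
    return hit ? (long)(hit - data) : -1L;
}
```

For the result to be meaningful, the buffer would have to be filled from a file opened with "rb", so that the run-time does not stop at the very byte being searched for.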
One possible 'somehow' to tweak the mode is _set_fmode(). However, it is more likely that you need _setmode():
_setmode(fileno(stdin), O_BINARY);
I reserve judgement on whether that's the best or only method for doing so. You can research the manuals as well as I can. I have no way to test whether fileno() (or perhaps _fileno() in the Microsoft world) is available.
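As a rough sketch of how the mode switch might be wrapped so the same source still builds elsewhere, the helper below uses the underscore-prefixed Microsoft names (_setmode, _fileno, _O_BINARY) when _WIN32 is defined and does nothing on other platforms. set_stdin_binary is a hypothetical name chosen for illustration, and the code has not been tested against the asker's MinGW-w64 setup.

```c
#include <stdio.h>

#ifdef _WIN32
#include <fcntl.h>   /* _O_BINARY */
#include <io.h>      /* _setmode */
#endif

/* Hypothetical helper: switches stdin to binary mode where that
 * distinction exists. Returns 0 on success, -1 on failure. */
int set_stdin_binary(void)
{
#ifdef _WIN32
    /* _setmode returns the previous mode, or -1 on failure. */
    return _setmode(_fileno(stdin), _O_BINARY) == -1 ? -1 : 0;
#else
    /* POSIX streams make no text/binary distinction. */
    return 0;
#endif
}
```

The call belongs at the top of main, before the first fread on stdin, since changing the mode after data has been buffered may leave part of the input translated.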