Difference between int and char in getchar/fgetc and putchar/fputc?
Question
I am trying to learn C on my own and I'm kind of confused by getchar and putchar:
1
#include <stdio.h>
int main(void)
{
    char c;
    printf("Enter characters : ");
    while ((c = getchar()) != EOF) {
        putchar(c);
    }
    return 0;
}
2
#include <stdio.h>
int main(void)
{
    int c;
    printf("Enter characters : ");
    while ((c = getchar()) != EOF) {
        putchar(c);
    }
    return 0;
}
The C library function int putchar(int c) writes a character (an unsigned char) specified by the argument c to stdout.
The C library function int getchar(void) gets a character (an unsigned char) from stdin. This is equivalent to getc with stdin as its argument.
Does this mean that putchar() accepts both int and char, or either of them? And for getchar(), should we use an int or a char?
Recommended answer
TL;DR: char c; c = getchar(); is wrong, broken and buggy. int c; c = getchar(); is correct.
This applies to getc and fgetc as well, if not even more so, because one often reads until the end of the file.
Always store the return value of getchar (fgetc, getc, ...) (and putchar) initially into a variable of type int.
The argument to putchar can be any of int, char, signed char or unsigned char; the type doesn't matter, and all of them work the same, even though for characters at or above '\200' (128) one type may pass a positive integer and another a negative one.
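The point about argument types can be checked with a small sketch. It is not part of the original answer: the helper name same_byte_for_all_types is hypothetical, it assumes 8-bit characters, and it uses fputc on a caller-supplied stream instead of putchar so that no raw bytes land on the terminal (putchar(c) behaves like putc(c, stdout)).

```c
#include <stdio.h>

/* Hypothetical helper: writes the byte 0x80 (octal 200, decimal 128) three
 * times, passing it as char, signed char, and unsigned char. Returns 1 if
 * fputc reports the same byte value for all three, 0 otherwise.
 * Assumes 8-bit characters. */
int same_byte_for_all_types(FILE *out)
{
    char plain = '\200';        /* -128 or 128, depending on the platform */
    signed char s = -128;
    unsigned char u = 128;

    /* fputc converts its int argument to unsigned char before writing and
     * returns that converted value on success. */
    return fputc(plain, out) == 128
        && fputc(s, out) == 128
        && fputc(u, out) == 128;
}
```

Whichever type carried the value, the stream receives the same byte three times.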
The reason you must use int to store the return value of both getchar and putchar is that when the end-of-file condition is reached (or an I/O error occurs), both return the value of the macro EOF, which is a negative integer constant (usually -1).
For getchar, if the return value is not EOF, it is the character read, as an unsigned char, zero-extended to an int. That is, assuming 8-bit characters, the returned values are 0...255 or the value of the macro EOF; there is no way to squeeze these 257 distinct values into the 256 values of an 8-bit char so that each of them can be identified uniquely.
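A minimal sketch of the pattern this implies, not taken from the original answer: the helper name copy_bytes is hypothetical. It keeps the result of fgetc in an int so that every byte value stays distinct from EOF, and it calls ferror afterwards because EOF is returned for both end of file and read errors.

```c
#include <stdio.h>

/* Hypothetical helper: copies every byte from `in` to `out`.
 * Returns the number of bytes copied, or -1 on a read or write error. */
long copy_bytes(FILE *in, FILE *out)
{
    long count = 0;
    int c;                      /* int, not char: must hold every byte value and EOF */

    while ((c = fgetc(in)) != EOF) {
        if (fputc(c, out) == EOF) {
            return -1;          /* write error */
        }
        count++;
    }
    /* EOF alone does not say why reading stopped; ferror tells the cases apart. */
    return ferror(in) ? -1 : count;
}
```

Calling copy_bytes(stdin, stdout) from main would behave like the int version of the loop in the question, including for input that contains the byte \377.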
Now, if you stored it into a char instead, the effect would depend on whether the plain char type is signed or unsigned by default! This varies from compiler to compiler and from architecture to architecture. If char is signed and EOF is defined as -1, then both EOF and the input character '\377' would compare equal to EOF; both would be sign-extended to (int)-1.
On the other hand, if char is unsigned (as it is by default on ARM processors, including Raspberry Pi systems, and seemingly on AIX too), no value that could be stored in c would compare equal to -1, including EOF; instead of breaking out on EOF, your code would output a \377 character each time EOF is returned.
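Both cases can be reproduced without depending on the platform default by spelling out the signedness. The sketch below is illustrative only: all three function names are hypothetical, and it assumes 8-bit characters, an int wider than char, EOF defined as -1, and a compiler that wraps out-of-range values when converting to signed char (that conversion is implementation-defined).

```c
#include <limits.h>
#include <stdio.h>

/* Returns 1 if `value`, after being squeezed into a signed char, compares
 * equal to EOF. Converting 255 to signed char is implementation-defined;
 * mainstream compilers wrap it to -1. */
int signed_char_matches_eof(int value)
{
    signed char c = (signed char)value;
    return c == EOF;
}

/* Same check with an unsigned char. The stored value is promoted to a
 * non-negative int, so it cannot equal the negative EOF. */
int unsigned_char_matches_eof(int value)
{
    unsigned char c = (unsigned char)value;
    return c == EOF;
}

/* Reports whether plain char is signed in this build. */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}
```

Under those assumptions, signed_char_matches_eof reports a match for both EOF and the byte 0xFF, while unsigned_char_matches_eof reports no match for any argument; plain_char_is_signed shows which of the two behaviours a plain char would follow.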
The danger here is that with a signed char the code seems to work correctly even though it is still horribly broken: one of the legal input values is interpreted as EOF. Furthermore, C89, C99 and C11 do not mandate a value for EOF; they only say that EOF is a negative integer constant. Thus instead of -1 it could just as well be, say, -224 on a particular implementation, which would typically become 32 once truncated to an 8-bit char and so make EOF indistinguishable from a space.
gcc has the switch -funsigned-char, which can be used to make char unsigned on those platforms where it defaults to signed:
% cat test.c
#include <stdio.h>
int main(void)
{
    char c;
    printf("Enter characters : ");
    while ((c = getchar()) != EOF) {
        putchar(c);
    }
    return 0;
}
Now we run it with signed char:
% gcc test.c && ./a.out
Enter characters : sfdasadfdsaf
sfdasadfdsaf
^D
%
Seems to be working right. But with unsigned char:
% gcc test.c -funsigned-char && ./a.out
Enter characters : Hello world
Hello world
���������������������������^C
%
That is, I pressed Ctrl-D many times, but a � was printed for each EOF instead of the loop being broken.
Now, again, in the signed char case, the code cannot distinguish between char 255 and EOF on Linux, which breaks it for binary data and the like:
% gcc test.c && echo -e 'Hello world\0377And some more' | ./a.out
Enter characters : Hello world
%
只有