What does gets() save when it reads just a newline
Here's the description of gets() from Prata's C Primer Plus:
It gets a string from your system's standard input device, normally your keyboard. Because a string has no predetermined length, gets() needs a way to know when to stop. Its method is to read characters until it reaches a newline (\n) character, which you generate by pressing the Enter key. It takes all the characters up to (but not including) the newline, tacks on a null character (\0), and gives the string to the calling program.
It got me curious as to what would happen when gets() reads in just a newline. So I wrote this:
#include <stdio.h>

int main(void)
{
    char input[100];
    while (gets(input))
    {
        printf("This is the input as a string: %s\n", input);
        printf("Is it the string end character? %d\n", input == '\0');
        printf("Is it a newline string? %d\n", input == "\n");
        printf("Is it the empty string? %d\n", input == "");
    }
    return 0;
}
Here's my interaction with the program:
$ ./a.out
This is some string
This is the input as a string: This is some string
Is it the string end character? 0
Is it a newline string? 0
Is it the empty string? 0
This is the input as a string:
Is it the string end character? 0
Is it a newline string? 0
Is it the empty string? 0
The second block is really the thing of interest, when all I press is Enter. What exactly is input in that case? It doesn't seem to be any of my guesses: \0, \n, or "".
This part of the description of gets might be confusing:
It takes all the characters up to (but not including) the newline
It might be better to say that it reads all the characters including the newline but stores all the characters not including the newline.
So if the user enters some string, the gets function will read some string and the newline character from the user's terminal, but store only some string in the buffer - the newline character is lost. This is good, because no one wants the newline character anyway - it's a control character, not a part of the data that the user wanted to enter.
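The difference between what is read and what is stored can be sketched without calling gets itself. The helper below, store_like_gets, is a hypothetical name invented for this illustration, not a library function. It takes the characters as typed (newline included) and fills the buffer the way the description says gets does, with the one deliberate difference that it respects the buffer size.

```c
#include <stddef.h>

/* Hypothetical helper (not part of the C library): copies `typed` into `buf`
 * up to, but not including, the first newline, then appends '\0'.
 * Unlike gets(), it never writes more than `size` bytes.
 * Returns the number of characters stored before the terminator. */
size_t store_like_gets(const char *typed, char *buf, size_t size)
{
    size_t n = 0;

    if (size == 0)
        return 0;               /* no room even for the terminator */

    while (typed[n] != '\0' && typed[n] != '\n' && n + 1 < size) {
        buf[n] = typed[n];
        n++;
    }
    buf[n] = '\0';              /* the newline itself is never stored */
    return n;
}
```

With this sketch, store_like_gets("some string\n", buf, sizeof buf) leaves some string in buf, while store_like_gets("\n", buf, sizeof buf) stores nothing but the terminator, so buf[0] is '\0'.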
Therefore, if you only press Enter, gets interprets it as an empty string. Now, as noted by some people, your code has multiple bugs.
printf("This is the input as a string: %s\n", input);
No problem here, though you might want to delimit your string by some artificial characters for better debugging:
printf("This is the input as a string: '%s'\n", input);
printf("Is it the string end character? %d\n", input == '\0');
Not good: you want to check 1 byte here, not the whole buffer. In this expression input decays to a pointer to the first element of the buffer, and the constant '\0' has the value 0, so the compiler treats it as a null pointer (NULL) and interprets the comparison as "does the buffer exist at all?". The buffer's address is never null, so the answer is always false.
The right way is:
printf("Does the first byte contain the string end character? %d\n", input[0] == '\0');
This compares just 1 byte to \0.
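To make the distinction concrete, here is a minimal sketch of the two questions as separate functions. Both function names are made up for this illustration. The first shows what the original comparison effectively asks, the second what was intended.

```c
#include <stdbool.h>
#include <stddef.h>

/* What `input == '\0'` effectively asks: is the pointer itself null? */
bool pointer_is_null(const char *s)
{
    return s == NULL;
}

/* What was intended: is the first byte the string terminator? */
bool first_byte_is_terminator(const char *s)
{
    return s[0] == '\0';
}
```

For a buffer declared as char input[100], pointer_is_null(input) is false no matter what the buffer holds, whereas first_byte_is_terminator(input) is true exactly when the stored string is empty.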
printf("Is it a newline string? %d\n", input == "\n");
Not good: this compares the address of the buffer with the address of the string literal "\n", so the answer is always false. The right way to compare strings in C is strcmp, declared in <string.h>:
printf("Is it a newline string? %d\n", strcmp(input, "\n") == 0);
Note the peculiar usage: strcmp returns 0 when the strings are equal.
printf("Is it the empty string? %d\n", input == "");
The same bug here. Use strcmp here too:
printf("Is it the empty string? %d\n", strcmp(input, "") == 0);
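The two strcmp checks can be wrapped in small predicates, which hides the "0 means equal" convention behind a readable name. This is only a sketch, and is_newline_string and is_empty_string are names chosen for this example.

```c
#include <stdbool.h>
#include <string.h>

/* True when s is exactly the one-character string "\n". */
bool is_newline_string(const char *s)
{
    return strcmp(s, "\n") == 0;   /* strcmp returns 0 on equality */
}

/* True when s contains no characters before the terminator. */
bool is_empty_string(const char *s)
{
    return strcmp(s, "") == 0;
}
```

For the buffer that gets fills after a lone Enter, is_empty_string(input) is true and is_newline_string(input) is false.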
BTW, as people always say, gets cannot be used in a secure way, because it offers no protection against buffer overflow (it was removed from the language in C11). So you should use fgets instead, even though it's less convenient:
char input[100];
while (fgets(input, sizeof input, stdin))
{
    ...
}
This leads to possible confusion: fgets doesn't delete the newline byte from the input it reads. So if you replace gets in your code with fgets, you will get different results: pressing just Enter stores "\n" rather than an empty string. Fortunately, your code, once corrected, will illustrate the difference in a clear way.
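If the gets behaviour is what you want, one common approach is to strip the newline after fgets returns. The sketch below uses strcspn for that. strip_newline and read_line are hypothetical helper names, not standard functions.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: cut the string at its first newline, if any.
 * strcspn returns the index of the first '\n', or the string length
 * when there is none, so the assignment is harmless in that case. */
void strip_newline(char *s)
{
    s[strcspn(s, "\n")] = '\0';
}

/* Hypothetical helper: read one line with fgets, then drop the newline.
 * Returns buf, or NULL on end of file or error, just like fgets. */
char *read_line(char *buf, int size, FILE *in)
{
    if (fgets(buf, size, in) == NULL)
        return NULL;
    strip_newline(buf);
    return buf;
}
```

A loop such as while (read_line(input, sizeof input, stdin)) then sees an empty string when only Enter is pressed, as it did with gets, but cannot overflow the buffer.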