How to read / parse input in C? The FAQ
Problem description
I have problems with my C program when I try to read / parse input. Help?
This is a FAQ entry.
StackOverflow has many questions related to reading input in C, with answers usually focussed on the specific problem of that particular user without really painting the whole picture.
This is an attempt to cover the subject comprehensively, with special attention to common mistakes, so most of those questions can be answered simply by marking them as duplicates of this one:
- Why does the last line print twice?
- Why does my scanf("%d", ...) / scanf("%c", ...) fail?
- Why does gets() crash?
- ...
The answer is marked as community wiki. Feel free to improve and (cautiously) extend.
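Recommended answer
- Text mode vs. binary mode
- Check fopen() for failure
- Pitfalls
- Check any functions you call for success
- EOF, or "why does the last line print twice"
- Do not use gets(), ever
- Do not use *scanf() for potentially malformed input
- When *scanf() does not work as expected
- Read (part of) a line of input via fgets()
- Parse the line in-memory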
Text mode vs. binary mode
A "binary mode" stream is read in exactly as it has been written. However, there might (or might not) be an implementation-defined number of null characters ('\0') appended at the end of the stream.
A "text mode" stream may do a number of transformations, including (but not limited to):
- removal of spaces immediately before a line-end;
- changing newlines ('\n') to something else on output (e.g. "\r\n" on Windows) and back to '\n' on input;
- adding, altering, or deleting characters that are neither printing characters (isprint( c ) returns nonzero), horizontal tabs, nor new-lines.
It should be obvious that text and binary mode do not mix. Open text files in text mode, and binary files in binary mode.
Check fopen() for failure
The attempt to open a file may fail for various reasons -- lack of permissions, or file not found being the most common ones. In this case, fopen() will return a NULL pointer.
It may also set the global errno variable, the value of which can be turned into a plain-text error message using perror(); this is a requirement of POSIX, not the C language, so it may not work on every platform.

    #include <stdio.h>
    #include <errno.h>

    int main( void )
    {
        errno = 0;
        // "file.txt" is a text file, so it is opened in text mode.
        FILE * fp = fopen( "file.txt", "r" );

        if ( fp != NULL )
        {
            // ready to read
            // Close the stream only if it was actually opened;
            // fclose( NULL ) is undefined behaviour.
            fclose( fp );
        }
        else
        {
            // If supported by fopen(), will print a message
            // *why* fopen() failed, exactly.
            perror( "fopen() failed" );
        }
    }
Pitfalls
Check any functions you call for success
This should be obvious. But do check the documentation of any function you call for their return value and error handling, and check for those conditions.
These are errors that are easy to fix when you catch the condition early, but lead to lots of head-scratching if you do not.
EOF, or "why does the last line print twice"
The function feof( FILE * stream ) returns a nonzero value (true) if EOF has been reached. A misunderstanding of what "reaching" EOF actually means makes many beginners write something like this:

    // BROKEN CODE
    while ( ! feof( fp ) )
    {
        fgets( buffer, BUFFER_SIZE, fp );
        puts( buffer );
    }
This makes the last line of the input print twice, because when the last line is read (up to the final newline, the last character in the input stream), EOF is not set.
EOF only gets set when you attempt to read past the last character!
So the code above loops once more, fgets() fails to read another line, sets EOF and leaves the contents of buffer untouched, which then gets printed again.
So, check for EOF after the read, but before processing. The return value of the read function tells you whether the read succeeded, so test that instead of calling feof() up front:

    // GOOD CODE
    while ( fgets( buffer, BUFFER_SIZE, fp ) != NULL )
    {
        puts( buffer );
    }
Do not use gets(), ever
There is no way to use this function safely. Because of this, it has been removed from the language with the advent of C11.
Do not use *scanf() for potentially malformed input
Many tutorials teach you to use *scanf() for reading any kind of input, because it is so versatile.
But the purpose of *scanf() is really to read bulk data that can be somewhat relied upon to be in a predefined format (such as data written by another program).
Even then, *scanf() can trip up the unobservant:
- Using a format string that in some way can be influenced by the user is a gaping security hole.
- If the input does not match the expected format, *scanf() immediately stops parsing, leaving any remaining arguments untouched (that is, uninitialized unless you initialized them yourself).
- It will tell you how many assignments it has successfully done, but not where exactly it stopped parsing the input, making graceful error recovery difficult.
- It skips any leading whitespace in the input, except when it does not (the [, c, and n conversions). (See next paragraph.)
- It has somewhat peculiar behaviour in some corner cases.
When *scanf() does not work as expected
A frequent problem with *scanf() is when there is unread whitespace (' ', '\n', ...) in the input stream that the user did not account for.
Reading a number ("%d" et al.), or a string ("%s"), stops at any whitespace. And while most *scanf() conversion specifiers skip leading whitespace in the input, [, c and n do not. So the newline is still the first pending input character: %c reads that newline instead of the character you expected, and %[ fails to match unless its set happens to include the newline.
You can skip over the newline in the input by explicitly reading it, e.g. via fgetc(), or by adding a whitespace character to your *scanf() format string. (A single whitespace character in the format string matches any amount of whitespace in the input, including none.)
However, do not call fflush() on your input stream if you intend to remain portable. Some platforms define this as an extension (POSIX does so for seekable input streams), but in plain C, calling fflush() on an input stream is undefined behaviour.
We just advised against using *scanf() except when you really, positively, know what you are doing. So, what to use as a replacement?
Instead of reading and parsing the input in one go, as *scanf() attempts to do, separate the steps.
Read (part of) a line of input via fgets()
fgets() has a parameter for limiting its input to at most that many bytes (including the terminating null character), avoiding overflow of your buffer. If the input line did fit into your buffer completely, the last character in your buffer will be the newline ('\n'). If it is not, you are looking at a partially-read line (or at a final line that has no newline at all).
Parse the line in-memory
Especially useful for in-memory parsing are the strtol() and strtod() function families, which provide similar functionality to the *scanf() conversion specifiers d, i, u, o, x, a, e, f, and g.
But they also tell you exactly where they stopped parsing, and have meaningful handling of numbers too large for the target type.
Beyond those, C offers a wide range of string processing functions. Since you have the input in memory, and always know exactly how far you have parsed it already, you can walk back as many times as you like trying to make sense of the input.
And if all else fails, you have the whole line available to print a helpful error message for the user.
When you are done with the stream, close it again:

    fclose( fp );