Difference between %ms and %s in scanf

Problem description

Reading the scanf manual, I came across this line:

An optional 'm' character. This is used with string conversions (%s, %c, %[),

Can someone explain it with a simple example, showing the difference and why such an option is needed in some cases?

Recommended answer

The C Standard does not define such an optional character in the scanf() formats.

The GNU C library (glibc) does define an optional a indicator this way (from the man page for scanf):

An optional a character. This is used with string conversions, and relieves the caller of the need to allocate a corresponding buffer to hold the input: instead, scanf() allocates a buffer of sufficient size, and assigns the address of this buffer to the corresponding pointer argument, which should be a pointer to a char * variable (this variable does not need to be initialized before the call).

The caller should subsequently free this buffer when it is no longer required. This is a GNU extension; C99 employs the a character as a conversion specifier (and it can also be used as such in the GNU implementation).

The NOTES section of the man page says:

The a modifier is not available if the program is compiled with gcc -std=c99 or gcc -D_ISOC99_SOURCE (unless _GNU_SOURCE is also specified), in which case the a is interpreted as a specifier for floating-point numbers (see above).

Since version 2.7, glibc also provides the m modifier for the same purpose as the a modifier. The m modifier has the following advantages:

  • It may also be applied to %c conversion specifiers (e.g., %3mc).

  • It avoids ambiguity with respect to the %a floating-point conversion specifier (and is unaffected by gcc -std=c99 etc.).

  • It is specified in the upcoming revision of the POSIX.1 standard.

The online Linux manual page at http://linux.die.net/man/3/scanf documents only this option, as follows:

An optional 'm' character. This is used with string conversions (%s, %c, %[), and relieves the caller of the need to allocate a corresponding buffer to hold the input: instead, scanf() allocates a buffer of sufficient size, and assigns the address of this buffer to the corresponding pointer argument, which should be a pointer to a char * variable (this variable does not need to be initialized before the call). The caller should subsequently free(3) this buffer when it is no longer required.

The POSIX standard documents this extension in its POSIX.1-2008 edition (see http://pubs.opengroup.org/onlinepubs/9699919799/functions/fscanf.html ):

The %c, %s, and %[ conversion specifiers shall accept an optional assignment-allocation character m, which shall cause a memory buffer to be allocated to hold the string converted including a terminating null character. In such a case, the argument corresponding to the conversion specifier should be a reference to a pointer variable that will receive a pointer to the allocated buffer. The system shall allocate a buffer as if malloc() had been called. The application shall be responsible for freeing the memory after usage. If there is insufficient memory to allocate a buffer, the function shall set errno to [ENOMEM] and a conversion error shall result. If the function returns EOF, any memory successfully allocated for parameters using assignment-allocation character m by this call shall be freed before the function returns.

Using this extension, you could write:

char *p;
scanf("%ms", &p);

This causes scanf to parse a word from standard input and allocate enough memory to store its characters plus a terminating '\0'. A pointer to the allocated array is stored into p, and scanf() returns 1 unless no non-whitespace characters can be read from stdin.

It is entirely possible that other systems use m with similar semantics or for something else entirely. Non-standard extensions are not portable; they should be used very carefully, and documented as such, in circumstances where a standard approach is cumbersome, impractical, or altogether impossible.

Note that parsing a word of arbitrary size is indeed impossible with the standard version of scanf():

You can parse a word with a maximum size, and you should specify the maximum number of characters to store before the '\0':

char buffer[20];
scanf("%19s", buffer);

But this does not tell you how many more characters are available to parse in standard input. In any case, not passing the maximum number of characters may invoke undefined behavior if the input is long enough, and specially crafted input may even be used by an attacker to compromise your program:

char buffer[20];
scanf("%s", buffer); // potential undefined behavior,
                     // that could be exploited by an attacker.
