Properly using sscanf

Question

I am supposed to get an input line that can be in any of the following formats, subject to these rules:

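  • There must be a space between word 1 and word 2.
  • There must be a comma between word 2 and word 3.
  • Spaces are not required between word 2 and word 3, but any number of spaces is allowed.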

How can I separate the 1-, 2- and 3-word cases and put the data into the correct variables?

word1
word1 word2 
word1 word2 , word3
word1 word2,word3

I thought about something like:

sscanf("string", "%s %s,%s", word1, word2, word3);

But it doesn't seem to work.

I use strict C89.

Recommended answer

int n = sscanf("string", "%s %[^, ]%*[, ]%s", word1, word2, word3);

The return value in n tells you how many assignments were made successfully. The %[^, ] is a negated character-class match that finds a word not including either commas or blanks (add tabs if you like). The %*[, ] is a match that finds one or more commas or spaces but suppresses the assignment.

I'm not sure I'd use this in practice, but it should work. It is, however, untested.

Perhaps a stricter specification is:

int n = sscanf("string", "%s %[^, ]%*[,]%s", word1, word2, word3);

The difference is that the non-assigning character class only accepts a comma. The %[^, ] conversion stops at the first space or comma (or at EOS, end of string) after word2, and %s skips leading white space before assigning to word3. The previous version allowed a space between the second and third words in lieu of a comma, which the question does not strictly allow.

As pmg suggests in a comment, the assigning conversion specifications should be given a length to prevent buffer overflow. Note that the length does not include the null terminator, so the value in the format string must be one less than the size of the array in bytes. Also note that whereas printf() allows you to specify sizes dynamically with *, sscanf() et al. use * to suppress assignment. That means you have to create the format string specifically for the task at hand:

char word1[20], word2[32], word3[64];
int n = sscanf("string", "%19s %31[^, ]%*[,]%63s", word1, word2, word3);

(Kernighan & Pike suggest formatting the format string dynamically in their excellent book 'The Practice of Programming' (1999).)

Just found a problem: given "word1 word2 ,word3", it doesn't read word3. Is there a cure?

Yes, there's a cure, and it is actually trivial, too. Add a space in the format string before the non-assigning, comma-matching conversion specification. Thus:

#include <stdio.h>

static void tester(const char *data)
{
    char word1[20], word2[32], word3[64];
    int n = sscanf(data, "%19s %31[^, ] %*[,]%63s", word1, word2, word3);
    printf("Test data: <<%s>>\n", data);
    printf("n = %d; w1 = <<%s>>, w2 = <<%s>>, w3 = <<%s>>\n", n, word1, word2, word3);
}

int main(void)
{
    const char *data[] =
    {
        "word1 word2 , word3",
        "word1 word2 ,word3",
        "word1 word2, word3",
        "word1 word2,word3",
        "word1 word2       ,       word3",
    };
    enum { DATA_SIZE = sizeof(data)/sizeof(data[0]) };
    size_t i;
    for (i = 0; i < DATA_SIZE; i++)
        tester(data[i]);
    return(0);
}

Sample output:

Test data: <<word1 word2 , word3>>
n = 3; w1 = <<word1>>, w2 = <<word2>>, w3 = <<word3>>
Test data: <<word1 word2 ,word3>>
n = 3; w1 = <<word1>>, w2 = <<word2>>, w3 = <<word3>>
Test data: <<word1 word2, word3>>
n = 3; w1 = <<word1>>, w2 = <<word2>>, w3 = <<word3>>
Test data: <<word1 word2,word3>>
n = 3; w1 = <<word1>>, w2 = <<word2>>, w3 = <<word3>>
Test data: <<word1 word2       ,       word3>>
n = 3; w1 = <<word1>>, w2 = <<word2>>, w3 = <<word3>>

Once the 'non-assigning character class' only accepts a comma, you can abbreviate that to a literal comma in the format string:

int n = sscanf(data, "%19s %31[^, ] , %63s", word1, word2, word3);

Plugging that into the test harness produces the same result as before. Note that all code benefits from review; it can often (essentially always) be improved even after it is working.
