Strange strtok behaviour


Question

char line[255];
char *token = NULL;
char *line2 = NULL;
char *temporaryToken = NULL;

if( scanf(" %[^\n]", line) > 0)
    token = strtok( line, ";" ); //divide the line by ;
    do
    {
        line2 = token;
        temporaryToken = strtok(line2, " ");
        do
        {
            //divide the line2 by spaces into command and args, not the question here]
            temporaryToken = strtok( NULL, " " );
        }while (temporaryToken != NULL );
        token = strtok( NULL, ";" );
    }while(token != NULL);

This is not my code verbatim, by the way, just an example of how it's set out.

In my program, when I print the "token" variable before I split a second time, it'll print out everything until the ; character.

For example, if stdin received "ls -la; mkdir lololol; ls -la", it would print "ls -la". But then, after the second split, printing "token" would only print "ls".

Why is this, and how could I go about fixing it?

Recommended answer

There are two problems with strtok().

  1. It modifies its input string.
  2. Only one set of strtok() calls can be active at a time.

I think your problem is the latter. You also have an indentation problem in the code:

if (scanf(" %[^\n]", line) > 0)
    token = strtok( line, ";" );
do
{
    line2 = token;
    temporaryToken = strtok(line2, " ");
    do
    {
        //divide the line2 by spaces into command and args, not the question here]
        temporaryToken = strtok(NULL, " ");
    } while (temporaryToken != NULL);
    token = strtok( NULL, ";" );
} while(token != NULL);

You probably intended it to read:

if (scanf(" %[^\n]", line) > 0)
{
    token = strtok(line, ";");
    do
    {
        line2 = token;
        temporaryToken = strtok(line2, " ");
        do
        {
            //divide the line2 by spaces into command and args, not the question here]
            temporaryToken = strtok(NULL, " ");
        } while (temporaryToken != NULL);
        token = strtok(NULL, ";");
    } while (token != NULL);
}

Assuming this is what you intended, you still have the problem that there is one strtok() running on line, and then a second one running on line2. The trouble is, the loop on line2 completely wrecks the interpretation of line. You can't use the nested loops with strtok().

If you must use something like strtok(), then look for either POSIX strtok_r() or Microsoft's strtok_s() (but note that the C11 standard Annex K version of strtok_s() is different — see Do you use the TR 24731 'safe' functions?).

if (scanf(" %[^\n]", line) > 0)
{
    char *end1;
    token = strtok_r(line, ";", &end1);
    do
    {
        char *end2;
        line2 = token;
        temporaryToken = strtok_r(line2, " ", &end2);
        do
        {
            //divide the line2 by spaces into command and args, not the question here]
            temporaryToken = strtok_r(NULL, " ", &end2);
        } while (temporaryToken != NULL);
        token = strtok_r(NULL, ";", &end1);
    } while (token != NULL);
}

About the Comments

When you use strtok() or one of its relatives, the input string will be modified, and if you have multiple delimiters, you will not be able to tell which delimiter was present. You can work with a copy of the string, and do comparisons (usually based on offsets from the start of the string).

Within the limits of using strtok_r(), the solution above 'works'. Here's a test program to demonstrate:

#include <stdio.h>
#include <string.h>

int main(void)
{
    char line[1024];

    if (scanf(" %[^\n]", line) > 0)
    {
        char *end1;
        char *token;
        printf("Input: <<%s>>\n", line);
        token = strtok_r(line, ";", &end1);
        do
        {
            char *end2;
            char *line2 = token;
            char *temporaryToken;
            printf("Token1: <<%s>>\n", token);
            temporaryToken = strtok_r(line2, " ", &end2);
            do
            {
                printf("Token2: <<%s>>\n", temporaryToken);
                //divide the line2 by spaces into command and args, not the question here]
                temporaryToken = strtok_r(NULL, " ", &end2);
            } while (temporaryToken != NULL);
            token = strtok_r(NULL, ";", &end1);
        } while (token != NULL);
    }

    return 0;
}

Example input and output:

$ ./strtok-demo
ls -la; mkdir lololol; ls -la
Input: <<ls -la; mkdir lololol; ls -la>>
Token1: <<ls -la>>
Token2: <<ls>>
Token2: <<-la>>
Token1: << mkdir lololol>>
Token2: <<mkdir>>
Token2: <<lololol>>
Token1: << ls -la>>
Token2: <<ls>>
Token2: <<-la>>
$

Alternative using strcspn() and strspn()

If you don't want to demolish the original string, you must use other functions than the strtok() family. The functions strcspn() and strspn() are suitable; they are part of Standard C (C89 and later versions), albeit much less well known than some of the other functions. But they're spot on for this task.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

static char *substrdup(const char *src, size_t len);

int main(void)
{
    char line[1024];

    if (scanf(" %[^\n]", line) > 0)
    {
        char *start1 = line;
        size_t len1;
        printf("Input: <<%s>>\n", line);
        while ((len1 = strcspn(start1, ";")) != 0)
        {
            char *copy = substrdup(start1, len1);
            char *start2 = copy;
            size_t len2;
            printf("Token1: %zu <<%.*s>>\n", len1, (int)len1, start1);
            printf("Copy: <<%s>>\n", copy);
            start2 += strspn(start2, " ");      // Skip leading white space
            while ((len2 = strcspn(start2, " ")) != 0)
            {
                printf("Token2: %zu <<%.*s>>\n", len2, (int)len2, start2);
                start2 += len2;
                start2 += strspn(start2, " ");
            }
            free(copy);
            start1 += len1;
            start1 += strspn(start1, ";");
        }
        printf("Check: <<%s>>\n", line);
    }

    return 0;
}

#include <assert.h>

static char *substrdup(const char *src, size_t len)
{
    char *copy = malloc(len+1);
    assert(copy != 0);              // Appalling error handling strategy
    memmove(copy, src, len);
    copy[len] = '\0';
    return(copy);
}

Example input and output:

$ strcspn-demo
ls -la; mkdir lololol; ls -la
Input: <<ls -la; mkdir lololol; ls -la>>
Token1: 6 <<ls -la>>
Copy: <<ls -la>>
Token2: 2 <<ls>>
Token2: 3 <<-la>>
Token1: 14 << mkdir lololol>>
Copy: << mkdir lololol>>
Token2: 5 <<mkdir>>
Token2: 7 <<lololol>>
Token1: 7 << ls -la>>
Copy: << ls -la>>
Token2: 2 <<ls>>
Token2: 3 <<-la>>
Check: <<ls -la; mkdir lololol; ls -la>>
$

This code goes back to the more comfortable while loop, rather than needing to use do-while loops, which is a benefit.
