String tokenization in C++ including delimiter characters


Question

I have strings of the following forms: a = x + y, abc = xyz + 5, 6 + 5, or f(p).

I need to tokenize the string so that I read each operator and each operand. For a = x + y the tokens returned should be a, =, x, +, y, and for abc=xyz+5 they should be abc, =, xyz, +, 5. Note that there may or may not be spaces between operators and operands.

This is what I have tried:

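#include <cstring>   // strchr
#include <string>
#include <vector>

void tokenize(std::vector<std::string>& tokens, const char* input, const char* delimiters) {
    const char* s = input;  // start of the current operand
    const char* e = s;      // scans forward to the next delimiter
    while (*e != 0) {
        e = s;
        // Advance e to the next delimiter or to the end of the input.
        while (*e != 0 && strchr(delimiters, *e) == 0) {
            ++e;
        }
        // Emit a non-space delimiter as a token. Two problems here: this runs
        // before the operand [s, e) is pushed, so "a=b" yields "=" before "a";
        // and strchr also matches the terminating '\0', so a one-character
        // "\0" token is pushed when e reaches the end of the input.
        if ( *e != ' ' && strchr(delimiters, *e) != 0 ){
            std::string op = "";
            op += *e;
            tokens.push_back(op);
        }
        // Emit the operand between s and e, if it is not empty.
        if (e - s > 0) {
            tokens.push_back(std::string(s,e - s));
        }
        s = e + 1;
    }
}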


Recommended answer

You can use this implementation. The first argument is the std::string to tokenize and the second is the set of delimiter characters to split on. It returns the tokens as a vector of strings. It is simple and makes a single pass over the input. Note that it discards the delimiters themselves, so operators listed as delimiters do not appear in the result.

#include <string>
#include <vector>

std::vector<std::string> tokenizeString(const std::string& str, const std::string& delimiters)
{
    std::vector<std::string> tokens;
    // Skip delimiters at the beginning.
    std::string::size_type lastPos = str.find_first_not_of(delimiters, 0);
    // Find the first delimiter after the start of the token.
    std::string::size_type pos = str.find_first_of(delimiters, lastPos);

    while (std::string::npos != pos || std::string::npos != lastPos)
    {
        // Found a token, add it to the vector.
        tokens.push_back(str.substr(lastPos, pos - lastPos));
        // Skip delimiters. Note the "not_of".
        lastPos = str.find_first_not_of(delimiters, pos);
        // Find the next delimiter, which ends the next token.
        pos = str.find_first_of(delimiters, lastPos);
    }
    return tokens;
}
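
Because the function above skips every delimiter, calling it with " =+" on abc=xyz+5 would return only abc, xyz, 5, without the operators the question asks for. The following is a minimal sketch of one way to keep them, not part of the original answer: the name tokenizeKeepDelimiters and the dropped parameter are hypothetical, and it assumes every operator is a single character.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not from the answer above): splits `str` on any
// character in `delimiters` and also emits each delimiter as its own
// one-character token, unless that delimiter appears in `dropped`
// (space and tab by default).
std::vector<std::string> tokenizeKeepDelimiters(const std::string& str,
                                                const std::string& delimiters,
                                                const std::string& dropped = " \t")
{
    std::vector<std::string> tokens;
    std::string::size_type start = 0;
    while (start < str.size()) {
        // Find the next delimiter at or after `start`.
        std::string::size_type pos = str.find_first_of(delimiters, start);
        if (pos == std::string::npos) {
            // No more delimiters: the rest of the string is the last operand.
            tokens.push_back(str.substr(start));
            break;
        }
        // Emit the operand before the delimiter first, if there is one,
        // so tokens come out in source order.
        if (pos > start) {
            tokens.push_back(str.substr(start, pos - start));
        }
        // Emit the delimiter itself unless it is one we want to drop.
        if (dropped.find(str[pos]) == std::string::npos) {
            tokens.push_back(std::string(1, str[pos]));
        }
        start = pos + 1;
    }
    return tokens;
}
```

Under these assumptions, tokenizeKeepDelimiters("abc=xyz + 5", " =+") yields abc, =, xyz, +, 5, and adding the parentheses to the delimiter set, as in " =+()", splits f(p) into f, (, p, ). Multi-character operators such as == would need extra handling that this sketch does not attempt.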
