Tokenizing a string, accepting everything between a given set of characters in C++

Views: 82 Published: 2020/9/22 4:57:30 Tags: c++ regex boost


Question

I have the following code:

#include <iostream>
#include <regex>
#include <string>

using namespace std;

int main()
{
    string s = "server ('m1.labs.teradata.com') username ('use\\')r_*5') password('u\" er 5') dbname ('default')";

    regex re("(\'[!-~]+\')");
    sregex_token_iterator i(s.begin(), s.end(), re, 1);
    sregex_token_iterator j;

    unsigned count = 0;
    while (i != j)
    {
        // Print the current match, then advance the iterator.
        cout << "the token is  " << *i++ << endl;
        count++;
    }
    cout << "There were " << count << " tokens found." << endl;

    return 0;
}

Using the above regex, I want to extract the strings between the parentheses and single quotes. The output should look like:

the token is   'm1.labs.teradata.com'
the token is   'use\')r_*5'
the token is   'u" er 5'
the token is   'default'
There were 4 tokens found.

Basically, the regex is supposed to extract everything between (' and '). The content can be anything: a space, a special character, a quote, or a closing parenthesis. I previously used the following regex:

boost::regex re_arg_values("(\'[!-~]+\')");

But it was not accepting spaces. Can someone please help me out with this? Thanks in advance.

Recommended answer

Here's a sample that uses Spirit X3 to create a grammar that actually parses this. I'd rather parse into a map of (key -> value) pairs, which makes a lot more sense than blindly assuming the names are always the same:

using Config = std::map<std::string, std::string>;
using Entry  = std::pair<std::string, std::string>;

Now, we set up some grammar rules using X3:

namespace parser {
    using namespace boost::spirit::x3;

    auto value  = quoted("'") | quoted('"');
    auto key    = lexeme[+alpha];
    auto pair   = key >> '(' >> value >> ')';
    auto config = skip(space) [ *as<Entry>(pair) ];
}

The helpers as<> and quoted are simple lambdas:

template <typename T> auto as = [](auto p) { return rule<struct _, T> {} = p; };
auto quoted = [](auto q) { return lexeme[q >> *('\\' >> char_ | char_ - q) >> q]; };

Now we can parse the string into a map directly:

Config parse_config(std::string const& cfg) {
    Config parsed;
    auto f = cfg.begin(), l = cfg.end();
    if (!parse(f, l, parser::config, parsed))
        throw std::invalid_argument("Parse failed at " + std::string(f,l));
    return parsed;
}

And the demo program:

int main() {
    Config cfg = parse_config("server ('m1.labs.teradata.com') username ('use\\')r_*5') password('u\" er 5') dbname ('default')");

    for (auto& setting : cfg)
        std::cout << "Key " << setting.first << " has value " << setting.second << "\n";
}

Prints:

Key dbname has value default
Key password has value u" er 5
Key server has value m1.labs.teradata.com
Key username has value use')r_*5

Live demo

Live On Coliru

#include <iostream>
#include <boost/spirit/home/x3.hpp>
#include <boost/fusion/adapted/std_pair.hpp>
#include <map>

using Config = std::map<std::string, std::string>;
using Entry  = std::pair<std::string, std::string>;

namespace parser {
    using namespace boost::spirit::x3;

    template <typename T> auto as = [](auto p) { return rule<struct _, T> {} = p; };
    auto quoted = [](auto q) { return lexeme[q >> *(('\\' >> char_) | (char_ - q)) >> q]; };

    auto value  = quoted("'") | quoted('"');
    auto key    = lexeme[+alpha];
    auto pair   = key >> '(' >> value >> ')';
    auto config = skip(space) [ *as<Entry>(pair) ];
}

Config parse_config(std::string const& cfg) {
    Config parsed;
    auto f = cfg.begin(), l = cfg.end();
    if (!parse(f, l, parser::config, parsed))
        throw std::invalid_argument("Parse failed at " + std::string(f,l));
    return parsed;
}

int main() {
    Config cfg = parse_config("server ('m1.labs.teradata.com') username ('use\\')r_*5') password('u\" er 5') dbname ('default')");

    for (auto& setting : cfg)
        std::cout << "Key " << setting.first << " has value " << setting.second << "\n";
}

Bonus

If you want to learn how to extract the raw input, just try:

auto source = skip(space) [ *raw [ pair ] ];

Use it like this:

using RawSettings = std::vector<std::string>;

RawSettings parse_raw_config(std::string const& cfg) {
    RawSettings parsed;
    auto f = cfg.begin(), l = cfg.end();
    if (!parse(f, l, parser::source, parsed))
        throw std::invalid_argument("Parse failed at " + std::string(f,l));
    return parsed;
}

int main() {
    // `text` is the input string; its definition is not shown in this snippet.
    for (auto& setting : parse_raw_config(text))
        std::cout << "Raw: " << setting << "\n";
}

Which prints (Live On Coliru):

Raw: server ('m1.labs.teradata.com')
Raw: username ('use\')r_*5')
Raw: password('u" er 5')
Raw: dbname ('default')
