C++::Boost::Regex Iterate over the submatches

Question

I am using Named Capture Groups with Boost Regex / Xpressive.

I would like to iterate over all submatches, and get both the value and KEY of each submatch (i.e. what["type"]).

sregex pattern = sregex::compile(  "(?P<type>href|src)=\"(?P<url>[^\"]+)\""    );

sregex_iterator cur( web_buffer.begin(), web_buffer.end(), pattern );
sregex_iterator end;

for( ; cur != end; ++cur ){
    smatch const &what = *cur;

    //I know how to access using a string key: what["type"]
    std::cout << what[0] << " [" << what["type"] << "] [" << what["url"] <<"]"<< std::endl;

    /*I know how to iterate, using an integer key, but I would
      like to also get the original KEY into a variable, i.e.
      in case of what[1], get both the value AND "type"
    */
    for(std::size_t i=0; i<what.size(); ++i){
        std::cout << "{} = [" << what[i] << "]" << std::endl;
    }

    std::cout << std::endl;
}

Solution

With Boost 1.54.0 this is even more difficult because the capture names are not even stored in the results. Instead, Boost just hashes the capture names and stores the hash (an int) and the associated pointers to the original string.

I've written a small class derived from boost::smatch that saves capture names and provides an iterator for them.

#include <boost/regex.hpp>
#include <string>
#include <vector>

// A boost::smatch that also remembers the capture names found in the pattern.
class namesaving_smatch : public boost::smatch
{
public:
    explicit namesaving_smatch(const boost::regex& pattern)
    {
        // Scan the pattern source for "?P<name>" or "?<name>" and record each name.
        std::string pattern_str = pattern.str();
        boost::regex capture_pattern("\\?P?<(\\w+)>");
        boost::sregex_iterator words_begin(pattern_str.begin(), pattern_str.end(), capture_pattern);
        boost::sregex_iterator words_end;

        for (boost::sregex_iterator i = words_begin; i != words_end; ++i)
        {
            m_names.push_back((*i)[1].str());
        }
    }

    // Iterate over the saved names in the order they appear in the pattern.
    std::vector<std::string>::const_iterator names_begin() const
    {
        return m_names.begin();
    }

    std::vector<std::string>::const_iterator names_end() const
    {
        return m_names.end();
    }

private:
    std::vector<std::string> m_names;
};

The class accepts the regular expression containing the named capture groups in its constructor. Use the class like so:

namesaving_smatch results(re);
if (regex_search(input, results, re))
    for (auto it = results.names_begin(); it != results.names_end(); ++it)
        std::cout << *it << ": " << results[*it].str() << '\n';
