Boost regex not working as expected in my code


Question

I started using Boost.Regex (boost::regex) today and am also quite new to regular expressions. I have been testing my regex with "The Regulator" and Expresso and am satisfied with what I see there, but the same regex does not do what I want when I pass it to Boost. Any pointers toward a solution would be most welcome. As a side question, are there any tools that would help me test my regex against Boost.Regex?

using namespace boost;
using namespace std;

vector<string> tokenizer::to_vector_int(const string s)
{
    regex re("\\d*");
    vector<string> vs;
    cmatch matches;
    if( regex_match(s.c_str(), matches, re) ) {
    	MessageBox(NULL, L"Hmmm", L"", MB_OK); // it never gets here
    	for( unsigned int i = 1 ; i < matches.size() ; ++i ) {
    		string match(matches[i].first, matches[i].second);
    		vs.push_back(match);
    	}
    }
    return vs;
}

void _uttokenizer::test_to_vector_int() 
{
    vector<string> __vi = tokenizer::to_vector_int("0<br/>1");
    for( int i = 0 ; i < __vi.size() ; ++i ) INFO(__vi[i]);
    CPPUNIT_ASSERT_EQUAL(2, (int)__vi.size());//always fails
}

Update (thanks to Dav for helping me clarify my question): I was hoping to get a vector with two strings in it, "0" and "1". Instead, regex_match() never succeeds (it always returns false), so the vector is always empty.

Thanks, '1800 INFORMATION', for your suggestions. The to_vector_int() method now looks like this, but it goes into a never-ending loop (I took the code you gave and modified it to make it compile) and finds "0", "", "", "" and so on. It never finds the "1".

vector<string> tokenizer::to_vector_int(const string s)
{
    regex re("(\\d*)");
    vector<string> vs;

    cmatch matches;

    char * loc = const_cast<char *>(s.c_str());
    while( regex_search(loc, matches, re) ) {
    	vs.push_back(string(matches[0].first, matches[0].second));
    	loc = const_cast<char *>(matches.suffix().str().c_str());
    }

    return vs;
}

In all honesty, I don't think I have understood the basics of searching for a pattern and retrieving the matches yet. Are there any tutorials with examples that explain this?

Recommended answer

The basic problem is that you are using regex_match when you should be using regex_search:

  The algorithms regex_search and regex_match make use of match_results to report what matched; the difference between these algorithms is that regex_match will only find matches that consume all of the input text, where as regex_search will search for a match anywhere within the text being matched.

From the Boost documentation (http://www.boost.org/doc/libs/1_39_0/libs/regex/doc/html/boost_regex/introduction_and_overview.html). Change your code to use regex_search and it will work.
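To make the difference concrete, here is a minimal sketch. It assumes std::regex (C++11) as a stand-in for boost::regex, since the two expose the same regex_match and regex_search calls for this purpose; the helper names are hypothetical. It also uses \d+ rather than the question's \d*.

```cpp
#include <regex>
#include <string>

// True only if the whole of `text` is a single run of digits.
bool whole_text_is_digits(const std::string& text) {
    static const std::regex re("\\d+");
    return std::regex_match(text, re);   // must consume all of `text`
}

// True if a run of digits occurs anywhere inside `text`.
bool text_contains_digits(const std::string& text) {
    static const std::regex re("\\d+");
    return std::regex_search(text, re);  // may match any part of `text`
}
```

For the input "0<br/>1" from the question, the first helper returns false because "<br/>" is not made of digits, while the second returns true.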

Also, it looks like you are not capturing the matches. Try changing the regex to this:

regex re("(\\d*)");

Or perhaps you need to call regex_search repeatedly:

const char *where = s.c_str();
while (regex_search(where, matches, re))
{
  // continue searching just past the match that was found
  where = matches.suffix().first;
}

This is because you have only one capture group in your regex.
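The repeated-search idea could look roughly like the sketch below. It again assumes std::regex in place of boost::regex, and the function name digit_runs is hypothetical. It uses \d+ so that every match consumes at least one character and the loop terminates. It advances with suffix().first, which points into the original buffer, whereas a pointer taken from suffix().str().c_str() would refer to a temporary string.

```cpp
#include <regex>
#include <string>
#include <vector>

// Collects every run of digits in `s`, in order of appearance.
std::vector<std::string> digit_runs(const std::string& s) {
    static const std::regex re("(\\d+)");  // + rather than *, so no empty matches
    std::vector<std::string> out;
    std::cmatch matches;
    const char* where = s.c_str();
    while (std::regex_search(where, matches, re)) {
        out.push_back(matches[1].str());   // text of the first capture group
        where = matches.suffix().first;    // resume just past this match
    }
    return out;
}
```

Called with "0<br/>1", this returns a vector holding "0" and "1", which is what the unit test in the question expects.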

Alternatively, change your regex, if you know the basic structure of the data:

regex re("(\\d+).*?(\\d+)");

This would match two numbers within the search string.
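As a hedged illustration of the two-capture pattern, the sketch below reads both groups out of a single regex_search call. It assumes std::regex instead of boost::regex, and first_two_numbers is a hypothetical helper name.

```cpp
#include <regex>
#include <string>
#include <utility>

// Finds two digit runs in `s` with a single search.
// Returns false if the pattern cannot match.
bool first_two_numbers(const std::string& s,
                       std::pair<std::string, std::string>& out) {
    static const std::regex re("(\\d+).*?(\\d+)");
    std::smatch m;
    if (!std::regex_search(s, m, re)) return false;
    out.first = m[1].str();   // first capture group
    out.second = m[2].str();  // second capture group
    return true;
}
```

For "0<br/>1" this yields "0" and "1". One caveat: because the engine backtracks, an input containing only one multi-digit number, such as "12", still matches, split as "1" and "2". The pattern therefore suits data that is known to contain two separated numbers.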

Note that the regular expression \d* matches zero or more digits. This includes the empty string "", since that is exactly zero digits. I would change the expression to \d+, which matches one or more.
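The empty-match behaviour is easy to observe directly. The sketch below assumes std::regex rather than boost::regex, and first_match is a hypothetical helper. It returns the first match of a pattern and reports whether any match, even an empty one, was found.

```cpp
#include <regex>
#include <string>

// Returns the text of the first match of `pattern` in `text`.
// `found` is set to whether any match (even an empty one) exists.
std::string first_match(const std::string& pattern,
                        const std::string& text,
                        bool& found) {
    const std::regex re(pattern);
    std::smatch m;
    found = std::regex_search(text, m, re);
    return found ? m[0].str() : std::string();
}
```

On the text "<br/>1", the pattern \d* succeeds immediately with an empty match at the "<", while \d+ skips ahead and returns "1". An empty match does not advance the search position, which is why the loop in the question kept producing "" after the first "0".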
