C++ regex: Which group matched?
Question
I have a regex containing various sub-groups which are connected through an or condition:
([[:alpha:]]+)|([[:digit:]]+)
When I match the string "1 a 2", I get three matches: 1, a and 2.
Is there a way in C++ to determine which of the sub-patterns matched?
Recommended answer
Not directly.
With the std::regex library, the std::match_results class holds the sub-matches. Its member function std::match_results::size tells you how many sub-matches there are.
For example:
std::string str( "one two three four five" );
std::regex rx( "(\\w+) (\\w+) (\\w+) (\\w+) (\\w+)" ); // five groups, one per word
std::match_results< std::string::const_iterator > mr;
std::regex_search( str, mr, rx );
std::cout << mr.size() << '\n'; // 6
Here the output is 6, not 5, because the whole match is counted as well (it is sub-match 0). You can access each sub-match with the .str( number ) member function or with operator[].
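Sub-matches are numbered from left to right, so each group has a fixed index. Note that size() is the same for every successful match of a given regex (the number of groups plus one), so it cannot by itself tell you which alternative matched. To find that out, check the matched member of each sub-match, which is false for a group that took no part in the match.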
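For the alternation in the question, one way to see which group took part in each match is to inspect the matched flag of every sub-match. The following is a minimal sketch, not the only approach; the helper name matched_groups is hypothetical and not part of the standard library.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not part of std::regex): for every match in `text`,
// record the index of the first capture group that participated and its text.
std::vector<std::pair<std::size_t, std::string>>
matched_groups(const std::string& text, const std::regex& rx) {
    std::vector<std::pair<std::size_t, std::string>> result;
    for (std::sregex_iterator it(text.begin(), text.end(), rx), end; it != end; ++it) {
        const std::smatch& m = *it;
        // Index 0 is the whole match, so the capture groups start at 1.
        for (std::size_t g = 1; g < m.size(); ++g) {
            if (m[g].matched) {  // false for a group in the alternative not taken
                result.emplace_back(g, m[g].str());
                break;
            }
        }
    }
    return result;
}
```

Called with the string "1 a 2" and the regex ([[:alpha:]]+)|([[:digit:]]+), this reports group 2 for the two digits and group 1 for the letter.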
If you change the rx to "(\\w+)(\\d+)(\\w+)" then size() is 0, because the search fails: the string contains no digits.
If you change the rx to "(\\w+).+" then size() is 2. That means you have the whole successful match and one sub-match.
For example:
std::string str( "one two three four five" );
std::regex rx( "(\\w+).+" );
std::match_results< std::string::const_iterator > mr;
std::regex_search( str, mr, rx );
std::cout << mr.str( 1 ) << '\n'; // one
std::cout << mr[ 1 ] << '\n'; // one
Both lines print: one
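Since size() is 0 after a failed search, it can help to test the return value of std::regex_search before reading any sub-match. The sketch below wraps that check; submatch_count is a hypothetical helper name used only for illustration.

```cpp
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical helper: the value of size() after searching `text`,
// or 0 when the search fails.
std::size_t submatch_count(const std::string& text, const std::string& pattern) {
    const std::regex rx(pattern);
    std::smatch mr;
    if (!std::regex_search(text, mr, rx)) {
        return 0;  // no match: mr.empty() is true and mr.size() is 0
    }
    return mr.size();  // the whole match plus one entry per capture group
}
```

With the string "one two three four five", the pattern "(\\w+)(\\d+)(\\w+)" gives 0 and "(\\w+).+" gives 2, in line with the sizes described above.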
If you want to print only the sub-matches, you can use a simple loop whose index starts from 1, not 0.
For example:
std::string str( "one two three four five" );
std::regex rx( "(\\w+) \\w+ (\\w+) \\w+ (\\w+)" );
std::match_results< std::string::const_iterator > mr;
std::regex_search( str, mr, rx );
for( std::size_t index = 1; index < mr.size(); ++index ){
    std::cout << mr[ index ] << '\n';
}
The output is:
one
three
five
If by "determine which of the sub-patterns matched" you mean specifying which sub-match the search should return, then the answer is yes. With std::regex_token_iterator you can select it:
Ex: (iterate over the second sub-match of each match)
std::string str( "How are you today ? I am fine . How about you ?" );
std::regex rx( "(\\w+) (\\w+) ?" );
std::match_results< std::string::const_iterator > mr;
std::regex_token_iterator< std::string::const_iterator > first( str.begin(), str.end(), rx, 2 ), last;
while( first != last ){
    std::cout << first->str() << '\n';
    ++first;
}
The last argument is 2: ( str.begin(), str.end(), rx, 2 ). It means you want only the second sub-match of each match. So the output is:
are
today
am
about
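std::regex_token_iterator also has a constructor that takes a list of sub-match indices, so several groups can be requested at once. The sketch below assumes that use case; collect_submatches is a hypothetical helper name, not a standard function.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: collect the requested sub-matches of every match,
// in the order given by `groups`.
std::vector<std::string> collect_submatches(const std::string& text,
                                            const std::regex& rx,
                                            const std::vector<int>& groups) {
    std::vector<std::string> out;
    std::sregex_token_iterator first(text.begin(), text.end(), rx, groups), last;
    for (; first != last; ++first) {
        out.push_back(first->str());  // empty string if the group did not match
    }
    return out;
}
```

Passing { 2 } for the string and regex above yields the same four words as the loop in the answer; passing { 1, 2 } yields the first and second sub-match of each match in turn.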