How to get only given captured group <regex> c++

Views: 61. Published: 2020/9/27 22:23:25. Tags: c++, regex, c++11

Question

I want to extract a tag's inner content from the following string:

<tag1 val=123>Hello</tag1>

I only want to get

Hello

Here is what I do:

#include <regex>
#include <string>
using namespace std;

string s = "<tag1 val=123>Hello</tag1>";
regex re("<tag1.*>(.*)</tag1>");
smatch matches;
bool b = regex_match(s, matches, re);  // true when the whole string matches

But it returns two matches:

<tag1 val=123>Hello</tag1>
Hello

And when I try to get only the first captured group like this:

"<tag1.*>(.*)</tag1>\1"

I get zero matches.

Please advise.

Recommended answer

regex_match returns only a single match, together with all of its capturing group submatches (how many there are depends on the number of groups in the pattern).

Here, you get exactly 1 match that contains two submatches: 1) the whole match, 2) the capture group 1 value.

To obtain the contents of the capturing group, access the second element of the smatch object: matches[1].str() or matches.str(1).

Note that when you write "<tag1.*>(.*)</tag1>\1", the \1 is not parsed as a backreference: inside an ordinary C++ string literal it is an octal escape, so the regex receives a char with code 1. Even if you did define a backreference (as "<tag1.*>(.*)</tag1>\\1"), the whole text captured by group 1 would have to be repeated after </tag1>, which is definitely not what you want. Actually, I doubt this regex is any good: at the very least you need to replace ".*" with "[\\s\\S]*?" (lazy, and able to match across line breaks), and parsing HTML with regex remains a fragile approach.
