Regex for specific tags and their content, grouped by the tag name

Question

Here is the input (html, not xml):

... html content ...
<tag1> content for tag 1 </tag1>
<tag2> content for tag 2 </tag2>
<tag3> content for tag 3 </tag3>
... html content ...

I would like to get 3 matches, each with two groups. The first group would contain the name of the tag and the second group would contain the inner text of the tag. There are only those three tags, so the pattern doesn't need to be universal.

In other words:

match.Groups["name"] would be "tag1"
match.Groups["value"] would be "content for tag 1"

Any ideas?

Recommended answer

I don't see why you would want to use match group names for that.

Here is a regular expression that captures the tag name and the tag content in numbered groups:

<(tag1|tag2|tag3)>(.*?)</\1>
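
As a rough illustration in C#, the numbered-group pattern could be applied as sketched below. The sample `html` string and the class name are made up for this example. The sketch writes the backreference as `\1`, since in .NET `$1` is substitution syntax for replacement strings rather than a backreference inside a pattern. `RegexOptions.Singleline` is an optional addition here so that `.` also matches newlines if a tag's content spans several lines.

```csharp
using System;
using System.Text.RegularExpressions;

class NumberedGroupsDemo
{
    static void Main()
    {
        // Hypothetical sample input modelled on the question.
        string html = "<p>before</p>\n"
                    + "<tag1> content for tag 1 </tag1>\n"
                    + "<tag2> content for tag 2 </tag2>\n"
                    + "<tag3> content for tag 3 </tag3>\n"
                    + "<p>after</p>";

        // Group 1 captures the tag name, group 2 the inner text.
        // \1 requires the closing tag to repeat the name captured by group 1.
        var regex = new Regex(@"<(tag1|tag2|tag3)>(.*?)</\1>", RegexOptions.Singleline);

        foreach (Match match in regex.Matches(html))
        {
            string name = match.Groups[1].Value;
            // The lazy (.*?) keeps surrounding whitespace, so trim it here.
            string value = match.Groups[2].Value.Trim();
            Console.WriteLine($"{name}: {value}");
        }
    }
}
```

With this input the loop runs three times, once per tag, and the first iteration prints `tag1: content for tag 1`.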

Here is a variant with .NET-style group names:

<(?'name'tag1|tag2|tag3)>(?'value'.*?)</\k'name'>
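
A minimal C# sketch of the named-group variant follows, assuming the goal is to look up each tag's inner text by tag name. The input string, the class name, and the `contentByTag` dictionary are hypothetical and not part of the original answer. The pattern is used without a trailing period, since a `.` after the closing tag would require one more character to follow it.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

class NamedGroupsDemo
{
    static void Main()
    {
        // Hypothetical sample input modelled on the question.
        string html = "<tag1> content for tag 1 </tag1>"
                    + "<tag2> content for tag 2 </tag2>"
                    + "<tag3> content for tag 3 </tag3>";

        // (?'name'...) and (?'value'...) define named groups;
        // \k'name' is a backreference to whatever the "name" group captured.
        var regex = new Regex(
            @"<(?'name'tag1|tag2|tag3)>(?'value'.*?)</\k'name'>",
            RegexOptions.Singleline);

        // Collect the inner text of each tag, keyed by tag name.
        var contentByTag = new Dictionary<string, string>();
        foreach (Match match in regex.Matches(html))
        {
            string name = match.Groups["name"].Value;
            string value = match.Groups["value"].Value.Trim();
            contentByTag[name] = value; // a later duplicate tag would overwrite an earlier one
        }

        foreach (KeyValuePair<string, string> pair in contentByTag)
        {
            Console.WriteLine($"{pair.Key} => {pair.Value}");
        }
    }
}
```

Here `match.Groups["name"]` and `match.Groups["value"]` are the accessors the question asks about. The `Trim()` call is a design choice for this sketch: without it, the captured value keeps the spaces around the inner text.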

Edit

RegEx adapted as per question author's clarification.
