Regex matching an opening and closing tag and a certain text pattern inside that tag
Problem description
Here is a sample entry from my sitemap.xml:
<url>
<loc>http://sitename.com/programming/php/?C=D;O=A</loc>
<changefreq>weekly</changefreq>
<priority>0.64</priority>
</url>
There are many entries like this, and as you can see, the URL in the loc tag ends with C=D;O=A.
I want to remove every entry that starts with <url>, ends with </url>, and contains C=D;O=A or a similar pattern.
The following expression matches every entry in full, whether or not it contains that pattern:
<url>(.|\r\n)*?<\/url>
but I only want to match the entries that meet the condition described above.
How can I write a regex that matches on such a condition?
Try this:
/<url>(?:(?!<\/url>).)*C=D;O=A.*?<\/url>/m
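The negative lookahead guarantees that the match cannot run past a </url> before it reaches C=D;O=A, so it never spans multiple nodes. The /m flag is the Ruby one, which lets . match newlines; most other regex flavors call this single-line or DOTALL mode (/s).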
See here: rubular
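To show the removal step itself, here is a minimal Python sketch of the same idea. The sample data, the names SITEMAP, UNWANTED_ENTRY and strip_unwanted_entries, and the character class [A-Z] used to cover "similar patterns" are all assumptions made for illustration; they are not part of the original answer. re.DOTALL stands in for Ruby's /m.

```python
import re

# Hypothetical sample data: only the first entry carries the unwanted query string.
SITEMAP = """<urlset>
<url>
<loc>http://sitename.com/programming/php/?C=D;O=A</loc>
<changefreq>weekly</changefreq>
<priority>0.64</priority>
</url>
<url>
<loc>http://sitename.com/programming/php/</loc>
<changefreq>weekly</changefreq>
<priority>0.64</priority>
</url>
</urlset>
"""

# (?:(?!</url>).)* consumes one character at a time, but only while the
# closing tag does not start at the current position, so the match stays
# inside a single <url> node.
# [A-Z] widens "C=D;O=A" to other single-letter values; this is an assumption
# about what "similar patterns" means.
# The trailing \s* also swallows the line break left behind by the removed entry.
UNWANTED_ENTRY = re.compile(
    r"<url>(?:(?!</url>).)*C=[A-Z];O=[A-Z].*?</url>\s*",
    re.DOTALL,  # let "." match newlines, like /m in Ruby
)


def strip_unwanted_entries(xml_text: str) -> str:
    """Return xml_text with every matching <url>...</url> entry removed."""
    return UNWANTED_ENTRY.sub("", xml_text)


cleaned = strip_unwanted_entries(SITEMAP)
print(cleaned)
```

For a large or irregular sitemap, an XML parser would likely be more robust than a regex; the sketch only mirrors the approach taken in the answer.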