Word Boundary Regular Expression Unless Inside HTML Tag

Problem

I have a regular expression using word boundaries that works exceedingly well...

~\b('.$value.')\b~i

...save for the fact that it also matches text inside HTML tags (e.g. title="This is blue!"). That is a problem because I'm doing text substitution on anything the regex matches, then making tooltips appear using those title attributes. So, as you can imagine, it substitutes text inside the title attribute and breaks the HTML of the tooltip. For example, what should be:

<span class="blue" title="This is blue!">Aqua</span>

...ends up becoming...

<span class="blue" title="This is <span class=" blue"="">Royal Blue</span>"&gt;Aqua</span>

My use of strip_tags didn't solve the issue; I think what I need is a better regular expression which simply will not match content ending in blue"> ('blue' in this case being placeholder for any other color in the array I'm comparing it against).

Can anyone append what I need to the regular expression? Or do you have a better solution?

Solution

Regex replacements often look like the solution, but they can have a lot of unwanted side effects and may not really accomplish what you want. Look into DOMDocument instead (as some commenters have suggested).
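To illustrate the DOMDocument suggestion, here is a minimal sketch that applies the word-boundary pattern from the question to text nodes only, so attribute values such as title are never rewritten. The function name wrapWordInTextNodes and its parameters are hypothetical, not from the original answer. The sketch assumes ASCII input; non-ASCII text may need extra encoding handling before loadHTML.

```php
<?php
// Hypothetical helper (an assumption, not the answerer's code): wrap whole-word
// matches of $word in a tooltip span, editing text nodes only.
function wrapWordInTextNodes(string $html, string $word, string $class, string $title): string
{
    $doc = new DOMDocument();
    libxml_use_internal_errors(true);           // tolerate imperfect markup
    $doc->loadHTML('<div>' . $html . '</div>'); // wrapper div holds the fragment
    libxml_clear_errors();

    $container = $doc->getElementsByTagName('div')->item(0);
    $pattern = '~\b(' . preg_quote($word, '~') . ')\b~i';

    // Snapshot the text nodes first, because the loop below mutates the tree.
    $xpath = new DOMXPath($doc);
    $textNodes = iterator_to_array($xpath->query('.//text()', $container));

    foreach ($textNodes as $node) {
        $parts = preg_split($pattern, $node->nodeValue, -1, PREG_SPLIT_DELIM_CAPTURE);
        if (count($parts) < 2) {
            continue; // no match in this text node
        }
        $fragment = $doc->createDocumentFragment();
        foreach ($parts as $i => $part) {
            if ($i % 2 === 1) {
                // Odd indexes hold the captured word: wrap it in a new span.
                $span = $doc->createElement('span');
                $span->setAttribute('class', $class);
                $span->setAttribute('title', $title);
                $span->appendChild($doc->createTextNode($part));
                $fragment->appendChild($span);
            } elseif ($part !== '') {
                $fragment->appendChild($doc->createTextNode($part));
            }
        }
        $node->parentNode->replaceChild($fragment, $node);
    }

    // Serialize the wrapper's children, i.e. the original fragment without the div.
    $out = '';
    foreach ($container->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}
```

Because the pattern only ever sees the nodeValue of text nodes, the word "blue" inside title="This is blue!" is left alone, while the same word in visible text gets wrapped.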

But if you insist on using regex, here's a good post on SO: stackoverflow.com/questions/12532744/regex-ignore-matches-between-script-tags. It uses two passes to accomplish what you want.
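One way a two-pass regex approach could look is sketched below; it is not necessarily the method used in the linked post. The first pass splits the markup into tag and text chunks, and the second pass runs the word-boundary replacement on the text chunks only. The function name replaceOutsideTags is hypothetical. The simple tag pattern assumes no literal > appears inside an attribute value, and the contents of script or style elements would still be treated as text.

```php
<?php
// Hypothetical helper (an assumption for illustration): replace whole-word
// matches of $word only in the text between tags.
function replaceOutsideTags(string $html, string $word, string $replacement): string
{
    // Pass 1: split into alternating text and tag chunks; the capture group keeps the tags.
    $chunks = preg_split('~(<[^>]*>)~', $html, -1, PREG_SPLIT_DELIM_CAPTURE);
    $pattern = '~\b(' . preg_quote($word, '~') . ')\b~i';

    // Pass 2: even indexes are text; odd indexes are tags and stay untouched.
    foreach ($chunks as $i => $chunk) {
        if ($i % 2 === 0) {
            $chunks[$i] = preg_replace($pattern, $replacement, $chunk);
        }
    }
    return implode('', $chunks);
}
```

A call such as replaceOutsideTags($html, 'blue', '<span class="blue" title="This is blue!">$1</span>') would wrap visible occurrences of the word while leaving existing title attributes intact. Running it repeatedly for several colors is still fragile, which is why the DOM-based route is usually the safer choice.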
