Regex replace text outside html tags
Problem description
I have this HTML:
"This is simple html text <span class='simple'>simple simple text text</span> text"
I need to match only words that are outside any HTML tag. For example, if I want to match "simple" and "text", I should get results only from "This is simple html text" and from the trailing "text": one match for "simple" and two matches for "text". Could anyone help me with this? I'm using jQuery.
var pattern = new RegExp("(\\b" + value + "\\b)", 'gi');
if (pattern.test(text)) {
    text = text.replace(pattern, "<span class='notranslate'>$1</span>");
}
value is the word I want to match (in this case "simple").
text is "This is simple html text <span class='simple'>simple simple text text</span> text".
I need to wrap all selected words (in this example it is "simple") with <span>. But I want to wrap only words that are outside any HTML tags. The result of this example should be:
This is <span class='notranslate'>simple</span> html <span class='notranslate'>text</span> <span class='simple'>simple simple text text</span> <span class='notranslate'>text</span>
I do not want to replace any text inside <span class='simple'>simple simple text text</span>. It should be the same as before the replacement.
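To make the problem concrete, here is a minimal sketch of what the snippet above does to the sample string, assuming value is "simple". The plain word-boundary pattern has no notion of tags, so it also hits the class attribute and the text inside the span.

```javascript
var text = "This is simple html text <span class='simple'>simple simple text text</span> text";
var value = "simple"; // assumed value, taken from the example in the question
var pattern = new RegExp("(\\b" + value + "\\b)", "gi");

// Count every occurrence the plain word-boundary pattern finds.
var hits = text.match(pattern) || [];
console.log(hits.length); // 4: one outside the tag, one in the attribute, two inside the span

// Apply the same replacement as in the question.
var replaced = text.replace(pattern, "<span class='notranslate'>$1</span>");

// The attribute value gets wrapped too, which breaks the markup.
console.log(replaced.indexOf("class='<span") !== -1); // true
```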
Solution
Okay, try using this regex:
(text|simple)(?![^<]*>|[^<>]*</)
Breakdown:
(             # Open capture group
  text        # Match 'text'
  |           # Or
  simple      # Match 'simple'
)             # End capture group
(?!           # Negative lookahead start (will cause match to fail if contents match)
  [^<]*       # Any number of non-'<' characters
  >           # A > character
  |           # Or
  [^<>]*      # Any number of non-'<' and non-'>' characters
  </          # The characters < and /
)             # End negative lookahead.
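As a quick check of the breakdown, the sketch below runs the regex against the sample string from the question and counts the matches per word. The variable names are illustrative only. The slash in </ is escaped because the pattern is written as a regex literal.

```javascript
var html = "This is simple html text <span class='simple'>simple simple text text</span> text";

// The regex from the answer, written as a literal (hence the escaped slash).
var pattern = /(text|simple)(?![^<]*>|[^<>]*<\/)/gi;

// Tally the matches by word.
var counts = {};
(html.match(pattern) || []).forEach(function (m) {
  var key = m.toLowerCase();
  counts[key] = (counts[key] || 0) + 1;
});

console.log(counts); // { simple: 1, text: 2 }
```

These counts agree with what the question asks for: one match for "simple" and two for "text".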
The negative lookahead will prevent a match if text or simple is between html tags.
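Putting the lookahead together with the replacement from the question could look like the sketch below. The function names wrapOutsideTags and escapeRegExp are hypothetical helpers, not part of the original answer. The \b word boundaries are carried over from the question's own pattern, and the escaping step is an added assumption so that words containing regex metacharacters are treated literally.

```javascript
// Hypothetical helper: escape characters that are special in a regex.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: wrap each listed word that the lookahead
// considers to be outside a tag.
function wrapOutsideTags(text, words) {
  var alternation = words.map(escapeRegExp).join("|");
  var pattern = new RegExp(
    "\\b(" + alternation + ")\\b(?![^<]*>|[^<>]*</)",
    "gi"
  );
  return text.replace(pattern, "<span class='notranslate'>$1</span>");
}

var input = "This is simple html text <span class='simple'>simple simple text text</span> text";
var output = wrapOutsideTags(input, ["text", "simple"]);
console.log(output);
// This is <span class='notranslate'>simple</span> html <span class='notranslate'>text</span> <span class='simple'>simple simple text text</span> <span class='notranslate'>text</span>
```

Note that the lookahead is a heuristic rather than an HTML parser. For example, in <b>simple <i>x</i></b> the word "simple" is followed by an opening tag rather than by </, so neither alternative rejects it and it would still be wrapped.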