Regex replace text outside script tag
Question
I have this HTML:
"This is simple html text <script language="javascript">simple simple text text</script> text"
I need to match only the words that are outside the script tag. For example, if I want to match "simple" and "text", I should get results only from "This is simple html text" and the final "text": "simple" should match once and "text" twice. Could anyone help me with this? I'm using PHP.
I found a similar answer for matching text outside a tag:
(text|simple)(?![^<]*>|[^<>]*</)
But I couldn't get it to work for a specific tag (script):
(text|simple)(?!(^<script*>)|[^<>]*</)
PS: This question is not a duplicate (strip_tags, remove javascript), because I'm not trying to strip tags or select the content inside the script tag. I'm trying to replace content outside the script tag.
Recommended answer
My pattern uses (*SKIP)(*FAIL) to disqualify matched script tags and their contents. The first alternative consumes a whole script element and then deliberately fails, so the engine skips past it; text and simple are then matched on every remaining qualifying occurrence.
Regex pattern: ~<script.*?/script>(*SKIP)(*FAIL)|text|simple~
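As a quick check against the sample string from the question, the following minimal sketch counts the matches with preg_match_all instead of replacing them. The variable names are illustrative and not part of the original answer. It should tally "simple" once and "text" twice, which are the counts the question asks for.

```php
<?php
// Sample string taken from the question.
$html = 'This is simple html text <script language="javascript">simple simple text text</script> text';

// The first alternative consumes a whole <script>...</script> span and then
// fails on purpose, so only words outside that span can match.
$pattern = '~<script.*?/script>(*SKIP)(*FAIL)|text|simple~';

preg_match_all($pattern, $html, $matches);

// Tally how often each word matched outside the script tag.
$counts = array_count_values($matches[0]);
var_export($counts);
```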
Code:
$strings=['This has no replacements',
'This simple text has no script tag',
'This simple text ends with a script tag <script language="javascript">simple simple text text</script>',
'This is simple html text is split by a script tag <script language="javascript">simple simple text text</script> text',
'<script language="javascript">simple simple text text</script> this text starts with a script tag'
];
$strings=preg_replace('~<script.*?/script>(*SKIP)(*FAIL)|text|simple~','***replaced***',$strings);
var_export($strings);
Output:
array (
0 => 'This has no replacements',
1 => 'This ***replaced*** ***replaced*** has no script tag',
2 => 'This ***replaced*** ***replaced*** ends with a script tag <script language="javascript">simple simple text text</script>',
3 => 'This is ***replaced*** html ***replaced*** is split by a script tag <script language="javascript">simple simple text text</script> ***replaced***',
4 => '<script language="javascript">simple simple text text</script> this ***replaced*** starts with a script tag',
)
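Note that the pattern as written matches text and simple as substrings (so "context" would also be hit), is case-sensitive, and does not let . cross newlines, so a script block spanning several lines would not be skipped. A possible variant, assuming whole-word matching and multi-line script blocks are wanted, adds \b and the i and s modifiers. The helper name replaceOutsideScript is hypothetical and not part of the original answer; the arrow function assumes PHP 7.4 or later.

```php
<?php
// Hypothetical helper: replace whole words outside <script>...</script> blocks.
function replaceOutsideScript(string $html, array $words, string $replacement): string
{
    // Escape each word so it is treated literally inside the pattern.
    $alternation = implode('|', array_map(fn($w) => preg_quote($w, '~'), $words));

    // i: case-insensitive tags and words; s: let . span newlines inside the script block.
    $pattern = '~<script\b.*?</script>(*SKIP)(*FAIL)|\b(?:' . $alternation . ')\b~is';

    // preg_replace returns null on error; fall back to the untouched input.
    return preg_replace($pattern, $replacement, $html) ?? $html;
}

$html = "A simple context <SCRIPT>\nsimple text\n</SCRIPT> text";
echo replaceOutsideScript($html, ['text', 'simple'], '***');
```

Here "context" is left alone because of the word boundaries, and the upper-case, multi-line script block is skipped as a whole.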