Regex replace text outside script tag

Question

I have this HTML:


"This is simple html text <script language="javascript">simple simple text text</script> text"

I need to match only words that are outside the script tag. That is, if I want to match "simple" and "text", I should get results only from "This is simple html text" and the last part, "text": the result would be 1 match for "simple" and 2 matches for "text". Could anyone help me with this? I'm using PHP.

I found a similar answer for matching text outside a tag:

(text|simple)(?![^<]*>|[^<>]*</)

Regex replace text outside html tags

But I couldn't get it to work for a specific tag (script):

(text|simple)(?!(^<script*>)|[^<>]*</)

PS: This question is not a duplicate (strip_tags, remove javascript), because I'm not trying to strip tags or select the content inside the script tag. I'm trying to replace content outside the script tag.

Recommended answer

My pattern uses (*SKIP)(*FAIL) to disqualify matched script tags and their contents.

"text" and "simple" will be matched at every qualifying occurrence.

Regex pattern: ~<script.*?/script>(*SKIP)(*FAIL)|text|simple~
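
As a rough illustration of how this pattern behaves on the question's own string, the sketch below counts matches with preg_match_all() instead of replacing them. The first alternative consumes each script element; (*FAIL) then rejects that match, and (*SKIP) stops the engine from retrying at any position inside it, so only occurrences outside the element are reported. The variable names are illustrative and not part of the original answer.

```php
<?php
// The HTML string from the question.
$html = 'This is simple html text <script language="javascript">simple simple text text</script> text';

// Same pattern as in the answer: skip script elements, then match the words.
$pattern = '~<script.*?/script>(*SKIP)(*FAIL)|text|simple~';

// $matches[0] holds every full match found outside the script element.
preg_match_all($pattern, $html, $matches);

// Tally how often each word was matched.
$counts = array_count_values($matches[0]);
print_r($counts); // simple => 1, text => 2
```

This gives the counts the question asks for: 1 match for "simple" and 2 matches for "text".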

Code:

$strings=['This has no replacements',
    'This simple text has no script tag',
    'This simple text ends with a script tag <script language="javascript">simple simple text text</script>',
    'This is simple html text is split by a script tag <script language="javascript">simple simple text text</script> text',
    '<script language="javascript">simple simple text text</script> this text starts with a script tag'
];

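// preg_replace() accepts an array of subjects and returns an array of results.
// Script elements are matched first and discarded by (*SKIP)(*FAIL);
// only "text" and "simple" outside them are replaced.
$strings=preg_replace('~<script.*?/script>(*SKIP)(*FAIL)|text|simple~','***replaced***',$strings);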

var_export($strings);

Output:

array (
  0 => 'This has no replacements',
  1 => 'This ***replaced*** ***replaced*** has no script tag',
  2 => 'This ***replaced*** ***replaced*** ends with a script tag <script language="javascript">simple simple text text</script>',
  3 => 'This is ***replaced*** html ***replaced*** is split by a script tag <script language="javascript">simple simple text text</script> ***replaced***',
  4 => '<script language="javascript">simple simple text text</script> this ***replaced*** starts with a script tag',
)
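
The pattern above is case-sensitive, its dot does not cross newlines, and it also matches "text" inside longer words such as "context". If the real input may contain uppercase tags, multi-line scripts, or such longer words, a variant along the following lines might be a starting point. This is only a sketch under those assumptions: the helper name replaceOutsideScript() and the sample string are hypothetical, and the added i and s modifiers and \b word boundaries are not part of the original answer.

```php
<?php
// Hypothetical helper, not from the original answer.
function replaceOutsideScript(string $html, array $words, string $replacement): string
{
    // Escape each word so it is treated literally inside the pattern.
    $alternation = implode('|', array_map(
        function (string $word): string {
            return preg_quote($word, '~');
        },
        $words
    ));

    // i: case-insensitive, s: dot also matches newlines,
    // \b: match whole words only.
    $pattern = '~<script\b.*?</script\s*>(*SKIP)(*FAIL)|\b(?:' . $alternation . ')\b~is';

    // preg_replace() returns null on error; fall back to the input in that case.
    return preg_replace($pattern, $replacement, $html) ?? $html;
}

$html = "Simple text\n<SCRIPT>\nvar text = 'simple';\n</SCRIPT>\ncontext text";
$result = replaceOutsideScript($html, ['text', 'simple'], '***');
echo $result;
```

Here "Simple", the first "text", and the final "text" are replaced, while the script body and the word "context" are left untouched.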
