How to get regex to match multiple script tags?


Problem description

I'm trying to return the contents of any <script> tags in a body of text. I'm currently using the following expression, but it only captures the contents of the first tag and ignores any others after that.

Here is a sample of the HTML:

    <script type="text/javascript">
        alert('1');
    </script>

    <div>Test</div>

    <script type="text/javascript">
        alert('2');
    </script>

My regex is as follows:

// scripttext contains the sample
var re = /<script\b[^>]*>([\s\S]*?)<\/script>/gm;
var scripts = re.exec(scripttext);

When I run this on IE6, it returns 2 matches: the first contains the full tag, and the second contains alert('1').

When I run it on http://www.pagecolumn.com/tool/regtest.htm it gives me 2 results, each containing the script tags only.

Recommended answer

The "problem" here is how exec works. Each call returns only one match, but because the regex has the g flag, exec stores the index just past that match in the regex's lastIndex property and resumes searching from there on the next call. To get all matches, keep applying the regex to the string until exec returns null (this is a pretty common way to do it):

var scripttext = ' <script type="text/javascript">\nalert(\'1\');\n</script>\n\n<div>Test</div>\n\n<script type="text/javascript">\nalert(\'2\');\n</script>';

var re = /<script\b[^>]*>([\s\S]*?)<\/script>/gm;

var match;
while (match = re.exec(scripttext)) {
  // full match is in match[0], whereas captured groups are in ...[1], ...[2], etc.
  console.log(match[1]);
}
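
As an alternative to the exec loop above, here is a minimal sketch that collects every capture into an array with String.prototype.matchAll. It is not part of the original answer: it assumes an ES2020 environment (so it will not run on IE6), and the helper name extractScriptBodies and the sample string are made up for illustration.

```javascript
// Hypothetical helper (not from the original answer): return the body of
// every <script> element found in `html`, with surrounding whitespace trimmed.
function extractScriptBodies(html) {
  // Same pattern as above; the m flag is dropped because the pattern
  // contains no ^ or $ anchors, so it has no effect here.
  const re = /<script\b[^>]*>([\s\S]*?)<\/script>/g;
  // matchAll requires the g flag and yields one match array per occurrence;
  // index 1 of each array is the first capture group.
  return Array.from(html.matchAll(re), (m) => m[1].trim());
}

// Illustrative input modelled on the sample HTML from the question.
const sample = [
  '<script type="text/javascript">',
  "    alert('1');",
  '</script>',
  '<div>Test</div>',
  '<script type="text/javascript">',
  "    alert('2');",
  '</script>',
].join('\n');

const bodies = extractScriptBodies(sample);
console.log(bodies);
```

Unlike the exec loop, matchAll works on a copy of the regex, so the lastIndex of the regex you pass in is left unchanged and the pattern can be reused safely.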
