Firefox bad RegEx performance


Problem description

I use the JavaScript parser generator JISON to create a parser for some scripts that my users create. Lately I've noticed that parsing on Firefox is slower by a large factor than on any other browser my page supports (IE10, latest Chrome and Opera).

After digging a little into the source of the generated parser, I've narrowed the problem down to one line of code that executes a regex to tokenize the code being parsed. Of course, this line is executed very often.

I've created a small test case with a random string (~1300 characters long) and a pretty generic regex. It measures the average time needed to execute the regex 10000 times (working example on JSFiddle):

$(document).ready(function() {
    var str = 'asdfasdfa asdfasdf asdf asdf asdfasödlfkja asldfkj asdölkfj aslödkjf aösldkfj ölkasjd flöaksjdf löask dfjkasdfasdfa asdfasdf asdf asdf asdfasödlfkja asldfkj asdölkfj aslödkjf aösldkfj ölkasjd flöaksjdf löask dfjkasdfasdfa asdfasdf asdf asdf asdfasödlfkja asldfkj asdölkfj aslödkjf aösldkfj ölkasjd flöaksjdf löask dfjkasdfasdfa asdfasdf asdf asdf asdfasödlfkja asldfkj asdölkfj aslödkjf aösldkfj ölkasjd flöaksjdf löask dfjkasdfasdfa asdfasdf asdf asdf asdfasödlfkja asldfkj asdölkfj aslödkjf aösldkfj ölkasjd flöaksjdf löask dfjkasdfasdfa asdfasdf asdf asdf asdfasödlfkja asldfkj asdölkfj aslödkjf aösldkfj ölkasjd flöaksjdf löask dfjkasdfasdfa asdfasdf asdf asdf asdfasödlfkja asldfkj asdölkfj aslödkjf aösldkfj ölkasjd flöaksjdf löask dfjkasdfasdfa asdfasdf asdf asdf asdfasödlfkja asldfkj asdölkfj aslödkjf aösldkfj ölkasjd flöaksjdf löask dfjkasdfasdfa asdfasdf asdf asdf asdfasödlfkja asldfkj asdölkfj aslödkjf aösldkfj ölkasjd flöaksjdf löask dfjkasdfasdfa asdfasdf asdf asdf asdfasödlfkja asldfkj asdölkfj aslödkjf aösldkfj ölkasjd flöaksjdf löask dfjkasdfasdfa asdfasdf asdf asdf asdfasödlfkja asldfkj asdölkfj aslödkjf aösldkfj ölkasjd flöaksjdf löask dfjkasdfasdfa asdfasdf asdf asdf asdfasödlfkja asldfkj asdölkfj aslödkjf aösldkfj ölkasjd flöaksjdf löask dfjkasdfasdfa asdfasdf asdf asdf asdfasödlfkja asldfkj asdölkfj aslödkjf aösldkfj',
        regex = new RegExp('^([0-9])+'),
        durations = [],
        resHtml = 'Durations:',
        totalDuration = 0,
        matches, start;

    // Perform "timing test" 10 times to get some average duration
    for (var i = 0; i < 10; i++) {
        // Execute regex 10000 times and see how long it takes
        start = window.performance.now();
        for (var j = 0; j < 10000; j++) {
            regex.exec(str);
        }
        durations.push(window.performance.now() - start);
    }

    // Create output string and update DIV
    for (var i = 0; i < durations.length; i++) {
        totalDuration += durations[i];
        resHtml += '<br>' + i + ': ' + (parseInt(durations[i] * 100, 10) / 100) + ' ms';
    }
    resHtml += '<br>==========';
    resHtml += '<br>Avg: ' + (parseInt((totalDuration / durations.length) * 100, 10) / 100) + ' ms';

    $('#result').html(resHtml);
});

Here are the test results on my machine:

Firefox 24: Average time is between 370 & 450 ms for 10000 regex executions
Chrome 30, Opera 17, IE 10: Average time is between 0.3 & 0.6 ms

The difference gets even larger as the test string grows. A 6000-character string increases the average time in Firefox to ~1.5 seconds (!), while the other browsers still need ~0.5 milliseconds (!) (working example on JSFiddle with 6000 characters).

Why is there such a big performance difference between Firefox and all other browsers, and is there any way I can improve it?

Please note that I can't adjust the executed regexes themselves, because they are mostly generated by the parser generator and I don't want to manually change the built parser code.

Solution

It is the RegExp capturing group that got you:

/^[0-9]+/, /^(?:[0-9])+/, and /^([0-9]+)/ are all orders of magnitude faster than /^([0-9])+/, and they should be viable alternatives.
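
Before swapping one pattern for another, it may help to check what each one actually returns. The sketch below runs the four patterns above against a made-up input (`'12345abc'` is an assumption for illustration, not from the question). All four match the same text, but the capture differs: the repeated group keeps only the last digit, the group around the whole run keeps the full number, and the other two capture nothing.

```javascript
// Compare what each pattern matches and captures on the same input.
const input = '12345abc'; // illustrative input, not from the original test case
const patterns = [
  /^([0-9])+/,    // the slow form from the question: the group itself is repeated
  /^[0-9]+/,      // no group at all
  /^(?:[0-9])+/,  // non-capturing group
  /^([0-9]+)/     // a single capture around the whole run of digits
];

const results = patterns.map(function (re) {
  const m = re.exec(input);
  return { source: re.source, match: m[0], capture: m[1] };
});

results.forEach(function (r) {
  console.log(r.source, '->', r.match, '| capture:', r.capture);
});
```

So the alternatives are only drop-in replacements if the code consuming the match reads `m[0]` and not `m[1]`.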

I would expect capturing groups to make it slightly slower, but that it is this much slower surprises me. However, the slow version has the potential to create lots and lots of captures, while the other versions do not, so that seems to be an important difference.
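
To compare the variants on your own browser, here is a minimal timing sketch in the spirit of the question's test case, without jQuery. The helper name `timeRegex` and the stand-in string `sample` are assumptions for illustration; the numbers depend entirely on the engine and version, and the figures quoted in the question were measured on Firefox 24.

```javascript
// Time `iterations` calls of re.exec(str) and return the elapsed milliseconds.
function timeRegex(re, str, iterations) {
  // Use the high-resolution timer when available, otherwise fall back to Date.now().
  const now = (typeof performance !== 'undefined')
    ? function () { return performance.now(); }
    : function () { return Date.now(); };
  const start = now();
  for (let j = 0; j < iterations; j++) {
    re.exec(str);
  }
  return now() - start;
}

// Stand-in for the question's test string: 1300 characters containing no digits.
const sample = 'asdfasdfa '.repeat(130);

const timings = [/^([0-9])+/, /^[0-9]+/, /^(?:[0-9])+/, /^([0-9]+)/].map(function (re) {
  return { source: re.source, ms: timeRegex(re, sample, 10000) };
});

timings.forEach(function (t) {
  console.log(t.source + ': ' + t.ms.toFixed(2) + ' ms');
});
```

Because the sample string contains no digits, every pattern fails at the first character, which matches the situation in the question where the cost should not depend on string length at all.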

Unscientific jsperf.

You may want to file a bug.
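
If editing the generated regexes by hand is not an option, one possible workaround is to rewrite them programmatically before they are used. The sketch below is a hypothetical helper (`toNonCapturing` is not part of JISON or any library) that turns plain capturing groups into non-capturing ones. It assumes nothing downstream reads the captures or uses backreferences such as `\1`; how the generated parser exposes its regexes is not covered here.

```javascript
// Hypothetical helper: rewrite plain capturing groups "(...)" as "(?:...)".
// Only safe when no code reads the captures and the pattern has no backreferences.
function toNonCapturing(regex) {
  const src = regex.source;
  let out = '';
  let inClass = false; // true while inside a [...] character class
  for (let i = 0; i < src.length; i++) {
    const ch = src[i];
    if (ch === '\\') {
      // Copy an escape sequence (backslash plus the next character) untouched.
      out += ch + (i + 1 < src.length ? src[i + 1] : '');
      i++;
    } else if (inClass) {
      // Inside a character class, "(" is a literal; only "]" ends the class.
      if (ch === ']') inClass = false;
      out += ch;
    } else if (ch === '[') {
      inClass = true;
      out += ch;
    } else if (ch === '(' && src[i + 1] !== '?') {
      // A plain group opener; "(?:", "(?=", "(?!" and similar are left alone.
      out += '(?:';
    } else {
      out += ch;
    }
  }
  return new RegExp(out, regex.flags);
}

const rewritten = toNonCapturing(/^([0-9])+/);
console.log(rewritten.source);
```

The rewritten pattern still matches the same text, so code that only uses the overall match should behave the same; anything that relies on group contents would break.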
