Dynamic vs Inline RegExp performance in JavaScript

Question

I stumbled upon this performance test, which claims that RegExps in JavaScript are not necessarily slow: http://jsperf.com/regexp-indexof-perf

There's one thing I didn't get, though: two of the cases involve something that I believed to be exactly the same:

RegExp('(?:^| )foo(?: |$)').test(node.className);

and

/(?:^| )foo(?: |$)/.test(node.className);

In my mind, those two lines were exactly the same, the second one being some kind of shorthand for creating a RegExp object. Still, it's twice as fast as the first.

In the test, those cases are called "dynamic regexp" and "inline regexp".

Could someone help me understand the difference (and the performance gap) between these two?

Recommended answer

As of today, the other answers given here are not entirely complete or correct.

Starting from ES5, the literal syntax behaves the same as the RegExp() syntax with regard to object creation: both create a new RegExp object every time the code path hits an expression in which they take part.

Therefore, the only difference between them is how often the regexp is compiled:

  • With literal syntax: once, during initial code parsing and compilation
  • With RegExp() syntax: every time a new object gets created

See, for example, Stoyan Stefanov's JavaScript Patterns book:

Another distinction between the regular expression literal and the constructor is that the literal creates an object only once during parse time. If you create the same regular expression in a loop, the previously created object will be returned with all its properties (such as lastIndex) already set from the first time. Consider the following example as an illustration of how the same object is returned twice.

function getRE() {
    var re = /[a-z]/;
    re.foo = "bar";
    return re;
}

var reg = getRE(),
    re2 = getRE();

console.log(reg === re2); // true
reg.foo = "baz";
console.log(re2.foo); // "baz"

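This behavior has changed in ES5, and the literal now also creates new objects. The behavior has also been corrected in many browser environments, so it's not to be relied on.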

If you run this sample in any modern browser or in Node.js, you get the following instead:

false
bar

This means that every time you call the getRE() function, a new RegExp object is created, even with the literal syntax.
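The fresh-object behavior is also visible through lastIndex, the property the book quote mentions. The following is a minimal sketch, assuming an ES5+ engine; makeGlobalRE is a hypothetical helper name used only for illustration:

```javascript
// Hypothetical helper: returns the value of a global regexp literal.
function makeGlobalRE() {
    return /o/g; // evaluated anew on every call in ES5+ engines
}

var first = makeGlobalRE();
first.test("foo"); // a successful match advances first.lastIndex

var second = makeGlobalRE();

var sameObject = first === second;   // are both calls returning one object?
var firstIndex = first.lastIndex;    // state left behind by test()
var secondIndex = second.lastIndex;  // state of the newly created object

console.log(sameObject, firstIndex, secondIndex);
```

In an ES5+ engine the second object should start with lastIndex at 0, because it does not share state with the first one.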

The above not only explains why you shouldn't use RegExp() for immutable regexps (a very well-known performance issue today), but also explains:

(I am more surprised that inlineRegExp and storedRegExp have different results.)

The storedRegExp case is about 5-20% faster across browsers than inlineRegExp because there is no overhead of creating (and garbage collecting) a new RegExp object every time.
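The difference between the two cases can be sketched roughly as follows. The function and variable names here (hasFooInline, hasFooStored, FOO_RE) are assumptions for illustration and are not taken from the benchmark itself:

```javascript
// Inline: the literal is evaluated on every call,
// so a new RegExp object is created each time.
function hasFooInline(className) {
    return /(?:^| )foo(?: |$)/.test(className);
}

// Stored: the RegExp object is created once and reused by every call.
var FOO_RE = /(?:^| )foo(?: |$)/;

function hasFooStored(className) {
    return FOO_RE.test(className);
}

console.log(hasFooInline("bar foo baz"), hasFooStored("bar foo baz"));
console.log(hasFooInline("foobar"), hasFooStored("foobar"));
```

Sharing the stored object is safe here because the pattern has no g or y flag, so test() does not read or update lastIndex between calls.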

Conclusion:
Always create your immutable regexps with literal syntax, and cache them if they are to be re-used. In other words, don't rely on that difference in behavior in pre-ES5 environments, and continue caching appropriately in ES5 and later environments.

Why literal syntax? It has some advantages compared to the constructor syntax:

  1. It is shorter and doesn't force you to think in terms of class-like constructors.
  2. When using the RegExp() constructor, you also need to escape quotes and double-escape backslashes. This makes regular expressions, which are hard to read and understand by their nature, even harder.
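The escaping point in item 2 can be illustrated with a small sketch. The pattern (digits wrapped in double quotes) is an arbitrary example chosen for illustration:

```javascript
// The same pattern written both ways: one or more digits inside double quotes.
var literal = /"\d+"/;
var constructed = new RegExp("\"\\d+\""); // quotes escaped, backslash doubled

// Forgetting to double the backslash silently changes the pattern:
// the string "\d+" is just "d+", so this matches the letter d, not digits.
var buggy = new RegExp("\d+");

console.log(literal.source === constructed.source);
console.log(buggy.source);
```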

(Loosely quoted from the same JavaScript Patterns book by Stoyan Stefanov.)
Hence, it's always a good idea to stick with the literal syntax, unless your regexp isn't known at compile time.
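When the pattern really isn't known until run time, the constructor is unavoidable, but the caching advice still applies. The following is one possible sketch, assuming an environment with Map (ES2015+); escapeRegExp, classRegExp, and classRegExpCache are hypothetical names, not part of the original test:

```javascript
// Hypothetical cache: one compiled RegExp per class name.
var classRegExpCache = new Map();

// Escape characters that have special meaning in a regexp pattern.
function escapeRegExp(text) {
    return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Build (or reuse) a regexp that matches `name` as a whole
// space-separated token, like the foo pattern in the question.
function classRegExp(name) {
    var re = classRegExpCache.get(name);
    if (!re) {
        re = new RegExp("(?:^| )" + escapeRegExp(name) + "(?: |$)");
        classRegExpCache.set(name, re);
    }
    return re;
}

console.log(classRegExp("foo").test("bar foo baz"));
console.log(classRegExp("foo") === classRegExp("foo"));
```

With this approach the regexp for a given name is compiled on first use only, and later lookups return the same object.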
