Regex vs. while loops

Question

While reading this SO post (Is there a version of JavaScript's String.indexOf() that allows for regular expressions?), I wondered which of the following two functions, both of which look for the last (largest) whitespace group in txt, runs faster, or whether the difference in run time is negligible.


(function(str)
{   
    var result = /\s+(?!.*\s+)/.exec(str);
    return ((result)? result.index : -1);
})(txt);


(function(str)
{
   var regex = /\s+/g;
   var result;
   var index = -1;
   while(result = regex.exec(str))
   {
       index = result.index;
   }
   return index;
})(txt);

Briefly, the first uses a regular expression to look for a whitespace group that is not followed by any other whitespace group, and the second uses a while loop.

Any help with this matter is much appreciated.

Recommended answer

(function(str)
{   
    var result = /\s+(?!.*\s+)/.exec(str);
    return ((result)? result.index : -1);
})(txt);

is broken. It will match " \n" because . does not match all space characters. Specifically it does not match the space characters "\r\n\u2028\u2029" which are matched by \s.
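
The difference between . and \s on those four characters can be checked directly. This is a minimal sketch; the variable names are introduced here for illustration only.

```javascript
// The four line terminators that `.` (without the s flag) does not match,
// but that `\s` does match.
var lineTerminators = ["\r", "\n", "\u2028", "\u2029"];

// Test each character on its own against `.` and against `\s`.
var dotMatches = lineTerminators.map(function (ch) { return /./.test(ch); });
var spaceMatches = lineTerminators.map(function (ch) { return /\s/.test(ch); });

console.log(dotMatches);   // every entry is false
console.log(spaceMatches); // every entry is true
```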

If you want a good way to match the last (largest) whitespace group in txt, use the RegExp below with String.prototype.search:

var indexOfStartOfLastWhitespaceGroup = str.search(/\s+\S*$/);

To get the end index, you can't use the .lastIndex property of the regular expression since it includes the \S* portion. You can use .search again though.

if (indexOfStartOfLastWhitespaceGroup >= 0) {
  var indexOfEndOfLastWhitespaceGroup = str.search(/\S*$/);
  ...
}
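
The two searches above can be wrapped into one function. This is a sketch, not part of the original answer: the name lastWhitespaceGroup and the {start, end} return shape are assumptions made for illustration.

```javascript
// Hypothetical helper: returns {start, end} for the last whitespace group in
// str (end is exclusive), or null when str contains no whitespace.
function lastWhitespaceGroup(str) {
  var start = str.search(/\s+\S*$/);
  if (start < 0) return null;
  // The trailing run of non-whitespace begins where the whitespace group ends.
  var end = str.search(/\S*$/);
  return { start: start, end: end };
}

var group = lastWhitespaceGroup("foo  bar \t baz");
console.log(group.start, group.end); // 8 11
```

With that shape, str.slice(group.start, group.end) yields the whitespace group itself.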

I ponder which of the following two functions that look for the last (largest) whitespace group in txt run faster (or do they have negligible run-time difference)

For small strings the result is likely negligible no matter what (correct) method you use. For large strings, iterating over the whole string is going to be expensive, so your best bet is to use a regular expression that is anchored at the end, i.e. has $ as the last token and does not have ^ in it. An interpreter can waste time doing a full string search when there is a right-only-anchored regex, but I believe most do this simple optimization.
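
If relying on the engine to optimize an end-anchored pattern is a concern, a hand-written backward scan only touches the tail of the string. Note that this is a different technique from the regular expression recommended in this answer, and lastWhitespaceStart is a hypothetical name used only for this sketch.

```javascript
// Hypothetical alternative: scan from the end of the string instead of
// relying on the regex engine to optimize a right-anchored pattern.
function lastWhitespaceStart(str) {
  var ws = /\s/; // no g flag, so test() keeps no state between calls
  var i = str.length - 1;
  // Skip the trailing run of non-whitespace characters.
  while (i >= 0 && !ws.test(str.charAt(i))) i--;
  if (i < 0) return -1; // no whitespace anywhere in the string
  // Walk back to the first character of this whitespace group.
  while (i > 0 && ws.test(str.charAt(i - 1))) i--;
  return i;
}

console.log(lastWhitespaceStart("foo  bar \t baz")); // 8
```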

This is what I get in the Squarefree shell under Chrome.

var s = '';
for (var i = 10000; --i >= 0;) s += 'abba';
s += 'foo';
var t0 = Date.now(); for (var i = 100; --i >= 0;) /foo$/.test(s); var t1 = Date.now();
var t2 = Date.now(); for (var i = 100; --i >= 0;) /abbafoo/.test(s); var t3 = Date.now();
[t1 - t0, t3 - t2]
// emits [1, 8]
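
The same style of measurement can be applied to the two approaches under discussion. This is only a sketch: timeIt, loopIndex and searchIndex are names made up here, and timings depend on the engine, so no expected numbers are shown.

```javascript
// Hypothetical micro-benchmark helper: runs fn `reps` times, returns elapsed ms.
function timeIt(fn, reps) {
  var start = Date.now();
  for (var i = 0; i < reps; i++) fn();
  return Date.now() - start;
}

// Build a long string containing many whitespace groups.
var parts = [];
for (var i = 0; i < 10000; i++) parts.push('abba');
var big = parts.join(' ') + '  foo';

// The while-loop approach from the question.
function loopIndex(str) {
  var regex = /\s+/g, result, index = -1;
  while ((result = regex.exec(str))) index = result.index;
  return index;
}

// The end-anchored search from this answer.
function searchIndex(str) {
  return str.search(/\s+\S*$/);
}

var loopMs = timeIt(function () { loopIndex(big); }, 100);
var searchMs = timeIt(function () { searchIndex(big); }, 100);
console.log([loopMs, searchMs]); // [while-loop ms, anchored-search ms]
```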

Finally, you should be aware that \s does not always mean the same thing on all interpreters. /\s/.test("\xA0"), which tests whether the non-breaking space (think &nbsp;) is a space, is false on IE 6 but true on most other browsers' interpreters (not sure about IE 7+).
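
One way to sidestep that difference is to name the non-breaking space explicitly in the character class. This is a sketch under that assumption; nbspIsWhitespace and lastGroupPattern are illustrative names, not part of the original answer.

```javascript
// Feature check: true on current engines; reported above as false on IE 6.
var nbspIsWhitespace = /\s/.test("\xA0");

// Hypothetical portable pattern: list NBSP alongside \s so the result does
// not depend on how the engine defines \s.
var lastGroupPattern = /[\s\xA0]+[^\s\xA0]*$/;

var idx = "a\xA0b".search(lastGroupPattern);
console.log(idx); // 1
```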
