String.replaceAll is considerably slower than doing the job yourself

Question

I have an old piece of code that performs find and replace of tokens within a string.

It receives a map of from and to pairs, iterates over them and for each of those pairs, iterates over the target string, looks for the from using indexOf(), and replaces it with the value of to. It does all the work on a StringBuffer and eventually returns a String.
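
For illustration, the approach described above might look roughly like the sketch below. The original code is not shown in the question, so the names TokenReplacer and replaceTokens, and the exact shape of the loop, are assumptions rather than the asker's actual implementation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical reconstruction of the "old code" described in the question.
final class TokenReplacer {

    // Replaces every occurrence of each map key with its mapped value,
    // scanning with indexOf() on a StringBuffer.
    static String replaceTokens(String target, Map<String, String> replacements) {
        StringBuffer buffer = new StringBuffer(target);
        for (Map.Entry<String, String> pair : replacements.entrySet()) {
            String from = pair.getKey();
            String to = pair.getValue();
            if (from.isEmpty()) {
                continue; // an empty token would match at every position
            }
            int index = buffer.indexOf(from);
            while (index >= 0) {
                buffer.replace(index, index + from.length(), to);
                // Resume after the inserted text so it is never re-scanned.
                index = buffer.indexOf(from, index + to.length());
            }
        }
        return buffer.toString();
    }

    public static void main(String[] args) {
        Map<String, String> replacements = new LinkedHashMap<>();
        replacements.put(",", "");
        replacements.put(".", "");
        replacements.put(" ", "");
        System.out.println(replaceTokens("a, b. c", replacements)); // prints "abc"
    }
}
```

No regular expression is involved here: each pair costs a plain substring search plus in-place edits on a single buffer.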

I replaced that code with this line: replaceAll("[,. ]*", "");
And I ran some comparative performance tests.
When comparing for 1,000,000 iterations, I got this:

Old Code: 1287ms
New Code: 4605ms

About 3.6 times longer!

I then tried replacing it with 3 calls to replace:
replace(",", "");
replace(".", "");
replace(" ", "");

With the following results:

Old Code: 1295ms
New Code: 3524ms

Still about 2.7 times longer!

Any idea why replace and replaceAll are so inefficient? Can I do something to make it faster?

Thanks for all the answers. The main problem was indeed that [,. ]* did not do what I wanted it to do. Changing it to [,. ]+ brought the performance almost level with the non-regex solution. Using a pre-compiled regex helped, but only marginally. (It is a solution very applicable to my problem.)

Test code:
Replace string with regex: [,. ]*
Replace string with regex: [,. ]+
Replace string with regex: [,. ]+ and a pre-compiled Pattern

Recommended answer

While using regular expressions does carry some performance cost, it should not be this bad.

Note that String.replaceAll() compiles the regular expression every time you call it.

You can avoid that by explicitly using a Pattern object:

Pattern p = Pattern.compile("[,. ]+");

// repeat only the following part:
String output = p.matcher(input).replaceAll("");
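
As a fuller sketch of the same idea, the pattern can live in a static final field so that it is compiled once per class load. The class name PunctuationStripper and the method name strip are made up for this example.

```java
import java.util.regex.Pattern;

// Hypothetical helper; only Pattern and Matcher come from the JDK.
final class PunctuationStripper {

    // Compiled once when the class is loaded. Pattern instances are
    // immutable and safe to share between threads.
    private static final Pattern SEPARATORS = Pattern.compile("[,. ]+");

    static String strip(String input) {
        // Only the Matcher is created per call; the compiled pattern is reused.
        return SEPARATORS.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(strip("1, 2. 3")); // prints "123"
    }
}
```

A Matcher is not thread-safe, which is why the sketch creates a new one on each call instead of caching it alongside the pattern.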

Note also that using + instead of * avoids replacing empty strings and therefore might also speed up the process.
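
To see how much extra work the * version does, one can count the matches each pattern produces on the same input. This is only an illustrative sketch; the class MatchCounter and its countMatches method are hypothetical helpers, not part of the JDK.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for comparing the two patterns.
final class MatchCounter {

    // Counts how many times the regex matches, zero-length matches included.
    static int countMatches(String regex, String input) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String input = "a, b";
        // One real match (", ") plus empty matches before 'a', before 'b', and at the end.
        System.out.println(countMatches("[,. ]*", input)); // prints 4
        // Only the real match.
        System.out.println(countMatches("[,. ]+", input)); // prints 1
    }
}
```

With *, the pattern also matches the empty string at positions that hold no separator, and replaceAll performs a replacement for each of those matches, so the number of replacements grows with the length of the input rather than with the number of separator runs.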
