String.replaceAll is considerably slower than doing the job yourself
Question
I have an old piece of code that performs find and replace of tokens within a string.
It receives a map of from and to pairs and iterates over them. For each pair, it iterates over the target string, looks for the from token using indexOf(), and replaces it with the value of to. It does all the work on a StringBuffer and eventually returns a String.
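For readers who want to picture the approach described above, here is a minimal sketch of such a manual replacer. It is a reconstruction under assumptions, not the original code: the class name ManualTokenReplacer and the method replaceTokens are hypothetical, and it uses StringBuilder where the original used StringBuffer.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical reconstruction of the approach described in the question.
final class ManualTokenReplacer {

    // Replaces every occurrence of each "from" key with its "to" value.
    static String replaceTokens(String target, Map<String, String> tokens) {
        // Assumption: StringBuilder instead of the StringBuffer mentioned in the question.
        StringBuilder sb = new StringBuilder(target);
        for (Map.Entry<String, String> pair : tokens.entrySet()) {
            String from = pair.getKey();
            String to = pair.getValue();
            if (from.isEmpty()) {
                continue; // an empty token would match at every position
            }
            int idx = sb.indexOf(from);
            while (idx != -1) {
                sb.replace(idx, idx + from.length(), to);
                // Resume after the inserted text so the replacement is not rescanned.
                idx = sb.indexOf(from, idx + to.length());
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> tokens = new LinkedHashMap<>();
        tokens.put(",", "");
        tokens.put(".", "");
        tokens.put(" ", "");
        System.out.println(replaceTokens("a, b. c", tokens)); // prints "abc"
    }
}
```

No regular expression is involved here, so there is no pattern compilation and no matcher state; each pass is a plain substring search.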
I replaced that code with this line:
replaceAll("[,. ]*", "");
And I ran some comparative performance tests.
When comparing over 1,000,000 iterations, I got this:
Old Code: 1287ms
New Code: 4605ms
3 times longer!
I then tried replacing it with 3 calls to replace:
replace(",", "");
replace(".", "");
replace(" ", "");
This gave the following results:
Old Code: 1295
New Code: 3524
2 times longer!
Any idea why replace and replaceAll are so inefficient? Can I do something to make them faster?
Thanks for all the answers - the main problem was indeed that [,. ]* did not do what I wanted it to do. Changing it to [,. ]+ almost equaled the performance of the non-regex solution.
Using a pre-compiled regex helped, but only marginally. (It is a solution very applicable to my problem.)
Test code:
Replace string with regex: [,. ]*
Replace string with regex: [,. ]+
Replace string with regex: [,. ]+ and a pre-compiled pattern
Recommended answer
While using regular expressions does carry some performance cost, it should not be this bad.
Note that String.replaceAll() compiles the regular expression every time you call it.
You can avoid that by explicitly using a Pattern object:
Pattern p = Pattern.compile("[,. ]+");
// repeat only the following part:
String output = p.matcher(input).replaceAll("");
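One common way to arrange this is to keep the compiled Pattern in a static final field so that it is built once per class load. The sketch below assumes that layout; the class name PunctuationStripper and the method strip are hypothetical and only illustrate the idea.

```java
import java.util.regex.Pattern;

// Hypothetical helper: compiles the pattern once and reuses it for every call.
final class PunctuationStripper {

    // Pattern instances are immutable and safe to share between threads.
    private static final Pattern SEPARATORS = Pattern.compile("[,. ]+");

    // Removes every run of commas, periods and spaces from the input.
    static String strip(String input) {
        // A new Matcher is created per call; only the compiled Pattern is reused.
        return SEPARATORS.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(strip("a, b. c")); // prints "abc"
    }
}
```

The result is the same as calling input.replaceAll("[,. ]+", ""); only the repeated compilation is avoided. As the asker's follow-up note reports, the gain from this alone was marginal in their test.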
Note also that using + instead of * avoids replacing empty strings and therefore might also speed up the process.
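To see why the quantifier matters, the following sketch counts how many matches each pattern produces on the same input. The class name EmptyMatchDemo and the method countMatches are hypothetical, chosen only for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo: compares how often "[,. ]*" and "[,. ]+" match.
final class EmptyMatchDemo {

    // Counts all matches of the regex in the input, including empty ones.
    static int countMatches(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String input = "a, b";
        // The star version also matches the empty string at positions
        // where none of the listed characters is present.
        System.out.println("[,. ]* matches: " + countMatches("[,. ]*", input));
        System.out.println("[,. ]+ matches: " + countMatches("[,. ]+", input));
    }
}
```

Because the replacement is an empty string, both patterns yield the same output text; the star version simply performs extra replacements of nothing with nothing, which is wasted work.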