Java regular expressions: performance and alternative


Question

Recently I had to search a number of string values to see which ones match a certain pattern. Neither the number of string values nor the pattern itself is known until the user enters a search term. The problem is that each time my application runs the following line:

    if (stringValue.matches(regexPattern))
    {
        // do something so simple
    }

it takes about 40 microseconds. Needless to say, once the number of string values exceeds a few thousand, it becomes too slow.

The pattern is something like:

    "A*B*C*D*E*F*"

where A~F are just examples, but the pattern is something like the above. Please note that the pattern actually changes with every search. For example, "A*B*C*" may change to "W*D*G*A*".

I wonder if there is a better substitute for the above pattern or, more generally, an alternative to Java regular expressions.

Recommended answer

Regular expressions in Java are compiled into an internal data structure, and this compilation is the time-consuming step. Each time you invoke String.matches(String regex), the specified regular expression is compiled again.
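To make the cost concrete: the Javadoc for String.matches says it yields the same result as Pattern.matches(regex, str), which in turn behaves like Pattern.compile(regex).matcher(input).matches(). The sketch below spells that expansion out. The class name MatchesExpansion and the helper matchesLikeString are made up for illustration and are not part of the JDK.

```java
import java.util.regex.Pattern;

class MatchesExpansion {
    // Hypothetical helper: what a single call to value.matches(regex) amounts to.
    // A new Pattern is compiled on every call and then thrown away.
    static boolean matchesLikeString(String value, String regex) {
        return Pattern.compile(regex).matcher(value).matches();
    }

    public static void main(String[] args) {
        String regex = "A*B*C*";
        String value = "AABCC";
        // Both lines print the same boolean.
        System.out.println(value.matches(regex));
        System.out.println(matchesLikeString(value, regex));
    }
}
```

Calling this in a loop over thousands of values repeats the compile step thousands of times, even though the regex is the same for the whole search.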

So you should compile your regular expression only once and reuse it:

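    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    // Compile once, outside the loop, and reuse the Pattern for every value.
    Pattern pattern = Pattern.compile(regexPattern);
    for (String value : values) {
        Matcher matcher = pattern.matcher(value);
        if (matcher.matches()) {
            // your code here
        }
    }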
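For a version that can be run as-is, here is a minimal sketch that wraps the same idea in a method. The names PrecompiledFilter and filter and the sample values are assumptions for illustration only. It also reuses a single Matcher through Matcher.reset(CharSequence) instead of creating one per value, which is an optional extra on top of the advice above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PrecompiledFilter {
    // Returns the values that match the regex in full; the regex is compiled once per call.
    static List<String> filter(List<String> values, String regexPattern) {
        Pattern pattern = Pattern.compile(regexPattern);
        Matcher matcher = pattern.matcher("");   // one Matcher, re-pointed at each value
        List<String> hits = new ArrayList<>();
        for (String value : values) {
            if (matcher.reset(value).matches()) {
                hits.add(value);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> values = Arrays.asList("AABCC", "ACB", "BBB", "XYZ");
        System.out.println(filter(values, "A*B*C*"));   // prints [AABCC, BBB]
    }
}
```

Because the pattern changes with every search, the natural place to compile it is once per search, right after the user enters the term. A Pattern can be shared between threads, but a Matcher cannot, so the reused Matcher here assumes the loop runs on a single thread.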

