java.util.regex - importance of Pattern.compile()?

Question

What is the importance of Pattern.compile() method?
Why do I need to compile the regex string before getting the Matcher object?

For example:

String regex = "(\\S+)\\s*some\\s*"; // parentheses balanced so the pattern compiles

Pattern pattern = Pattern.compile(regex); // why do I need to compile
Matcher matcher = pattern.matcher(text);


Recommended answer

The compile() method is always called at some point; it's the only way to create a Pattern object. So the question is really, why should you call it explicitly? One reason is that you need a reference to the Matcher object so you can use its methods, like group(int) to retrieve the contents of capturing groups. The only way to get ahold of the Matcher object is through the Pattern object's matcher() method, and the only way to get ahold of the Pattern object is through the compile() method. Then there's the find() method which, unlike matches(), is not duplicated in the String or Pattern classes.
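As a minimal sketch of that first reason, the following uses find() and group(int) on an explicitly created Matcher. The class name, method name, and sample text are hypothetical, and the pattern is an assumed, balanced form of the one in the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; not part of the original answer.
final class GroupDemo {
    // Assumed pattern: a run of non-whitespace followed by the word "some".
    private static final Pattern WORD_BEFORE_SOME =
            Pattern.compile("(\\S+)\\s*some\\s*");

    // Collects the contents of capturing group 1 for every match in the text.
    static List<String> wordsBeforeSome(String text) {
        List<String> words = new ArrayList<>();
        Matcher matcher = WORD_BEFORE_SOME.matcher(text);
        while (matcher.find()) {           // find() advances to the next match
            words.add(matcher.group(1));   // group(int) is called on the Matcher
        }
        return words;
    }

    public static void main(String[] args) {
        // prints [get, eat]
        System.out.println(wordsBeforeSome("get some rest, eat some food"));
    }
}
```

Neither find() nor group(int) is reachable through String#matches(String), which is why the explicit Pattern and Matcher are needed here.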

The other reason is to avoid creating the same Pattern object over and over. Every time you use one of the regex-powered methods in String (or the static matches() method in Pattern), it creates a new Pattern and a new Matcher. So this code snippet:

for (String s : myStringList) {
    if ( s.matches("\\d+") ) {
        doSomething();
    }
}

...is exactly equivalent to this:

for (String s : myStringList) {
    if ( Pattern.compile("\\d+").matcher(s).matches() ) {
        doSomething();
    }
}

Obviously, that's doing a lot of unnecessary work. In fact, it can easily take longer to compile the regex and instantiate the Pattern object than it does to perform an actual match. So it usually makes sense to pull that step out of the loop. You can create the Matcher ahead of time as well, though Matchers are not nearly as expensive to create:

Pattern p = Pattern.compile("\\d+");
Matcher m = p.matcher("");
for (String s : myStringList) {
    if ( m.reset(s).matches() ) {
        doSomething();
    }
}
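One common way to apply this, sketched here with hypothetical class and method names, is to keep the compiled Pattern in a static final field so that it is compiled once when the class is loaded. Pattern instances are immutable and safe to share between threads, whereas Matcher instances are not, so this sketch keeps the reused Matcher local to the method.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the names are illustrative only.
final class DigitCounter {
    // Compiled once and shared; Pattern is immutable and thread-safe.
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // Counts the strings that consist entirely of one or more digits.
    static int countAllDigitStrings(List<String> strings) {
        int count = 0;
        Matcher m = DIGITS.matcher("");    // one Matcher, reused via reset()
        for (String s : strings) {
            if (m.reset(s).matches()) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // prints 2
        System.out.println(countAllDigitStrings(Arrays.asList("123", "abc", "45x", "7")));
    }
}
```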

If you're familiar with .NET regexes, you may be wondering if Java's compile() method is related to .NET's RegexOptions.Compiled modifier; the answer is no. Java's Pattern.compile() method is merely equivalent to .NET's Regex constructor. When you specify the Compiled option:

Regex r = new Regex(@"\d+", RegexOptions.Compiled); 

...it compiles the regex directly to CIL byte code, allowing it to perform much faster, but at a significant cost in up-front processing and memory use--think of it as steroids for regexes. Java has no equivalent; there's no difference between a Pattern that's created behind the scenes by String#matches(String) and one you create explicitly with Pattern#compile(String).
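On the Java side, a small sketch (with a hypothetical class and method name) can check that the three routes mentioned in this answer give the same verdict for a given regex and input. It only compares results; it does not measure how much work each route does.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; not part of the original answer.
final class EquivalenceDemo {
    // Returns true when all three ways of matching agree on the result.
    static boolean allAgree(String regex, String input) {
        boolean viaString = input.matches(regex);                 // String#matches(String)
        boolean viaStatic = Pattern.matches(regex, input);        // static Pattern.matches
        boolean viaCompile = Pattern.compile(regex).matcher(input).matches();
        return viaString == viaStatic && viaStatic == viaCompile;
    }

    public static void main(String[] args) {
        System.out.println(allAgree("\\d+", "12345")); // prints true
        System.out.println(allAgree("\\d+", "12a"));   // prints true
    }
}
```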

(Edit: I originally said that all .NET Regex objects are cached, which is incorrect. Since .NET 2.0, automatic caching occurs only with static methods like Regex.Matches(), not when you call a Regex constructor directly. ref)
