Java Regexp: UNGREEDY flag

Question

I'd like to port a generic text processing tool, Texy!, from PHP to Java.

This tool does ungreedy matching everywhere, using preg_match_all("/.../U"), so I am looking for a Java regex library that has an UNGREEDY flag.

I know I could use the .*? syntax, but there are a great many regular expressions I would have to rewrite, and I would have to re-check them against every new release of Texy!.
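For reference, the difference between the greedy and lazy forms in stock java.util.regex can be sketched as follows. The class and method names are illustrative only and are not part of Texy! or of any library.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative demo class; the name is an assumption, not an existing API.
class GreedyVsLazyDemo {

    // Returns the first match of regex in input, or null if there is none.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String input = "<b>one</b> and <b>two</b>";

        // Greedy (Java's default): .* extends to the last </b> it can reach.
        System.out.println(firstMatch("<b>.*</b>", input));  // <b>one</b> and <b>two</b>

        // Lazy: .*? stops at the first </b>, which is what /U makes the default in PHP.
        System.out.println(firstMatch("<b>.*?</b>", input)); // <b>one</b>
    }
}
```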

I have already checked:

  • ORO - seems to be abandoned
  • Jakarta Regexp - no support
  • java.util.regex - no support

Is there any such library?

Thanks, Ondra

Recommended answer

I suggest you create your own modified Java library. Simply copy the java.util.regex source into your own package.

The Sun JDK 1.6 Pattern.java class offers these default flags:

    static final int GREEDY     = 0;
    static final int LAZY       = 1;
    static final int POSSESSIVE = 2;

You'll notice that these flags are used in only a few places, so the change is trivial to make. Take the following example:

    case '*':
        ch = next();
        if (ch == '?') {
            next();
            return new Curly(prev, 0, MAX_REPS, LAZY);
        } else if (ch == '+') {
            next();
            return new Curly(prev, 0, MAX_REPS, POSSESSIVE);
        }
        return new Curly(prev, 0, MAX_REPS, GREEDY);

Simply change the last line to use the LAZY flag instead of the GREEDY flag. The other quantifiers (+, ? and {n,m}) would need the same change. Since you want a regex library that behaves like the PHP one, this might be the best way to go.
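One detail worth keeping in mind: PCRE's U modifier inverts greediness rather than just changing the default, so a quantifier followed by ? becomes greedy again. The sketch below models that decision in isolation. It is a simplified, hypothetical stand-in for the parser branch shown above, not the JDK source; the class name, the mode method and the ungreedy parameter are all assumptions made for illustration.

```java
// Hypothetical model of how a quantifier's mode could be chosen
// when emulating PCRE's /U modifier. Not taken from the JDK.
class QuantifierModeSketch {
    static final int GREEDY     = 0;
    static final int LAZY       = 1;
    static final int POSSESSIVE = 2;

    // suffix:   the character that follows the quantifier ('?', '+', or anything else)
    // ungreedy: assumed switch that emulates /U when true
    static int mode(char suffix, boolean ungreedy) {
        if (suffix == '?') {
            // "*?" is lazy normally, but greedy under /U.
            return ungreedy ? GREEDY : LAZY;
        }
        if (suffix == '+') {
            // Possessive quantifiers are unaffected by /U.
            return POSSESSIVE;
        }
        // A bare quantifier is greedy normally, but lazy under /U.
        return ungreedy ? LAZY : GREEDY;
    }

    public static void main(String[] args) {
        System.out.println(mode('x', false)); // 0 (GREEDY)
        System.out.println(mode('x', true));  // 1 (LAZY)
        System.out.println(mode('?', true));  // 0 (GREEDY)
        System.out.println(mode('+', true));  // 2 (POSSESSIVE)
    }
}
```

In a patched copy of Pattern.java, this would correspond to swapping GREEDY and LAZY in both the bare and the ? branches of each quantifier case, while leaving the POSSESSIVE branch alone.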
