Does -* have any special meaning in a regular expression?

Problem description

I have a string:

String str = "Hello+Bye-see*Go/ok";

Now, I wanted to split based on +, -, * and /. So I did:

str.split("[+-*/]");

But this failed, throwing an error:

Exception in thread "main" java.util.regex.PatternSyntaxException: Illegal character range near index 3
[+-*/]
   ^
    at java.util.regex.Pattern.error(Pattern.java:1924)
    at java.util.regex.Pattern.range(Pattern.java:2594)
    at java.util.regex.Pattern.clazz(Pattern.java:2507)
    at java.util.regex.Pattern.sequence(Pattern.java:2030)
    at java.util.regex.Pattern.expr(Pattern.java:1964)
    at java.util.regex.Pattern.compile(Pattern.java:1665)
    at java.util.regex.Pattern.<init>(Pattern.java:1337)
    at java.util.regex.Pattern.compile(Pattern.java:1022)
    at java.lang.String.split(String.java:2313)
    at java.lang.String.split(String.java:2355)

Then I changed the regex to:

str.split("[-+*/]");

And it works perfectly fine! So I was wondering whether -* has any special meaning. What did I do wrong in the regex [+-*/]?

Recommended answer

A. Where is the error?

The problem is not -*. The problem is that inside a [character class], the hyphen - has a special meaning: it denotes a range. For instance, [a-z] means all characters from a to z. Therefore, when your character class contains +-*, the engine reads it as a range from + (ASCII 43) to * (ASCII 42). The start of that range is higher than its end, so the range is invalid, hence the error. Technically, as @Pshemo writes in a comment, Java orders characters by their position in the Unicode table rather than by ASCII code. But since the first 128 Unicode code points are the same as ASCII, the result is the same.
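To make the ordering argument concrete, here is a minimal sketch that prints the two code points and checks which patterns compile. The class name RangeCheck and the helper compiles are illustrative names, not part of the original question.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class RangeCheck {
    // Hypothetical helper: true if the regex compiles, false on a syntax error.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println((int) '+');          // 43
        System.out.println((int) '*');          // 42
        // "+-*" is read as a range from 43 down to 42, which is illegal.
        System.out.println(compiles("[+-*/]")); // false
        // With the hyphen first, it is a literal and no range is formed.
        System.out.println(compiles("[-+*/]")); // true
    }
}
```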

You need to either escape the hyphen like so \- (written "\\-" inside a Java string literal) or, as you have observed, put the - at the front (or back) of your class, where it does not indicate a character range:

[-+*/]

Therefore, in a split (using the "at the back" version for variety):

String[] result = your_original_string.split("[+*/-]");
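As a quick check, the sketch below runs the three equivalent forms (hyphen first, hyphen last, hyphen escaped) against the string from the question. The class and method names are made up for illustration.

```java
import java.util.Arrays;

class SplitOnOperators {
    // Hypothetical helper: splits the input using the given character class.
    static String[] splitWith(String input, String regex) {
        return input.split(regex);
    }

    public static void main(String[] args) {
        String str = "Hello+Bye-see*Go/ok";
        // Hyphen at the front, at the back, and escaped: all treat - literally.
        String[] patterns = { "[-+*/]", "[+*/-]", "[+\\-*/]" };
        for (String p : patterns) {
            System.out.println(p + " -> " + Arrays.toString(splitWith(str, p)));
        }
        // Each line ends with [Hello, Bye, see, Go, ok]
    }
}
```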

B. But [*-+] would be valid!!! (ASCII 42 to 43)

If you reverse the + and the *, you have a valid ASCII range (42 to 43). Of course there's no point doing so, since (i) there are no characters in the middle and (ii) that would confuse my dog.
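For the curious, a small sketch showing that [*-+] compiles and matches exactly * and +, while a literal hyphen in the input is left alone. The names here are illustrative only.

```java
import java.util.Arrays;

class TwoCharRange {
    // Hypothetical helper: splits on the range from * (42) to + (43).
    static String[] splitOnStarToPlus(String input) {
        return input.split("[*-+]");
    }

    public static void main(String[] args) {
        // The hyphen in the input is not a delimiter: inside the class it
        // only marked the range, it was never a member of the class.
        System.out.println(Arrays.toString(splitOnStarToPlus("a*b+c-d")));
        // prints [a, b, c-d]
    }
}
```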

C. Does -* have special meaning?

It does, but not in a character class. Outside a character class, * is a quantifier, so -* means: match a hyphen, zero or more times.
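A minimal sketch of that behaviour, using String.matches, which requires the whole string to match. The class and method names are assumptions made for the example.

```java
class HyphenStar {
    // Hypothetical helper: true if the string consists of zero or more hyphens.
    static boolean isOnlyHyphens(String s) {
        return s.matches("-*");
    }

    public static void main(String[] args) {
        System.out.println(isOnlyHyphens("---")); // true
        System.out.println(isOnlyHyphens(""));    // true (zero hyphens)
        System.out.println(isOnlyHyphens("-a"));  // false
    }
}
```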
