Java case-insensitive regex matching doesn't work with letter Ñ


Question

Consider this program:

import java.util.regex.Pattern;
public class xx {

    /*
     *  Ñ
     *  LATIN CAPITAL LETTER N WITH TILDE
     *  Unicode: U+00D1, UTF-8: C3 91
     */
    public static final String BIG_N = "\u00d1";

    /*
     *  ñ
     *  LATIN SMALL LETTER N WITH TILDE
     *  Unicode: U+00F1, UTF-8: C3 B1
     */
    public static final String LITTLE_N = "\u00f1";

    public static void main(String[] args) throws Exception {
        System.out.println(BIG_N.equalsIgnoreCase(LITTLE_N));
        System.out.println(Pattern.compile(BIG_N, Pattern.CASE_INSENSITIVE).matcher(LITTLE_N).matches());
    }
}

Since Ñ is the upper-case version of ñ, you would expect it to print:

true
true

But what it actually prints (Java 1.7.0_17-b02) is:

true
false

Why?

Recommended answer

By default, case-insensitive matching assumes that only characters in the US-ASCII charset are being matched. Unicode-aware case-insensitive matching can be enabled by specifying the UNICODE_CASE flag in conjunction with this flag.

http://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html#CASE_INSENSITIVE

And for completeness: you combine the flags with bitwise OR (|).

Pattern.compile(BIG_N, Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE)
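
To see the difference side by side, here is a minimal sketch that compiles the same pattern three ways. The class name UnicodeCaseDemo and its helper methods are made up for this illustration. The third variant assumes you would rather put the flags inside the pattern itself, using the embedded flag expressions (?i) and (?u) that the Pattern documentation lists for CASE_INSENSITIVE and UNICODE_CASE.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative and not from the original post.
public class UnicodeCaseDemo {

    // CASE_INSENSITIVE alone: case folding is limited to US-ASCII characters.
    static boolean matchesAsciiOnly(String pattern, String input) {
        return Pattern.compile(pattern, Pattern.CASE_INSENSITIVE)
                .matcher(input).matches();
    }

    // CASE_INSENSITIVE | UNICODE_CASE: case folding follows Unicode rules.
    static boolean matchesUnicodeAware(String pattern, String input) {
        return Pattern.compile(pattern, Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE)
                .matcher(input).matches();
    }

    // Same two flags, written as embedded flag expressions in the pattern text.
    static boolean matchesEmbeddedFlags(String pattern, String input) {
        return Pattern.compile("(?iu)" + pattern)
                .matcher(input).matches();
    }

    public static void main(String[] args) {
        String bigN = "\u00d1";    // Ñ
        String littleN = "\u00f1"; // ñ

        System.out.println(matchesAsciiOnly(bigN, littleN));     // false
        System.out.println(matchesUnicodeAware(bigN, littleN));  // true
        System.out.println(matchesEmbeddedFlags(bigN, littleN)); // true
    }
}
```

The embedded form can be handy when the pattern comes from configuration and you cannot pass flags to Pattern.compile directly.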
