"Unmappable character for encoding UTF-8" error

Problem description

I'm getting a compile error in the following method.

public static boolean isValidPasswd(String passwd) {
    String reg = "^(?=.*[0-9])(?=.*[a-z])(?=.*[A-Z])(?=.*[~#;:?/@&!\"'%*=¬.,-])(?=[^\\s]+$).{8,24}$";
    return Pattern.matches(reg, passwd);
}

at Utility.java:[76,74] unmappable character for encoding UTF-8. The 74th character is ' " '

How can I fix this? Thanks.

Solution

Your source file has an encoding problem. It may be ISO-8859-1 encoded while the compiler is set to read it as UTF-8. That causes errors for any character whose byte representation differs between UTF-8 and ISO-8859-1, which means every character outside ASCII, for example ¬ (NOT SIGN).
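
To make the byte-level difference concrete, here is a minimal sketch that prints the bytes of ¬ in both encodings. The class name `EncodingBytes` and the helper `hex` are made up for this illustration; only the standard `String.getBytes(Charset)` and `StandardCharsets` APIs are used.

```java
import java.nio.charset.StandardCharsets;

public class EncodingBytes {
    // Hypothetical helper: formats a byte array as space-separated two-digit hex values.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // NOT SIGN written as a Unicode escape so this file itself stays pure ASCII.
        String notSign = "\u00AC";
        String letter = "a";

        // One byte in ISO-8859-1, two bytes in UTF-8.
        System.out.println("ISO-8859-1: " + hex(notSign.getBytes(StandardCharsets.ISO_8859_1))); // ac
        System.out.println("UTF-8:      " + hex(notSign.getBytes(StandardCharsets.UTF_8)));      // c2 ac

        // An ASCII character is the same single byte in both encodings.
        System.out.println("ASCII 'a':  " + hex(letter.getBytes(StandardCharsets.ISO_8859_1))
                + " / " + hex(letter.getBytes(StandardCharsets.UTF_8)));                         // 61 / 61
    }
}
```

A lone byte `ac` is not a valid UTF-8 sequence, so a compiler reading the file as UTF-8 cannot map it to a character.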

You can simulate this with the following program. It takes your line of source code, encodes it as an ISO-8859-1 byte array, and then decodes those bytes "wrongly" as UTF-8. You can see at which position the line gets corrupted. I added 2 spaces in front of your source line so that position 74 lands on ¬ (NOT SIGN), which is the only character in the line whose bytes differ between ISO-8859-1 and UTF-8. I guess this matches the indentation in the real source file.

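 // One line of the source file, including its indentation, held as a string.
 String reg = "      String reg = \"^(?=.*[0-9])(?=.*[a-z])(?=.*[A-Z])(?=.*[~#;:?/@&!\"'%*=¬.,-])(?=[^\\s]+$).{8,24}$\";";
 // Encode as ISO-8859-1, then decode the same bytes as UTF-8, as the compiler would.
 String corrupt = new String(reg.getBytes("ISO-8859-1"), "UTF-8");
 System.out.println(corrupt + ": " + corrupt.charAt(74));
 System.out.println(reg + ": " + reg.charAt(74));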

This results in the following output:

      String reg = "^(?=.*[0-9])(?=.*[a-z])(?=.*[A-Z])(?=.*[~#;:?/@&!"'%*=�.,-])(?=[^\s]+$).{8,24}$";: �

      String reg = "^(?=.*[0-9])(?=.*[a-z])(?=.*[A-Z])(?=.*[~#;:?/@&!"'%*=¬.,-])(?=[^\s]+$).{8,24}$";: ¬

See "live" at https://ideone.com/ShZnB

To fix this, save the source files with UTF-8 encoding.
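
Most editors and IDEs can re-save a file in a different encoding. If you would rather do it programmatically, the following is a rough sketch, assuming the file really is ISO-8859-1 as guessed above. The class name `ReencodeSource` and its methods are hypothetical names for this example.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class ReencodeSource {
    // Interprets the given bytes as ISO-8859-1 text and returns the same text as UTF-8 bytes.
    static byte[] latin1ToUtf8(byte[] raw) {
        String text = new String(raw, StandardCharsets.ISO_8859_1);
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // Rewrites one file in place. Only run this on a file that really is ISO-8859-1:
    // running it on a file that is already UTF-8 would double-encode its non-ASCII characters.
    static void convertFile(Path file) throws IOException {
        Files.write(file, latin1ToUtf8(Files.readAllBytes(file)));
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java ReencodeSource <file>");
            return;
        }
        convertFile(Paths.get(args[0]));
    }
}
```

Alternatively, you can leave the file as it is and tell the compiler which encoding to expect, for example `javac -encoding ISO-8859-1 Utility.java`.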
