"Unmappable character for encoding UTF-8" error
I'm getting a compile error in the following method.
public static boolean isValidPasswd(String passwd) {
    String reg = "^(?=.*[0-9])(?=.*[a-z])(?=.*[A-Z])(?=.*[~#;:?/@&!\"'%*=¬.,-])(?=[^\\s]+$).{8,24}$";
    return Pattern.matches(reg, passwd);
}
at Utility.java:[76,74] unmappable character for encoding UTF-8. The 74th character is ' " '.
How can I fix this? Thanks.
You have an encoding problem with your source file. It may be ISO-8859-1 encoded, while the compiler is set to use UTF-8. This causes errors for any character that does not have the same byte representation in UTF-8 and ISO-8859-1, which is the case for every character outside ASCII, for example ¬ (NOT SIGN).
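To make the byte-level difference concrete, here is a minimal sketch that prints how the NOT SIGN is encoded in each charset. The class name `NotSignBytes` and the helper `hex` are made up for this illustration; the character is written as the escape `\u00AC` so that the example itself does not depend on how its file is saved.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class NotSignBytes {
    // Hypothetical helper: formats bytes as space-separated uppercase hex.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String notSign = "\u00AC"; // NOT SIGN, escaped to keep this file pure ASCII

        // One byte in ISO-8859-1, two bytes in UTF-8.
        System.out.println("ISO-8859-1: " + hex(notSign.getBytes(StandardCharsets.ISO_8859_1)));
        System.out.println("UTF-8:      " + hex(notSign.getBytes(StandardCharsets.UTF_8)));

        // An ASCII character such as '=' is encoded identically in both charsets.
        boolean sameForAscii = Arrays.equals(
                "=".getBytes(StandardCharsets.ISO_8859_1),
                "=".getBytes(StandardCharsets.UTF_8));
        System.out.println("'=' identical in both: " + sameForAscii);
    }
}
```

The single ISO-8859-1 byte is `AC`, while UTF-8 uses `C2 AC`. A lone `AC` byte is not a valid UTF-8 sequence, which is why a compiler reading the file as UTF-8 reports the character as unmappable.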
You can simulate this with the following program. It takes your line of source code, encodes it as an ISO-8859-1 byte array, and then decodes those bytes "wrongly" as UTF-8. You can see at which position the line gets corrupted. I added 2 spaces to your source line so that position 74 falls on ¬ (NOT SIGN), which is the only character in the line that produces different bytes in ISO-8859-1 and UTF-8. I guess this matches the indentation of the real source file.
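// The literal holds the source line as text, indented so that ¬ sits at index 74.
String reg = "    String reg = \"^(?=.*[0-9])(?=.*[a-z])(?=.*[A-Z])(?=.*[~#;:?/@&!\\\"'%*=¬.,-])(?=[^\\\\s]+$).{8,24}$\";";
// Encode as ISO-8859-1, then decode the same bytes "wrongly" as UTF-8.
String corrupt = new String(reg.getBytes("ISO-8859-1"), "UTF-8");
System.out.println(corrupt + ": " + corrupt.charAt(74));
System.out.println(reg + ": " + reg.charAt(74));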
This results in the following output (messed up because of markup):
String reg = "^(?=.[0-9])(?=.[a-z])(?=.[A-Z])(?=.[~#;:?/@&!"'%*=�.,-])(?=[^s]+$).{8,24}$";: �
String reg = "^(?=.[0-9])(?=.[a-z])(?=.[A-Z])(?=.[~#;:?/@&!"'%*=¬.,-])(?=[^s]+$).{8,24}$";: ¬
See "live" at https://ideone.com/ShZnB
To fix this, save the source files with UTF-8 encoding.
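If your editor or IDE cannot re-save the file in a different encoding, the conversion can also be scripted. The following is only a sketch under the assumption that the file really is ISO-8859-1 encoded; the class name `ReencodeSource` and the method `latin1ToUtf8` are made up for this example. Back up the file first, because running this on a file that is already UTF-8 would garble its non-ASCII characters.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class ReencodeSource {
    // Reads the file as ISO-8859-1 and rewrites it in place as UTF-8.
    // Assumption: the input really is ISO-8859-1 (this is not detected).
    static void latin1ToUtf8(Path file) throws IOException {
        byte[] raw = Files.readAllBytes(file);
        String text = new String(raw, StandardCharsets.ISO_8859_1);
        Files.write(file, text.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        // Usage: java ReencodeSource path/to/Utility.java
        latin1ToUtf8(Paths.get(args[0]));
    }
}
```

Alternatively, you can leave the file as it is and tell the compiler which encoding it uses, for example `javac -encoding ISO-8859-1 Utility.java`. Converting to UTF-8 is usually the better long-term choice if the rest of the build already expects UTF-8.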