"Unmappable character for encoding UTF-8" error


Problem description

I'm getting a compile error in the following method.

public static boolean isValidPasswd(String passwd) {
    String reg = "^(?=.*[0-9])(?=.*[a-z])(?=.*[A-Z])(?=.*[~#;:?/@&!\"'%*=¬.,-])(?=[^\\s]+$).{8,24}$";
    return Pattern.matches(reg, passwd);
}

at Utility.java:[76,74] unmappable character for encoding UTF-8
The 74th character is ' " '.

How can I fix this? Thanks.

Solution

Your source file has an encoding problem. It may be saved as ISO-8859-1 while the compiler is set to read UTF-8. That causes errors for any character whose byte representation differs between UTF-8 and ISO-8859-1, which is every character outside ASCII, for example ¬ (NOT SIGN).
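To make the byte difference concrete, here is a minimal sketch that prints the bytes of ¬ in both encodings. The class name `NotSignBytes` and the helper `hex` are made up for illustration; the character is written as the escape `\u00AC` so the example itself stays pure ASCII and compiles under any source encoding.

```java
import java.nio.charset.StandardCharsets;

public class NotSignBytes {
    // Formats a byte array as space-separated uppercase hex, e.g. "C2 AC".
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String notSign = "\u00AC"; // NOT SIGN, escaped so this file contains only ASCII

        // One byte in ISO-8859-1, two bytes in UTF-8.
        System.out.println("ISO-8859-1: " + hex(notSign.getBytes(StandardCharsets.ISO_8859_1)));
        System.out.println("UTF-8:      " + hex(notSign.getBytes(StandardCharsets.UTF_8)));

        // An ASCII character is the same single byte in both encodings.
        System.out.println("'=' ISO-8859-1: " + hex("=".getBytes(StandardCharsets.ISO_8859_1)));
        System.out.println("'=' UTF-8:      " + hex("=".getBytes(StandardCharsets.UTF_8)));
    }
}
```

In ISO-8859-1 the character is the single byte AC. A UTF-8 decoder treats a lone AC as a continuation byte with no lead byte before it, which is why the compiler reports it as unmappable.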

You can simulate this with the following program. It takes your line of source code, encodes it into an ISO-8859-1 byte array, and then decodes those bytes "wrongly" as UTF-8, so you can see at which position the line gets corrupted. I added 2 spaces to the start of your line so that position 74 lands on ¬ (NOT SIGN), the only character in the line that produces different bytes in ISO-8859-1 and UTF-8. I guess this matches the indentation in the real source file.

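 // requires: import java.nio.charset.StandardCharsets;
 String reg = "      String reg = \"^(?=.*[0-9])(?=.*[a-z])(?=.*[A-Z])(?=.*[~#;:?/@&!\"'%*=¬.,-])(?=[^\\s]+$).{8,24}$\";";
 // Encode as ISO-8859-1, then decode the same bytes as UTF-8 to reproduce the corruption.
 String corrupt = new String(reg.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8);
 System.out.println(corrupt + ": " + corrupt.charAt(74));
 System.out.println(reg + ": " + reg.charAt(74));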

which results in the following output:

      String reg = "^(?=.*[0-9])(?=.*[a-z])(?=.*[A-Z])(?=.*[~#;:?/@&!"'%*=�.,-])(?=[^\s]+$).{8,24}$";: �

      String reg = "^(?=.*[0-9])(?=.*[a-z])(?=.*[A-Z])(?=.*[~#;:?/@&!"'%*=¬.,-])(?=[^\s]+$).{8,24}$";: ¬
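The simulation above decodes leniently, so the bad byte silently becomes the replacement character. As a complementary sketch, a strict `CharsetDecoder` can report where the first invalid byte sits, which is roughly the information the compiler gives in its `[line,column]` message. The class name `FirstBadUtf8Byte`, the method `firstInvalidOffset`, and the sample string are hypothetical and only serve the illustration.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class FirstBadUtf8Byte {
    // Returns the offset of the first byte sequence that is not valid UTF-8,
    // or -1 if the whole array decodes cleanly.
    static int firstInvalidOffset(byte[] bytes) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        ByteBuffer in = ByteBuffer.wrap(bytes);
        // UTF-8 never decodes to more chars than it has bytes, so this is large enough.
        CharBuffer out = CharBuffer.allocate(bytes.length);
        CoderResult result = decoder.decode(in, out, true);
        // On an error the input buffer is positioned at the start of the bad sequence.
        return result.isError() ? in.position() : -1;
    }

    public static void main(String[] args) {
        String sample = "abc\u00ACdef"; // hypothetical sample containing NOT SIGN

        byte[] latin1 = sample.getBytes(StandardCharsets.ISO_8859_1);
        byte[] utf8 = sample.getBytes(StandardCharsets.UTF_8);

        System.out.println("ISO-8859-1 bytes read as UTF-8: " + firstInvalidOffset(latin1));
        System.out.println("UTF-8 bytes read as UTF-8:      " + firstInvalidOffset(utf8));
    }
}
```

The returned offset is a zero-based byte offset into the array you pass in, so for a whole file it would need to be translated into a line and column by hand.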

See "live" at https://ideone.com/ShZnB

To fix this, save the source files with UTF-8 encoding.
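Most editors and IDEs can re-save a file in a different encoding. If you would rather script it, the following is a minimal sketch, assuming the file really is ISO-8859-1 (the answer only guesses that). The class name `Latin1ToUtf8` and its methods are made up for illustration, and the source and target paths come from the command line.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class Latin1ToUtf8 {
    // Interprets the bytes as ISO-8859-1 text and re-encodes that text as UTF-8.
    static byte[] toUtf8(byte[] latin1Bytes) {
        String text = new String(latin1Bytes, StandardCharsets.ISO_8859_1);
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // Reads the source file as ISO-8859-1 and writes it to the target as UTF-8.
    static void reencode(Path source, Path target) throws IOException {
        Files.write(target, toUtf8(Files.readAllBytes(source)));
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java Latin1ToUtf8 <source-file> <target-file>");
            return;
        }
        reencode(Paths.get(args[0]), Paths.get(args[1]));
    }
}
```

Run this only once per file: applying it to a file that is already UTF-8 would double-encode every non-ASCII character. Two alternatives that avoid converting the file are telling the compiler the real encoding with javac's `-encoding` option, or writing the character in the source as the Unicode escape `\u00AC`, which is plain ASCII.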
