"Unmappable character for encoding UTF-8" error

Views: 1896 | Posted: 2017/8/16 19:25:20 | Tags: java maven-2 encoding utf-8


Problem description

I'm getting a compile error in the following method.

public static boolean isValidPasswd(String passwd) {
    String reg = "^(?=.*[0-9])(?=.*[a-z])(?=.*[A-Z])(?=.*[~#;:?/@&!\"'%*=¬.,-])(?=[^\\s]+$).{8,24}$";
    return Pattern.matches(reg, passwd);
}

at Utility.java:[76,74] unmappable character for encoding UTF-8. The 74th character is ' " '.

How can I fix this? Thanks.

Solution

Your source file has an encoding problem. It may be encoded as ISO-8859-1 while the compiler is set to read it as UTF-8. This causes errors for any character whose byte representation differs between UTF-8 and ISO-8859-1. That is the case for every character outside ASCII, for example ¬ (NOT SIGN).
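To make the byte difference concrete, here is a minimal sketch that prints the bytes of the NOT SIGN in both encodings. The class name `NotSignBytes` and the helper `hex` are made up for this illustration and are not part of the original question or answer.

```java
import java.nio.charset.StandardCharsets;

public class NotSignBytes {
    // Formats a byte array as space-separated uppercase hex, e.g. "C2 AC".
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // NOT SIGN (U+00AC), written as an escape so this file compiles
        // regardless of the encoding it is saved in.
        String notSign = "\u00AC";
        System.out.println("ISO-8859-1: " + hex(notSign.getBytes(StandardCharsets.ISO_8859_1)));
        System.out.println("UTF-8:      " + hex(notSign.getBytes(StandardCharsets.UTF_8)));
    }
}
```

ISO-8859-1 stores the character as the single byte AC, while UTF-8 stores it as the two bytes C2 AC. A lone AC byte is not a valid UTF-8 sequence, which is why a UTF-8 reader rejects it.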

You can simulate this with the following program. It takes your line of source code, encodes it as an ISO-8859-1 byte array, and then decodes those bytes "wrongly" as UTF-8. You can see the position at which the line gets corrupted. I added 2 spaces to your source line so that position 74 lands on ¬ (NOT SIGN), the only character in the line that produces different bytes in ISO-8859-1 and UTF-8. I guess this matches the indentation in the real source file.

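 // The source line as text, padded with spaces so that index 74 is the ¬ character.
 String reg = "      String reg = \"^(?=.*[0-9])(?=.*[a-z])(?=.*[A-Z])(?=.*[~#;:?/@&!\"'%*=¬.,-])(?=[^\\s]+$).{8,24}$\";";
 // Encode as ISO-8859-1, then decode those bytes (wrongly) as UTF-8.
 String corrupt = new String(reg.getBytes("ISO-8859-1"), "UTF-8");
 System.out.println(corrupt + ": " + corrupt.charAt(74));
 System.out.println(reg + ": " + reg.charAt(74));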

which results in the following output:

      String reg = "^(?=.*[0-9])(?=.*[a-z])(?=.*[A-Z])(?=.*[~#;:?/@&!"'%*=�.,-])(?=[^\s]+$).{8,24}$";: �
      String reg = "^(?=.*[0-9])(?=.*[a-z])(?=.*[A-Z])(?=.*[~#;:?/@&!"'%*=¬.,-])(?=[^\s]+$).{8,24}$";: ¬

See it running live at https://ideone.com/ShZnB
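If you want to locate the offending byte without a compiler run, one option is a strict UTF-8 decode that reports errors instead of substituting a replacement character. The sketch below is only an illustration of that idea: `Utf8Check` and `firstInvalidUtf8Offset` are hypothetical names, and it makes no claim about how the compiler itself performs the check.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class Utf8Check {
    // Returns the byte offset of the first sequence that is not valid UTF-8,
    // or -1 if the whole array decodes cleanly.
    static int firstInvalidUtf8Offset(byte[] bytes) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        ByteBuffer in = ByteBuffer.wrap(bytes);
        // UTF-8 never decodes to more chars than it has bytes.
        CharBuffer out = CharBuffer.allocate(bytes.length + 1);
        CoderResult result = decoder.decode(in, out, true);
        // On an error the input position is left at the start of the bad sequence.
        return result.isError() ? in.position() : -1;
    }

    public static void main(String[] args) {
        byte[] latin1 = "ab\u00ACc".getBytes(StandardCharsets.ISO_8859_1);
        byte[] utf8 = "ab\u00ACc".getBytes(StandardCharsets.UTF_8);
        System.out.println(firstInvalidUtf8Offset(latin1)); // offset of the lone 0xAC byte
        System.out.println(firstInvalidUtf8Offset(utf8));   // -1, valid UTF-8
    }
}
```

The offset is counted in bytes from the start of the array, so for a whole file you would still need to translate it into a line and column yourself.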

To fix this, save the source files with UTF-8 encoding.
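Most editors and IDEs can re-save a file in a different encoding. If you would rather do it programmatically, the following is a minimal sketch that assumes the file really is ISO-8859-1 at the moment. The class name `ReencodeToUtf8`, the method `latin1ToUtf8`, and the path in the usage comment are hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class ReencodeToUtf8 {
    // Reads the file as ISO-8859-1 (an assumption about its current encoding)
    // and rewrites it in place as UTF-8.
    static void latin1ToUtf8(Path file) throws IOException {
        byte[] raw = Files.readAllBytes(file);
        String text = new String(raw, StandardCharsets.ISO_8859_1);
        Files.write(file, text.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical usage: java ReencodeToUtf8 src/main/java/Utility.java
        latin1ToUtf8(Paths.get(args[0]));
    }
}
```

Run it only once per file and keep a backup: applying it to a file that is already UTF-8 would encode the non-ASCII characters a second time and corrupt them.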
