Reading Unicode characters in Java

Views: 183 | Published: 2017/10/26 21:20:39 | Tags: java, file, unicode


Question

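I'm a bit new to Java. When I assign a Unicode string to a variable and print it, and then read the same text from a file and print that, I get different output: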

  // String literal: the compiler resolves the Unicode escapes.
  String str = "\u0142o\u017Cy\u0142";
  System.out.println(str);

  // Read C:/a.txt line by line, decoding its bytes as UTF-8.
  final StringBuilder stringBuilder = new StringBuilder();
  InputStream inStream = new FileInputStream("C:/a.txt");
  final InputStreamReader streamReader = new InputStreamReader(inStream, "UTF-8");
  final BufferedReader bufferedReader = new BufferedReader(streamReader);
  String line = "";
  while ((line = bufferedReader.readLine()) != null) {
      System.out.println(line);
      stringBuilder.append(line);
  }

Why are the results different in the two cases? The file a.txt contains the same string, but when I print the contents of the file, it prints z\u0142o\u017Cy\u0142 instead of the actual Unicode characters. How can I get the file content printed the same way the string literal is printed?

Solution

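Your code should be correct, but I guess that the file "a.txt" does not contain the Unicode characters encoded as UTF-8, but rather the escaped string "\u0142o\u017Cy\u0142". The Java compiler translates \uXXXX escapes only in source code; a reader passes the backslash, the "u", and the hex digits through as ordinary characters.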
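To make the difference concrete, here is a minimal sketch. It compares a string literal whose escapes the compiler resolves with the text a reader would return for a file that holds the escapes literally. The class name EscapeDemo and the helper unescape are hypothetical and are not part of the original answer; the helper only handles four-digit \uXXXX sequences and is one possible workaround if the file cannot be fixed.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EscapeDemo {
    // Matches a literal backslash, the letter u, and exactly four hex digits.
    private static final Pattern UNICODE_ESCAPE = Pattern.compile("\\\\u([0-9a-fA-F]{4})");

    // Hypothetical helper: replaces each literal four-digit escape with the character it names.
    static String unescape(String text) {
        Matcher matcher = UNICODE_ESCAPE.matcher(text);
        StringBuilder result = new StringBuilder();
        int last = 0;
        while (matcher.find()) {
            result.append(text, last, matcher.start());
            result.append((char) Integer.parseInt(matcher.group(1), 16));
            last = matcher.end();
        }
        result.append(text, last, text.length());
        return result.toString();
    }

    public static void main(String[] args) {
        // Escapes resolved by the compiler: five characters.
        String compiled = "\u0142o\u017Cy\u0142";
        // What a reader returns when the file stores the escapes as plain text: twenty characters.
        String fromFile = "\\u0142o\\u017Cy\\u0142";

        System.out.println(compiled.length());                    // 5
        System.out.println(fromFile.length());                    // 20
        System.out.println(unescape(fromFile).equals(compiled));  // true
    }
}
```

If the file can be saved again as real UTF-8 text, that is the simpler fix and no unescaping step is needed.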

Please check whether the text file is correct, using a UTF-8-aware editor such as recent versions of Notepad or Notepad++ on Windows. Or inspect it with your favorite hex editor - it should not contain backslashes.
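If no hex editor is at hand, the same check can be sketched in Java. The class FileInspector and its methods are hypothetical names chosen for illustration; the default path C:/a.txt is taken from the question. Every byte of a multi-byte UTF-8 sequence is 0x80 or higher, so a byte 0x5C in the file always means a real backslash.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

class FileInspector {
    // Returns true if the raw bytes contain an ASCII backslash (0x5C).
    static boolean containsBackslash(byte[] bytes) {
        for (byte b : bytes) {
            if (b == 0x5C) {
                return true;
            }
        }
        return false;
    }

    // Formats the bytes as space-separated, two-digit uppercase hex values.
    static String toHex(byte[] bytes) {
        StringBuilder hex = new StringBuilder();
        for (byte b : bytes) {
            if (hex.length() > 0) {
                hex.append(' ');
            }
            hex.append(String.format("%02X", b & 0xFF));
        }
        return hex.toString();
    }

    public static void main(String[] args) throws IOException {
        // Path from the question unless another one is given on the command line.
        Path path = Paths.get(args.length > 0 ? args[0] : "C:/a.txt");
        byte[] bytes = Files.readAllBytes(path);
        System.out.println(toHex(bytes));
        System.out.println("contains backslash: " + containsBackslash(bytes));
    }
}
```

For a file that really holds the five characters from the question as UTF-8, the dump would read C5 82 6F C5 BC 79 C5 82, with no 5C byte.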

I tried it with "€" as the UTF-8-encoded content of the file, and it was printed correctly. Note that not all Unicode characters can be printed, depending on your terminal encoding (really a hassle on Windows) and font.
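The "€" experiment can be reproduced with a sketch like the following, which writes the character to a temporary file as UTF-8, reads it back, and prints it through a PrintStream that encodes as UTF-8. The class Utf8RoundTrip and its methods are hypothetical; whether the terminal then shows the glyph still depends on its own encoding and font, which this code cannot control.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.PrintStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class Utf8RoundTrip {
    // Reads the file as UTF-8 and joins the lines without separators, like the loop in the question.
    static String readUtf8(Path path) throws IOException {
        StringBuilder content = new StringBuilder();
        try (BufferedReader reader = Files.newBufferedReader(path, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                content.append(line);
            }
        }
        return content.toString();
    }

    // Writes the text to a temporary file as UTF-8, reads it back, and deletes the file.
    static String roundTrip(String text) {
        try {
            Path temp = Files.createTempFile("utf8-demo", ".txt");
            try {
                Files.write(temp, text.getBytes(StandardCharsets.UTF_8));
                return readUtf8(temp);
            } finally {
                Files.deleteIfExists(temp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        // The euro sign, written as an escape so the source file encoding does not matter.
        String text = roundTrip("\u20AC");
        // Encode console output as UTF-8 explicitly instead of using the platform default.
        PrintStream out = new PrintStream(System.out, true, "UTF-8");
        out.println(text);
    }
}
```

Wrapping System.out this way only fixes the bytes Java emits; on Windows the console code page may also need to be switched to UTF-8 for the output to display correctly.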
