Reading strange unicode character in Java?

Published: 2017/11/4 21:32:48 | Tags: java, unicode, file-io


Problem description

I have the following text file:

The file was saved with utf-8 encoding.

I used the following code to read the content of the file:

    FileReader fr = new FileReader("f.txt");
    BufferedReader br = new BufferedReader(fr);
    String s1 = br.readLine();
    String s2 = br.readLine();
    System.out.println("s1 = " + s1.length());
    System.out.println("s2 = " + s2.length());

The output:

    s1 = 5
    s2 = 4

Then I used s1.charAt(0) to get the first character of s1, and it was a '' (blank) character. That's why s1 has a length of 5. Even after calling s1.trim(), its length is still 5. I don't know why that happens. It works correctly if the file is saved with ASCII encoding.

Solution

Notepad apparently saved the file with a byte order mark (BOM), a nonprintable character at the beginning that just marks it as UTF-8 but is not required (and indeed not recommended). You can ignore or remove it; other text editors often give you the choice of saving UTF-8 with or without a BOM.
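
If you would rather handle this in code than re-save the file, one option is to drop the leading U+FEFF after reading. The following is only a minimal sketch, not the answerer's code: the class name BomDemo, the helpers stripBom and readFirstLine, and the sample content "abcd"/"efgh" are all made up for illustration (the question does not show the real file). It also passes UTF-8 explicitly instead of relying on the default charset that FileReader uses. Note that trim() does not help here because it only removes characters up to U+0020, and the BOM is U+FEFF.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class; names are illustrative, not from the original post.
final class BomDemo {
    // U+FEFF is the byte order mark; in UTF-8 it is encoded as the bytes EF BB BF.
    private static final char BOM = '\uFEFF';

    /** Returns the text without a leading BOM, if one is present. */
    static String stripBom(String s) {
        if (s != null && !s.isEmpty() && s.charAt(0) == BOM) {
            return s.substring(1);
        }
        return s;
    }

    /** Reads the first line of a UTF-8 file and drops a leading BOM. */
    static String readFirstLine(Path path) throws IOException {
        try (BufferedReader br = Files.newBufferedReader(path, StandardCharsets.UTF_8)) {
            return stripBom(br.readLine());
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumed sample content: a BOM followed by two 4-character lines.
        Path tmp = Files.createTempFile("bom-demo", ".txt");
        try {
            Files.write(tmp, "\uFEFFabcd\nefgh\n".getBytes(StandardCharsets.UTF_8));
            String s1 = readFirstLine(tmp);
            System.out.println("s1 = " + s1.length()); // prints "s1 = 4"
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

Only the very first character of the file needs this check, so later lines can be read as usual.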
