First character read from a text file: ï»¿

Question

If I run this code, the output starts with the characters ï»¿ on the first line, followed by the other lines as expected:

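try {
    // FileReader decodes with the platform default charset, not necessarily UTF-8
    BufferedReader br = new BufferedReader(new FileReader(
            "myFile.txt"));

    String line;
    // The assignment must be parenthesized: != binds tighter than =
    while ((line = br.readLine()) != null) {
        System.out.println(line);
    }
    br.close();

} catch (FileNotFoundException e) {
    e.printStackTrace();
} catch (IOException e) {
    e.printStackTrace();
}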

How can I avoid this?

Recommended answer

You are getting the characters ï»¿ on the first line because this sequence is the UTF-8 byte order mark (BOM), the bytes EF BB BF, decoded one byte at a time by a single-byte encoding. If a text file begins with a BOM, it was likely generated by a Windows program such as Notepad.
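To see why the same three bytes show up as three characters in one case and as one in the other, here is a minimal sketch. The class name `BomDecodingDemo` and its `decode` helper are made up for illustration, and ISO-8859-1 stands in for whatever single-byte default encoding the original program happened to use.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class BomDecodingDemo {

    // The UTF-8 encoding of U+FEFF, i.e. the byte order mark.
    static final byte[] BOM = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF};

    // Hypothetical helper: decode raw bytes with the given charset.
    public static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // Single-byte encoding: each byte becomes its own character.
        String asLatin1 = decode(BOM, StandardCharsets.ISO_8859_1);
        System.out.println(asLatin1.length());   // prints 3

        // UTF-8: the three bytes form one character, U+FEFF.
        String asUtf8 = decode(BOM, StandardCharsets.UTF_8);
        System.out.println(asUtf8.length());     // prints 1
        System.out.println(asUtf8.charAt(0) == '\uFEFF');   // prints true
    }
}
```

Decoded as ISO-8859-1, the three characters are U+00EF, U+00BB and U+00BF, which is exactly the ï»¿ seen in the output.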

To solve your problem, read the file explicitly as UTF-8 instead of relying on whatever the default system character encoding happens to be:

BufferedReader in = new BufferedReader(
    new InputStreamReader(
        new FileInputStream("myFile.txt"),
        "UTF-8"));

In UTF-8, the byte sequence EF BB BF (shown as ï»¿ above) decodes to a single character, U+FEFF. This character is optional: a legal UTF-8 file may or may not begin with it. So skip the first character only if it is U+FEFF:

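in.mark(1);                 // remember the start, allowing 1 character of read-ahead
if (in.read() != 0xFEFF)    // the first character is not a BOM
  in.reset();               // so rewind and keep it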

Now you can continue with the rest of your code.
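Putting the pieces together, a self-contained sketch might look like the following. The class name `BomSkippingReader` and its `open` and `readLines` helpers are assumptions for illustration, not part of the original answer. It reads from an in-memory byte array so it runs without a file; for a real file, a `FileInputStream` for "myFile.txt" could be passed to `open` instead.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class BomSkippingReader {

    // Wraps a byte stream in a UTF-8 reader and consumes a leading U+FEFF, if present.
    public static BufferedReader open(InputStream in) throws IOException {
        BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8));
        reader.mark(1);                  // remember the start of the stream
        if (reader.read() != 0xFEFF) {   // not a BOM (or the stream is empty)
            reader.reset();              // rewind so the character is not lost
        }
        return reader;
    }

    // Reads all lines from UTF-8 encoded bytes, dropping a leading BOM.
    public static List<String> readLines(byte[] utf8Bytes) {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = open(new ByteArrayInputStream(utf8Bytes))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        // "hi\nx" preceded by the UTF-8 BOM bytes EF BB BF.
        byte[] withBom = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 'h', 'i', '\n', 'x'};
        System.out.println(readLines(withBom));   // prints [hi, x]
    }
}
```

Using `StandardCharsets.UTF_8` rather than the string "UTF-8" avoids the checked `UnsupportedEncodingException`, and try-with-resources closes the reader even if reading fails.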
