Reading UTF-8 - BOM marker


Problem description



I'm reading a file through a FileReader. The file is UTF-8 encoded (with a BOM). My problem is this: I read the file and output a string, but the BOM marker is output too. Why does this occur?

    fr = new FileReader(file);
    br = new BufferedReader(fr);
    String tmp = null;
    while ((tmp = br.readLine()) != null) {
        String text;
        text = new String(tmp.getBytes(), "UTF-8");
        content += text + System.getProperty("line.separator");
    }

Output after the first line:

?<style>

Solution

In Java, you have to consume the UTF-8 BOM manually if it is present. This behaviour is documented in the Java bug database. It will not be fixed for now, because a fix would break existing tools such as JavaDoc or XML parsers. Apache Commons IO provides a BOMInputStream to handle this situation.
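
To see where the stray character comes from, here is a small sketch. The class name BomDemo and the helper decodeUtf8 are made up for illustration. It decodes the three BOM bytes (EF BB BF) followed by "<style>" as UTF-8 and shows that Java keeps the BOM as an ordinary leading character, U+FEFF, which many consoles then print as "?".

```java
import java.nio.charset.StandardCharsets;

class BomDemo {
    // Decodes raw bytes as UTF-8; Java's UTF-8 decoder does not strip a leading BOM.
    static String decodeUtf8(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // EF BB BF is the UTF-8 encoding of U+FEFF, followed here by "<style>".
        byte[] raw = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, '<', 's', 't', 'y', 'l', 'e', '>'};
        String line = decodeUtf8(raw);
        System.out.println(line.length());                  // 8: the BOM plus the 7 visible characters
        System.out.println(line.charAt(0) == '\uFEFF');     // true
    }
}
```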

Take a look at this solution: Handle UTF8 file with BOM
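
For reference, consuming the BOM by hand might look roughly like the sketch below. It is not the code from the linked solution. The class and method names (BomSkippingReader, readSkippingBom) are hypothetical, and it assumes the file is decoded through an InputStreamReader with an explicit UTF-8 charset instead of relying on the default charset of FileReader.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

class BomSkippingReader {
    /** Reads all lines as UTF-8, dropping a leading U+FEFF if one is present. */
    static String readSkippingBom(InputStream in) throws IOException {
        StringBuilder content = new StringBuilder();
        String separator = System.lineSeparator();
        try (BufferedReader br = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            // Peek at the first character; rewind unless it is the BOM (or end of stream).
            br.mark(1);
            int first = br.read();
            if (first != 0xFEFF && first != -1) {
                br.reset();
            }
            String line;
            while ((line = br.readLine()) != null) {
                content.append(line).append(separator);
            }
        }
        return content.toString();
    }

    public static void main(String[] args) throws IOException {
        // In-memory stand-in for a UTF-8 file that starts with a BOM.
        byte[] withBom = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, '<', 's', 't', 'y', 'l', 'e', '>', '\n'};
        System.out.print(readSkippingBom(new ByteArrayInputStream(withBom)));
    }
}
```

For a real file, the same method could be called with a FileInputStream opened on the file. Decoding once with an explicit charset also makes the new String(tmp.getBytes(), "UTF-8") round trip from the question unnecessary.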
