Reading UTF-8 - BOM marker
I'm reading a file through a FileReader. The file is UTF-8 encoded (with a BOM). My problem is that when I read the file and output it as a string, the BOM marker is output too. Why does this occur?
fr = new FileReader(file);
br = new BufferedReader(fr);
String tmp = null;
while ((tmp = br.readLine()) != null) {
String text;
text = new String(tmp.getBytes(), "UTF-8");
content += text + System.getProperty("line.separator");
}
Output after the first line:
?<style>
In Java, you have to consume the UTF-8 BOM manually if it is present. This behaviour is documented in the Java bug database. It will not be fixed for now, because a fix would break existing tools such as JavaDoc or XML parsers. Apache Commons IO provides a BOMInputStream to handle this situation.
Take a look at this solution: Handle UTF8 file with BOM
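For illustration, here is a minimal sketch of consuming the BOM by hand with only the standard library. It assumes the file really is UTF-8; the class name `BomSkippingReader` and the method `readWithoutBom` are made up for this example and are not part of any library. The reader decodes the stream as UTF-8 explicitly (instead of relying on the platform default that FileReader uses), peeks at the first character, and drops it only if it is U+FEFF.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper class, named for this example only.
class BomSkippingReader {
    // A UTF-8 BOM (bytes EF BB BF) decodes to this single character.
    private static final char BOM = '\uFEFF';

    static String readWithoutBom(Path file) throws IOException {
        StringBuilder content = new StringBuilder();
        // Decode explicitly as UTF-8 rather than using the platform default charset.
        try (BufferedReader br = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            // Peek at the first character; rewind unless it is the BOM.
            br.mark(1);
            int first = br.read();
            if (first != -1 && first != BOM) {
                br.reset();
            }
            String line;
            while ((line = br.readLine()) != null) {
                content.append(line).append(System.lineSeparator());
            }
        }
        return content.toString();
    }
}
```

Because the stream is already decoded as UTF-8, the extra `new String(tmp.getBytes(), "UTF-8")` round trip from the question is not needed in this sketch.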