How to read a file in Java with a specific character encoding?

Question

I am trying to read a file in as either UTF-8 or Windows-1252 depending on the output of this method:

public Charset getCorrectCharsetToApply() {
    // Returns a Charset for either UTF-8 or Windows-1252.
}

So far, I have:

String fileName = getFileNameToReadFromUserInput();
InputStream is = new ByteArrayInputStream(fileName.getBytes());
InputStreamReader isr = new InputStreamReader(is, getCorrectCharsetToApply());
BufferedReader buffReader = new BufferedReader(isr);

The problem I'm having is converting the BufferedReader instance to a FileReader.

Additionally:

  • The name of the file itself (fileName) cannot be trusted to be in a particular Charset; sometimes the file name will contain UTF-8 characters, and sometimes Windows-1252. The same goes for the file's content (however, the file name and the file content will always have matching charsets).
  • Only the logic inside getCorrectCharsetToApply() can select the charset to apply, so attempting to read a file by its name before calling this method could very well result in Java trying to read the file name with the wrong encoding, which causes it to die!

Thanks in advance!

Recommended answer

So, first, as a heads up, do realize that fileName.getBytes() as you have there gets the bytes of the filename, not the file itself.
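
To make that first point concrete, here is a minimal sketch of what the question's stream setup actually reads. The class name FileNameBytesDemo, the helper readFirstLine, and the sample name "notes.txt" are made up for illustration, and UTF-8 is assumed only to keep the demo deterministic.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

public class FileNameBytesDemo {

    // Hypothetical helper: wraps the bytes of the name itself in a stream,
    // the same way the question's code does, and reads one line back.
    static String readFirstLine(String fileName) throws IOException {
        byte[] nameBytes = fileName.getBytes(StandardCharsets.UTF_8);
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(new ByteArrayInputStream(nameBytes),
                        StandardCharsets.UTF_8))) {
            return reader.readLine();
        }
    }

    public static void main(String[] args) throws IOException {
        // No file is opened here: the "content" read back is just the name.
        System.out.println(readFirstLine("notes.txt"));
    }
}
```

The reader never touches the file system, so whatever charset is passed in only affects how the characters of the name are decoded.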

Second, read the documentation for FileReader:

The constructors of this class assume that the default character encoding and the default byte-buffer size are appropriate. To specify these values yourself, construct an InputStreamReader on a FileInputStream.

So, sounds like FileReader actually isn't the way to go. If we take the advice in the docs, then you should just change your code to have:

String fileName = getFileNameToReadFromUserInput();
FileInputStream is = new FileInputStream(fileName);
InputStreamReader isr = new InputStreamReader(is, getCorrectCharsetToApply());
BufferedReader buffReader = new BufferedReader(isr);

and not try to make a FileReader at all.
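
For completeness, a self-contained sketch of the same idea follows. The class name CharsetFileReading and the helper readLines are hypothetical, and the charset is passed in as a parameter in place of the question's getCorrectCharsetToApply(), whose logic is not shown. It also uses try-with-resources so the underlying stream is closed, which the snippet above leaves out.

```java
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.List;

public class CharsetFileReading {

    // Hypothetical helper: reads every line of the named file, decoding the
    // bytes with the supplied charset instead of the platform default.
    static List<String> readLines(String fileName, Charset charset) throws IOException {
        List<String> lines = new ArrayList<>();
        // Closing the BufferedReader also closes the wrapped FileInputStream.
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(new FileInputStream(fileName), charset))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 1) {
            System.err.println("usage: CharsetFileReading <file>");
            return;
        }
        // Assumption: the real code would obtain this from getCorrectCharsetToApply().
        Charset charset = Charset.forName("windows-1252");
        for (String line : readLines(args[0], charset)) {
            System.out.println(line);
        }
    }
}
```

Passing the Charset explicitly keeps the decoding decision in one place, so the same helper serves both the UTF-8 and the Windows-1252 case.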
