Convert Shift_JIS format to UTF-8 format
Problem description
I am trying to convert a Shift_JIS-encoded file to UTF-8. This is my approach:
- Read the Shift_JIS file
- Call getBytes on each line and convert it to UTF-8
- Create a new file and write the UTF-8 converted value to it
The issue is that the conversion does not happen at step 2. I am using the code below to convert Shift_JIS to UTF-8:
InputStream inputStream = getContentResolver().openInputStream(uri);
BufferedReader reader = new BufferedReader(new InputStreamReader(inputStream));
byte[] b = line.getBytes("Shift_JIS");
String value = new String(b, "UTF-8");
Please let me know if any other information is required.
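I have the following two questions:
1. Is there a better way (set of steps) to do this conversion?
2. Why does the code snippet above fail to convert?
Thanks in advance!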
Recommended answer
The answer @VicJordan posted is not correct. When you call getBytes(), you get the raw bytes of the string encoded in your system's native character encoding (which may or may not be UTF-8). You then treat those bytes as if they were encoded in UTF-8, which they might not be.
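To see why relabeling bytes does not convert anything, here is a minimal sketch. The class and method names (CharsetMismatchDemo, mislabel, toUtf8) are made up for illustration, and it assumes the runtime provides the Shift_JIS charset. It repeats the pattern from the question: encode a String as Shift_JIS, then decode those same bytes as if they were UTF-8.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class CharsetMismatchDemo {
    // Assumption: this runtime ships the Shift_JIS charset.
    static final Charset SHIFT_JIS = Charset.forName("Shift_JIS");

    /** Repeats the question's pattern: encode as Shift_JIS, then decode those bytes as UTF-8. */
    static String mislabel(String text) {
        byte[] sjisBytes = text.getBytes(SHIFT_JIS);
        return new String(sjisBytes, StandardCharsets.UTF_8);
    }

    /** Encodes an already-decoded String as UTF-8 bytes. */
    static byte[] toUtf8(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "\u65E5\u672C\u8A9E"; // the three characters of "Nihongo"
        // The mislabeled round trip does not give the original text back.
        System.out.println(original.equals(mislabel(original))); // prints false
        // Encoding the String itself is the actual conversion step.
        System.out.println(toUtf8(original).length); // prints 9
    }
}
```

A Java String has no byte encoding of its own once it is decoded. An encoding only matters at the two boundaries: when bytes are decoded into a String, and when a String is encoded back into bytes.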
A more reliable approach is to read the Shift_JIS file into a Java String, then write that String out using UTF-8 encoding.
// Decode the Shift_JIS bytes into a Java String
InputStream in = ...
Reader reader = new InputStreamReader(in, "Shift_JIS");
StringBuilder sb = new StringBuilder();
int read;
while ((read = reader.read()) != -1){
    sb.append((char)read);
}
reader.close();
String string = sb.toString();

// Encode the String as UTF-8 while writing it out
OutputStream out = ...
Writer writer = new OutputStreamWriter(out, "UTF-8");
writer.write(string);
writer.close();
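For larger files, the same idea can be written as a streaming copy that never holds the whole file in memory. This is only a sketch under assumptions: the class name ShiftJisToUtf8 and the method convert are hypothetical, and main uses in-memory byte arrays in place of real files. On Android, the stream from getContentResolver().openInputStream(uri) in the question could presumably be passed as the input.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class ShiftJisToUtf8 {
    // Assumption: this runtime ships the Shift_JIS charset.
    static final Charset SHIFT_JIS = Charset.forName("Shift_JIS");

    /** Decodes Shift_JIS from in and writes the text to out as UTF-8, one chunk at a time. */
    static void convert(InputStream in, OutputStream out) throws IOException {
        // try-with-resources flushes and closes the writer, then closes the reader.
        try (Reader reader = new InputStreamReader(in, SHIFT_JIS);
             Writer writer = new OutputStreamWriter(out, StandardCharsets.UTF_8)) {
            char[] buffer = new char[8192];
            int n;
            while ((n = reader.read(buffer)) != -1) {
                writer.write(buffer, 0, n);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        String text = "\u65E5\u672C\u8A9E";
        byte[] sjis = text.getBytes(SHIFT_JIS); // stands in for the Shift_JIS file
        ByteArrayOutputStream utf8 = new ByteArrayOutputStream();
        convert(new ByteArrayInputStream(sjis), utf8);
        // Decoding the output as UTF-8 should give back the original text.
        System.out.println(text.equals(new String(utf8.toByteArray(), StandardCharsets.UTF_8)));
    }
}
```

Passing Charset objects rather than charset-name strings avoids the checked UnsupportedEncodingException. The chunked read also avoids the per-character calls on an unbuffered reader.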