Convert Shift_JIS format to UTF-8 format

Question

I am trying to convert a Shift_JIS encoded file to UTF-8. My approach is as follows:


  1. Read the Shift_JIS file

  2. Call getBytes on each line and convert it to UTF-8

  3. Create a new file and write the UTF-8 converted values into it

The problem is that the conversion does not happen at step 2. I am using the code below to convert Shift_JIS to UTF-8:

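InputStream inputStream = getContentResolver().openInputStream(uri);
BufferedReader reader = new BufferedReader(new InputStreamReader(inputStream));
String line;
while ((line = reader.readLine()) != null) {
    // Step 2 as attempted: encode the line as Shift_JIS, then decode those bytes as UTF-8
    byte[] b = line.getBytes("Shift_JIS");
    String value = new String(b, "UTF-8");
}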

Please let me know if any other information is required.

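I have the following two questions:

1. Is there a better approach (set of steps) for this conversion?

2. Why does the code snippet above fail to convert?

Thanks in advance!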

Recommended answer

The answer posted by @VicJordan is not correct. Calling getBytes() with no argument returns the string's bytes encoded in the system's native (default) character encoding, which may or may not be UTF-8. That answer then treats those bytes as if they were UTF-8, which they might not be.
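To make the failure mode concrete, here is a minimal sketch of why re-labelling bytes is not the same as converting them. It assumes the runtime provides the Shift_JIS charset; the class and method names (EncodingMistakeDemo, reinterpret, transcode) are hypothetical and exist only for illustration. Note also that the question's `new InputStreamReader(inputStream)` names no charset, so the file is decoded with the platform default before step 2 even runs.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodingMistakeDemo {
    // Assumption: the Shift_JIS charset is available on this runtime.
    static final Charset SHIFT_JIS = Charset.forName("Shift_JIS");

    // Mirrors step 2 of the question: encode the text as Shift_JIS,
    // then decode those same bytes as if they were UTF-8.
    static String reinterpret(String text) {
        byte[] sjisBytes = text.getBytes(SHIFT_JIS);
        return new String(sjisBytes, StandardCharsets.UTF_8);
    }

    // Actual transcoding: decode the Shift_JIS bytes into a String,
    // then encode that String as UTF-8.
    static byte[] transcode(byte[] sjisBytes) {
        String decoded = new String(sjisBytes, SHIFT_JIS);
        return decoded.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "\u65E5\u672C\u8A9E" is the word for "Japanese language", written
        // with Unicode escapes so the source file encoding does not matter.
        String original = "\u65E5\u672C\u8A9E";

        // Prints false: the Shift_JIS bytes are not valid UTF-8 for this text.
        System.out.println(original.equals(reinterpret(original)));

        // Prints true: decoding first, then encoding, preserves the text.
        byte[] utf8 = transcode(original.getBytes(SHIFT_JIS));
        System.out.println(original.equals(new String(utf8, StandardCharsets.UTF_8)));
    }
}
```

Pure ASCII text survives the reinterpretation unchanged because both encodings represent it with the same bytes, which is why this kind of bug can go unnoticed until Japanese characters appear.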

A more reliable approach is to read the Shift_JIS file into a Java String, then write that String out using UTF-8 encoding.

InputStream in = ...
// Decode the incoming bytes as Shift_JIS while reading
Reader reader = new InputStreamReader(in, "Shift_JIS");
StringBuilder sb = new StringBuilder();
int read;
while ((read = reader.read()) != -1){
  sb.append((char)read);
}
reader.close();

// The String now holds the decoded text, independent of the file's encoding
String string = sb.toString();

OutputStream out = ...
// Encode the text as UTF-8 while writing
Writer writer = new OutputStreamWriter(out, "UTF-8");
writer.write(string);
writer.close();
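For larger files, the same read-as-Shift_JIS, write-as-UTF-8 idea can be sketched in a streaming form that never holds the whole file in memory. This is only one possible variant, not the answer's own code: the class name ShiftJisToUtf8, the method name convert, and the 8192-character buffer size are assumptions chosen for illustration.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class ShiftJisToUtf8 {
    // Assumption: the Shift_JIS charset is available on this runtime.
    static final Charset SHIFT_JIS = Charset.forName("Shift_JIS");

    // Copies text from in to out, decoding Shift_JIS and encoding UTF-8.
    // Both streams are closed when the method returns or throws.
    static void convert(InputStream in, OutputStream out) throws IOException {
        try (Reader reader = new BufferedReader(new InputStreamReader(in, SHIFT_JIS));
             Writer writer = new BufferedWriter(
                     new OutputStreamWriter(out, StandardCharsets.UTF_8))) {
            char[] buffer = new char[8192]; // hypothetical buffer size
            int count;
            while ((count = reader.read(buffer)) != -1) {
                // Write only the characters actually read in this pass.
                writer.write(buffer, 0, count);
            }
        }
    }
}
```

In the question's setting, the input could come from getContentResolver().openInputStream(uri), with the output stream pointing at the new file. Copying characters in blocks rather than line by line also keeps the original line terminators intact.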
