Java Unicode to readable text conversion decoding

Question

I am developing a Java application where I am consuming a web service. The web service is created using a SAP server, which encodes the data automatically in Unicode. I get a Unicode string from the web service.

" 倥䙄ㄭ㌮਍쿣ී㈊〠漠橢਍圯湩湁楳湅潣楤杮਍湥潤橢਍″‰扯൪㰊഼┊敄瑶灹⁥佐呓′†䘠湯⁴佃剕䕉⁒渠牯慭慌杮䔠ൎ⼊祔数⼠潆瑮਍匯扵祴数⼠祔数റ⼊慂敳潆瑮⼠潃牵敩൲⼊慎敭⼠う㄰਍䔯据摯湩⁧′‰൒㸊ാ攊摮扯൪㐊〠漠橢਍㰼਍䰯湥瑧⁨‵‰൒㸊ാ猊牴慥൭ 䘯〰‱⸱2 "

Above is the response.

I want to convert it to readable text format like String. I am using core Java.

Recommended answer

If you have byte[] or an InputStream (both binary data) you can get a String or a Reader (both text) with:

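import java.io.BufferedReader;
import java.io.InputStream;
import java.io.InputStreamReader;

final String encoding = "UTF-8"; // or "UTF-16LE" / "UTF-16BE"

// Binary data in a byte array -> text
byte[] b = ...;
String s = new String(b, encoding);

// Binary data in a stream -> text, read line by line
InputStream is = ...;
BufferedReader reader = new BufferedReader(new InputStreamReader(is, encoding));
for (;;) {
    String line = reader.readLine();
    if (line == null) {
        break; // end of stream reached
    }
    // process line here
}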

The reverse process uses:

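// Text -> binary data in a byte array
byte[] b = s.getBytes(encoding);

// Text -> binary data in a stream
OutputStream os = ...;
BufferedWriter writer = new BufferedWriter(new OutputStreamWriter(os, encoding));
writer.write(s);
writer.newLine();
writer.flush();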
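
As a minimal, self-contained sketch of both directions, the following round trip encodes a string to UTF-8 bytes and decodes it again. The class name `RoundTripDemo`, the helper names, and the sample text are illustrative assumptions, not part of the original answer; `StandardCharsets` is used instead of a charset name string to avoid the checked `UnsupportedEncodingException`.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; names and sample text are illustrative only.
class RoundTripDemo {

    // Decode bytes that are known to hold UTF-8 text.
    static String decode(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Encode text into UTF-8 bytes.
    static byte[] encode(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // Read all lines from UTF-8 bytes; readLine() returns null at end of stream.
    static List<String> readLines(byte[] bytes) {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                new ByteArrayInputStream(bytes), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        // "Grüße" and "日本語" written with Unicode escapes so the source stays ASCII.
        String original = "Gr\u00FC\u00DFe\n\u65E5\u672C\u8A9E";
        byte[] bytes = encode(original);
        System.out.println(decode(bytes).equals(original)); // prints true
        System.out.println(readLines(bytes).size());        // prints 2
    }
}
```

The round trip only succeeds because the same charset is used for encoding and decoding; that is the invariant to preserve in real code.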

Unicode is a numbering system for all characters: each character is assigned a code point. The UTF variants (UTF-8, UTF-16, UTF-32) encode those code points as bytes.
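
To make the distinction concrete, here is a small sketch that prints the code point of a character and the number of bytes each UTF variant needs for it. The class name `UtfSizes` and the two sample characters are assumptions chosen for illustration.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; names and sample characters are illustrative only.
class UtfSizes {

    // Number of bytes the text occupies in UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Number of bytes the text occupies in UTF-16 (little endian, no byte order mark).
    static int utf16Length(String s) {
        return s.getBytes(StandardCharsets.UTF_16LE).length;
    }

    public static void main(String[] args) {
        String latin = "A";      // code point U+0041
        String han = "\u65E5";   // code point U+65E5
        for (String s : new String[] {latin, han}) {
            System.out.printf("U+%04X: UTF-8 = %d bytes, UTF-16 = %d bytes%n",
                    s.codePointAt(0), utf8Length(s), utf16Length(s));
        }
        // prints:
        // U+0041: UTF-8 = 1 bytes, UTF-16 = 2 bytes
        // U+65E5: UTF-8 = 3 bytes, UTF-16 = 2 bytes
    }
}
```

The same code point yields different byte sequences depending on the variant, which is why the encoding must be known before decoding.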

Your problem:

Normally, with a web service, you would already have received a String. You could, for instance, write that string to a file using the Writer above, either to check it yourself with a full Unicode font or to pass the file on for a check.

You may need to check which UTF variant the text is in. For Asian scripts, UTF-16 (little endian or big endian) is usually more compact than UTF-8. In XML the encoding would already be declared.
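
The following sketch shows why the variant matters: the same bytes produce entirely different text depending on the charset used to decode them. As it happens, the first four characters of the response quoted in the question (倥䙄ㄭ㌮) are what the ASCII bytes of `%PDF-1.3` look like when decoded as UTF-16LE. The class name `CharsetMismatchDemo` and the helper `decodeAs` are illustrative assumptions.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; names are illustrative only.
class CharsetMismatchDemo {

    // Decode the given bytes with an explicitly chosen charset.
    static String decodeAs(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // The eight ASCII bytes that start a PDF 1.3 file.
        byte[] header = "%PDF-1.3".getBytes(StandardCharsets.US_ASCII);

        // Decoded with the matching charset, the text is readable.
        System.out.println(decodeAs(header, StandardCharsets.US_ASCII)); // prints %PDF-1.3

        // Decoded as UTF-16LE, each pair of bytes becomes one unrelated character:
        // 0x25 0x50 -> U+5025, 0x44 0x46 -> U+4644, 0x2D 0x31 -> U+312D, 0x2E 0x33 -> U+332E
        String wrong = decodeAs(header, StandardCharsets.UTF_16LE);
        System.out.println(wrong.equals("\u5025\u4644\u312D\u332E")); // prints true
    }
}
```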

Addition:

FileWriter writes to a file using the platform default encoding (taken from the operating system on your machine). Instead, use:

new OutputStreamWriter(new FileOutputStream(new File("...")), "UTF-8")
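
A runnable sketch of that approach might look as follows. It writes a string to a temporary file with an explicit UTF-8 encoding and reads the raw bytes back to confirm what was stored. The class name `ExplicitEncodingWrite`, the helper names, and the temp file prefix are assumptions for illustration.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class; names are illustrative only.
class ExplicitEncodingWrite {

    // Write text to a file as UTF-8, independent of the platform default encoding.
    static void writeUtf8(Path file, String text) {
        try (Writer writer = new OutputStreamWriter(
                new FileOutputStream(file.toFile()), StandardCharsets.UTF_8)) {
            writer.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Write text to a temporary file and return the bytes that ended up on disk.
    static byte[] writeAndReadBack(String text) {
        try {
            Path file = Files.createTempFile("encoding-demo", ".txt");
            try {
                writeUtf8(file, text);
                return Files.readAllBytes(file);
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // U+00E9 (e with acute accent) is stored as the two bytes 0xC3 0xA9 in UTF-8.
        byte[] bytes = writeAndReadBack("\u00E9");
        System.out.println(bytes.length); // prints 2
    }
}
```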

If it is a binary PDF, as @bobince said, just use a FileOutputStream on the byte[] or InputStream, without any Reader or Writer in between.
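
For the binary case, a minimal sketch of a byte-for-byte copy could look like this. No charset is involved at any point, so the data cannot be corrupted by decoding. The class name `BinaryCopyDemo`, the `copy` helper, the sample bytes, and the temp file name are assumptions; in a real application the input stream would come from the web service response.

```java
import java.io.ByteArrayInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class; names and sample data are illustrative only.
class BinaryCopyDemo {

    // Copy all bytes from in to out unchanged and return the number of bytes copied.
    static long copy(InputStream in, OutputStream out) {
        byte[] buffer = new byte[8192];
        long total = 0;
        try {
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
                total += n;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the binary payload received from the service.
        byte[] pdfBytes = "%PDF-1.3\n".getBytes(StandardCharsets.US_ASCII);
        Path target = Files.createTempFile("response", ".pdf");
        try (InputStream in = new ByteArrayInputStream(pdfBytes);
             OutputStream out = new FileOutputStream(target.toFile())) {
            System.out.println(copy(in, out) + " bytes written to " + target);
        }
    }
}
```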
