Java Unicode to readable text conversion decoding
Problem description
I am developing a Java application that consumes a web service. The web service is created using an SAP server, which automatically encodes the data in Unicode. I get a Unicode string from the web service.
" 倥䙄ㄭ㌮쿣ී㈊〠漠橢圯湩湁楳湅潣楤杮湥潤橢″‰扯൪㰊഼┊敄瑶灹佐呓′†䘠湯⁴佃剕䕉⁒渠牯慭慌杮䔠ൎ⼊祔数⼠潆瑮匯扵祴数⼠祔数റ⼊慂敳潆瑮⼠潃牵敩൲⼊慎敭⼠う䔯据摯湩′‰㸊ാ攊摮扯൪㐊〠漠橢㰼䰯湥瑧‵‰㸊ാ猊牴慥൭ 䘯〰‱⸱2 "
Above is the response.
I want to convert it to a readable text format, such as a String. I am using core Java.
Recommended answer
If you have a byte[] or an InputStream (both binary data), you can get a String or a Reader (both text) with:
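// needs java.io.BufferedReader, java.io.InputStream, java.io.InputStreamReader
final String encoding = "UTF-8"; // or "UTF-16LE" or "UTF-16BE"
byte[] b = ...;
String s = new String(b, encoding);
InputStream is = ...;
BufferedReader reader = new BufferedReader(new InputStreamReader(is, encoding));
for (;;) {
    String line = reader.readLine();
    if (line == null) {
        break; // end of stream reached
    }
    // process line here
}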
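As a minimal, self-contained sketch of the fragment above: the class name DecodeSketch and the helper readLines are made up for illustration, and an in-memory byte array stands in for whatever the web service actually delivers.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class DecodeSketch {
    // Hypothetical helper: decodes a binary stream with the given charset
    // and returns its lines. IOExceptions are rethrown unchecked for brevity.
    static List<String> readLines(InputStream in, Charset charset) {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(in, charset))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        // Assumed sample data; real data would come from the service.
        byte[] b = "first line\nsecond line".getBytes(StandardCharsets.UTF_8);

        // byte[] -> String
        String s = new String(b, StandardCharsets.UTF_8);
        System.out.println(s);

        // InputStream -> Reader -> lines
        System.out.println(readLines(new ByteArrayInputStream(b), StandardCharsets.UTF_8));
    }
}
```

Passing a Charset constant from StandardCharsets instead of a name string avoids the checked UnsupportedEncodingException.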
The reverse process uses:
byte[] b = s.getBytes(encoding);
OutputStream os = ...;
BufferedWriter writer = new BufferedWriter(new OutputStreamWriter(os, encoding));
writer.write(s);
writer.newLine(); // BufferedWriter has no println method
writer.flush();
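The reverse direction can be sketched the same way. EncodeSketch and writeText are illustrative names only, and a ByteArrayOutputStream stands in for a real output stream so the result can be compared with String.getBytes.

```java
import java.io.BufferedWriter;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class EncodeSketch {
    // Hypothetical helper: encodes text through a Writer into an in-memory stream.
    static byte[] writeText(String text, Charset charset) {
        ByteArrayOutputStream os = new ByteArrayOutputStream();
        try (BufferedWriter writer = new BufferedWriter(new OutputStreamWriter(os, charset))) {
            writer.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return os.toByteArray();
    }

    public static void main(String[] args) {
        String s = "Gr\u00fc\u00dfe"; // the German word "Gruesse" with u-umlaut and sharp s

        byte[] viaWriter = writeText(s, StandardCharsets.UTF_8);
        byte[] direct = s.getBytes(StandardCharsets.UTF_8);

        // Both routes should produce the same bytes for UTF-8.
        System.out.println(Arrays.equals(viaWriter, direct));
        // Decoding with the same charset gives back the original text.
        System.out.println(new String(viaWriter, StandardCharsets.UTF_8).equals(s));
    }
}
```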
Unicode is a numbering system for all characters. The UTF variants implement Unicode as bytes.
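To make that distinction concrete, here is a small sketch that encodes one character, the euro sign (code point U+20AC), with three UTF variants. The class name UtfVariantsSketch and the helper hex are made up for this illustration.

```java
import java.nio.charset.StandardCharsets;

class UtfVariantsSketch {
    // Formats bytes as space-separated upper-case hex pairs.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String euro = "\u20AC";

        // One Unicode number ...
        System.out.println(Integer.toHexString(euro.codePointAt(0))); // prints 20ac

        // ... three different byte sequences.
        System.out.println(hex(euro.getBytes(StandardCharsets.UTF_8)));    // prints E2 82 AC
        System.out.println(hex(euro.getBytes(StandardCharsets.UTF_16BE))); // prints 20 AC
        System.out.println(hex(euro.getBytes(StandardCharsets.UTF_16LE))); // prints AC 20
    }
}
```

Decoding bytes with the wrong variant gives different characters rather than an error, which is why the encoding has to be known.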
Your problem:
Normally (with a web service) you would already have received a String. You could, for instance, write that string to a file using the Writer above, either to check it yourself with a full Unicode font or to pass the file on for a check.
You need (?) to check which UTF variant the text is in. For Asian scripts UTF-16 (little endian or big endian) is usually the most compact. In XML the encoding would already be declared.
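One way to check the variant, assuming the sender wrote a byte order mark (BOM) at the start of the data, is to look at the first bytes. This is only a sketch: BomSniffSketch and charsetFromBom are hypothetical names, many streams carry no BOM at all, and UTF-32 is ignored here.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class BomSniffSketch {
    // Returns the charset indicated by a leading BOM, or null if none is found.
    // Limitation: a UTF-32LE BOM (FF FE 00 00) would be reported as UTF-16LE.
    static Charset charsetFromBom(byte[] data) {
        if (data.length >= 3
                && (data[0] & 0xFF) == 0xEF
                && (data[1] & 0xFF) == 0xBB
                && (data[2] & 0xFF) == 0xBF) {
            return StandardCharsets.UTF_8;
        }
        if (data.length >= 2 && (data[0] & 0xFF) == 0xFE && (data[1] & 0xFF) == 0xFF) {
            return StandardCharsets.UTF_16BE;
        }
        if (data.length >= 2 && (data[0] & 0xFF) == 0xFF && (data[1] & 0xFF) == 0xFE) {
            return StandardCharsets.UTF_16LE;
        }
        return null; // no BOM: the encoding must be known from elsewhere
    }

    public static void main(String[] args) {
        byte[] littleEndian = {(byte) 0xFF, (byte) 0xFE, 0x41, 0x00}; // BOM + "A"
        System.out.println(charsetFromBom(littleEndian)); // prints UTF-16LE

        byte[] plain = "abc".getBytes(StandardCharsets.US_ASCII);
        System.out.println(charsetFromBom(plain)); // prints null
    }
}
```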
Addition:
FileWriter writes to a file using the default encoding (taken from the operating system on your machine). Instead use:
new OutputStreamWriter(new FileOutputStream(new File("...")), "UTF-8")
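A runnable sketch of that expression in context follows. The class name FileWriteSketch, the helper writeAndReadBack and the temporary file are assumptions made for the example; in real code you would use your own file path.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

class FileWriteSketch {
    // Hypothetical helper: writes text to a temporary file as UTF-8
    // and returns the raw bytes that ended up on disk.
    static byte[] writeAndReadBack(String text) {
        try {
            File file = File.createTempFile("unicode-sketch", ".txt");
            file.deleteOnExit();
            try (Writer writer = new OutputStreamWriter(
                    new FileOutputStream(file), StandardCharsets.UTF_8)) {
                writer.write(text);
            }
            return Files.readAllBytes(file.toPath());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Four characters; the accented e takes two bytes in UTF-8.
        byte[] bytes = writeAndReadBack("caf\u00e9");
        System.out.println(bytes.length); // prints 5
    }
}
```

Because the charset is stated explicitly, the file contents do not depend on the machine the program runs on.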
If it is a binary PDF, as @bobince said, just use a FileOutputStream on the byte[] or InputStream.
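The PDF suspicion can be checked with a few lines. The sketch below assumes that the first four visible characters of the response are the code points U+5025, U+4644, U+312D and U+332E, and that the bytes were misread as UTF-16LE; both are assumptions about the posted data, and PdfHeaderSketch and reinterpret are illustrative names.

```java
import java.nio.charset.StandardCharsets;

class PdfHeaderSketch {
    // Turns the characters back into UTF-16LE bytes and reads those bytes as ASCII.
    static String reinterpret(String misdecoded) {
        byte[] raw = misdecoded.getBytes(StandardCharsets.UTF_16LE);
        return new String(raw, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        // Assumed to be the first four characters of the response shown in the question.
        String start = "\u5025\u4644\u312D\u332E";
        System.out.println(reinterpret(start)); // prints %PDF-1.3
    }
}
```

If the real response behaves like this, it is not text at all. Converting it back from a String may lose bytes, so the safer route is to obtain the raw byte[] or InputStream from the web service and write it unchanged.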