Why does new String(bytes, enc).getBytes(enc) not return the original byte array?
Problem description
I ran the following simulation:
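// Fill the array with every byte value, from -128 to 127.
byte[] b = new byte[256];
for (int i = 0; i < 256; i++) {
    b[i] = (byte) (i - 128);
}
// Decode to a String and encode back with the same charset.
byte[] transformed = new String(b, "cp1251").getBytes("cp1251");
// Report every index where the round trip changed the byte.
for (int i = 0; i < b.length; i++) {
    if (b[i] != transformed[i]) {
        System.out.println("Wrong : " + i);
    }
}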
For cp1251 this outputs only one wrong byte, at position 25.
For KOI8-R, all fine.
For cp1252, 4 or 5 differences.
What is the reason for this and how can this be overcome?
I know it is wrong to represent byte arrays as strings in whatever encoding, but it is a requirement of the protocol of a payment provider, so I don't have a choice.
Update: representing it in ISO-8859-1 works, and I'll use it for the byte[] part, and cp1251 for the textual part, so the question remains only out of curiosity.
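The ISO-8859-1 behaviour mentioned in the update can be checked with a minimal sketch like the one below. It relies on ISO-8859-1 mapping each of the 256 byte values to its own character (U+0000 to U+00FF), so nothing is lost in either direction. The class and method names (Latin1RoundTrip, roundTrip, allByteValues) are made up for illustration and are not from the original post.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class Latin1RoundTrip {
    // Decode the bytes as ISO-8859-1, then encode the resulting String back.
    static byte[] roundTrip(byte[] input) {
        String s = new String(input, StandardCharsets.ISO_8859_1);
        return s.getBytes(StandardCharsets.ISO_8859_1);
    }

    // Build an array that holds every possible byte value exactly once.
    static byte[] allByteValues() {
        byte[] b = new byte[256];
        for (int i = 0; i < 256; i++) {
            b[i] = (byte) i;
        }
        return b;
    }

    public static void main(String[] args) {
        byte[] original = allByteValues();
        // Compare the round-tripped bytes with the input.
        System.out.println(Arrays.equals(original, roundTrip(original)));
    }
}
```

Using the StandardCharsets constant instead of a charset name also avoids the checked UnsupportedEncodingException thrown by the String-name overloads.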
Recommended answer
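Some of the byte values are not mapped to any character in the chosen charset, so decoding replaces them with a replacement character (U+FFFD, usually displayed as ?). When you convert back, that character cannot be encoded in the charset either, so it is normally written as the byte value 63 (an ASCII ?), which isn't what it was before.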
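The replacement described above can be observed directly with a small sketch. It assumes the JDK in use provides the windows-1251 charset (cp1251 is an alias for it) and that byte 0x98, the value stored at index 25 of the question's array, is unmapped there, as the question's output suggests. The class name ReplacementDemo and the helper decodesCleanly are hypothetical. The helper uses a CharsetDecoder configured with CodingErrorAction.REPORT, which raises an exception instead of silently substituting a character.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;

public class ReplacementDemo {
    // Returns true only if every byte decodes to a real character in the charset.
    static boolean decodesCleanly(byte[] input, Charset cs) {
        try {
            cs.newDecoder()
              .onMalformedInput(CodingErrorAction.REPORT)
              .onUnmappableCharacter(CodingErrorAction.REPORT)
              .decode(ByteBuffer.wrap(input));
            return true;
        } catch (CharacterCodingException e) {
            // The decoder hit a byte with no character assigned to it.
            return false;
        }
    }

    public static void main(String[] args) {
        Charset cp1251 = Charset.forName("windows-1251");
        byte[] original = { (byte) 0x98 }; // the byte at index 25 in the question

        // The String constructor substitutes a replacement character silently.
        String decoded = new String(original, cp1251);
        byte[] encoded = decoded.getBytes(cp1251);

        System.out.printf("decoded char: U+%04X%n", (int) decoded.charAt(0));
        System.out.println("encoded byte: " + encoded[0]);
        System.out.println("decodes cleanly: " + decodesCleanly(original, cp1251));
    }
}
```

If the assumptions hold, the decoded character should be U+FFFD and the re-encoded byte 63, matching the explanation in the answer. A strict decoder like this is one way to detect lossy input before it is turned into a String.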