Byte array to String and back: issues with -127

Question

Given the following:

 scala> (new String(Array[Byte](1, 2, 3, -1, -2, -127))).getBytes
 res12: Array[Byte] = Array(1, 2, 3, -1, -2, 63)

Why is -127 converted to 63, and how do I get it back as -127?

A Java version follows, to show that this is not just a Scala problem:

c:\tmp>type Main.java
public class Main {
    public static void main(String [] args) {
        byte [] b = {1, 2, 3, -1, -2, -127};
        byte [] c = new String(b).getBytes();
        for (int i = 0; i < 6; i++){
            System.out.println("b:"+b[i]+"; c:"+c[i]);
        }
    }
}
c:\tmp>javac Main.java
c:\tmp>java Main
b:1; c:1
b:2; c:2
b:3; c:3
b:-1; c:-1
b:-2; c:-2
b:-127; c:63


Recommended answer

The constructor you're calling, String(byte[] bytes), hides the fact that every bytes-to-string conversion applies a decoding: it behaves like String(byte[] bytes, Charset charset) with the platform's default charset. Any byte that the charset cannot map to a character is replaced during decoding, and the replacement comes back as '?' (63) when you encode again. What you want is to use no decoding at all.
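To make the hidden decoding step visible, here is a minimal sketch that names the charset explicitly. The question does not say which default charset was in effect, so US-ASCII and ISO-8859-1 are used here purely as illustrations, and the class and method names are hypothetical. The question's output, where only -127 turned into 63, would be consistent with a default charset that has no character assigned to byte 0x81, but that is an assumption.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class; not part of the original question or answer.
final class ExplicitCharsetDemo {

    // Decode the bytes with the given charset, then encode the string back.
    static byte[] roundTrip(byte[] bytes, Charset charset) {
        return new String(bytes, charset).getBytes(charset);
    }

    public static void main(String[] args) {
        byte[] b = {1, 2, 3, -1, -2, -127};

        // US-ASCII maps only 0..127, so every negative byte is replaced
        // while decoding and comes back as '?' (63) after encoding.
        System.out.println(Arrays.toString(roundTrip(b, StandardCharsets.US_ASCII)));

        // ISO-8859-1 assigns a character to every byte value, so this
        // particular round trip happens to preserve all six bytes.
        System.out.println(Arrays.toString(roundTrip(b, StandardCharsets.ISO_8859_1)));
    }
}
```

The point of the sketch is only that the result depends on which charset does the decoding; it is not a substitute for the character-level approach described next.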

Fortunately, there's a constructor for that: String(char[] value).

Now you have the data in a string, but you want it back exactly as it was. Guess what: getBytes() is likewise shorthand for getBytes(Charset charset), so an encoding is applied automatically on the way out as well. Fortunately, there is a toCharArray() method.

If you must start with bytes and end with bytes, you then have to map between the char array and bytes yourself:

(new String(Array[Byte](1,2,3,-1,-2,-127).map(_.toChar))).toCharArray.map(_.toByte)
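For readers following the Java version of the question, the same character-level round trip might be sketched as below. The class and method names are hypothetical. Like the Scala one-liner, the cast from byte to char sign-extends, so -127 is stored as the char 0xFF81 and the narrowing cast back to byte recovers -127.

```java
import java.util.Arrays;

// Hypothetical helper class illustrating the char-level mapping.
final class CharLevelRoundTrip {

    // Store each byte in one char; no charset is involved.
    static String bytesToString(byte[] bytes) {
        char[] chars = new char[bytes.length];
        for (int i = 0; i < bytes.length; i++) {
            chars[i] = (char) bytes[i]; // sign-extends, e.g. -127 becomes 0xFF81
        }
        return new String(chars);
    }

    // Read the chars back and keep only the low 8 bits of each.
    static byte[] stringToBytes(String s) {
        char[] chars = s.toCharArray();
        byte[] bytes = new byte[chars.length];
        for (int i = 0; i < chars.length; i++) {
            bytes[i] = (byte) chars[i];
        }
        return bytes;
    }

    public static void main(String[] args) {
        byte[] b = {1, 2, 3, -1, -2, -127};
        byte[] c = stringToBytes(bytesToString(b));
        System.out.println(Arrays.toString(c)); // prints [1, 2, 3, -1, -2, -127]
    }
}
```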

So, to summarize: converting between String and Array[Byte] involves encoding and decoding. If you want to put binary data in a string, you have to do it at the level of characters. Note, however, that this gives you a garbage string: its characters carry no textual meaning, and arbitrary char values are not guaranteed to form the well-formed UTF-16 that a String is expected to hold. You should therefore only ever read it back out as characters and convert those back to bytes.

You could shift the bytes up by, say, adding 512; then every value lands in the range 384 to 639, and you get a string of valid single-Char code points. But this uses 16 bits to represent every 8, a 50% encoding efficiency. Base64 is a better option for serializing binary data (8 bits to represent every 6, 75% efficient).
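As a rough illustration of the Base64 option, the sketch below uses the JDK's java.util.Base64 class (available since Java 8). The wrapper class and method names are hypothetical; only the Base64 encoder and decoder calls come from the standard library.

```java
import java.util.Arrays;
import java.util.Base64;

// Hypothetical wrapper around the JDK's Base64 encoder and decoder.
final class Base64RoundTrip {

    // Encode arbitrary bytes as a printable ASCII string.
    static String encode(byte[] bytes) {
        return Base64.getEncoder().encodeToString(bytes);
    }

    // Recover the original bytes from the Base64 text.
    static byte[] decode(String text) {
        return Base64.getDecoder().decode(text);
    }

    public static void main(String[] args) {
        byte[] b = {1, 2, 3, -1, -2, -127};
        String text = encode(b);
        System.out.println(text);                           // prints AQID//6B
        System.out.println(Arrays.toString(decode(text)));  // prints [1, 2, 3, -1, -2, -127]
    }
}
```

The six input bytes become eight characters, which matches the 75% efficiency figure above, and the resulting string is ordinary text that survives any later encoding step.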
