Encoding and decoding a byte array to a string without data loss


Problem description

I tried to convert a byte[] to a String as follows:

Map<String, String> biomap = new HashMap<String, String>();
biomap.put("L1", new String(Lf1, "ISO-8859-1"));

where Lf1 is a byte[] array. The problem is that when I convert the byte array to a string, it looks like this:

FMR  F P�d@� �0d@r (@� ......... etc

I then convert the string back to a byte[]:

String SF1 = biomap.get("L1");
byte[] storedL1 = SF1.getBytes("ISO-8859-1");

When I compare the two arrays, the comparison returns false, i.e. the data has changed.

I want to get back the same byte[] data that I had before encoding it to a string and decoding it back to a byte[].

Solution

First: ISO-8859-1 does not cause any data loss if an arbitrary byte array is converted to string using this encoding. Consider the following program:

public class BytesToString {
    public static void main(String[] args) throws Exception {
        // array that will contain all the possible byte values
        byte[] bytes = new byte[256];
        for (int i = 0; i < 256; i++) {
            bytes[i] = (byte) (i + Byte.MIN_VALUE);
        }

        // converting to string and back to bytes
        String str = new String(bytes, "ISO-8859-1");
        byte[] newBytes = str.getBytes("ISO-8859-1");

        if (newBytes.length != 256) {
            throw new IllegalStateException("Wrong length");
        }
        boolean mismatchFound = false;
        for (int i = 0; i < 256; i++) {
            if (newBytes[i] != bytes[i]) {
                System.out.println("Mismatch: " + bytes[i] + "->" + newBytes[i]);
                mismatchFound = true;
            }
        }
        System.out.println("Whether a mismatch was found: " + mismatchFound);
    }
}

It builds an array of bytes with all possible byte values, then it converts it to String using ISO-8859-1 and then back to bytes using the same encoding.

This program prints "Whether a mismatch was found: false", so the bytes->String->bytes conversion via ISO-8859-1 yields exactly the data it started with.
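The same round trip can be written more compactly with StandardCharsets.ISO_8859_1 and Arrays.equals. This is only a minimal sketch: the class name Latin1RoundTrip and the helper roundTrip are made up for illustration. It also shows one thing worth checking when a comparison unexpectedly returns false: on arrays, equals() and == compare references, not contents. Whether that is what happened in the question is unknown, since the comparison code is not shown.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class Latin1RoundTrip {
    // Hypothetical helper: bytes -> String -> bytes via ISO-8859-1.
    static byte[] roundTrip(byte[] input) {
        String asString = new String(input, StandardCharsets.ISO_8859_1);
        return asString.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        byte[] original = new byte[256];
        for (int i = 0; i < 256; i++) {
            original[i] = (byte) i; // covers every possible byte value
        }
        byte[] restored = roundTrip(original);

        // Content comparison: prints true
        System.out.println(Arrays.equals(original, restored));
        // Reference comparison: prints false, the arrays are distinct objects
        System.out.println(original.equals(restored));
    }
}
```

Using the Charset constant instead of the "ISO-8859-1" string also avoids the checked UnsupportedEncodingException.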

However, as was pointed out in the comments, String is not a good container for binary data. Such a string will almost surely contain unprintable characters, so if you print it or pass it through HTML or some other channel, you may run into problems (data loss, for example).
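One way such data loss can arise is when the bytes pass through a different charset somewhere along the way. The sketch below is an illustration under that assumption, not the asker's code; the class name LossyDecoding and the helper viaUtf8 are hypothetical. It decodes and re-encodes with UTF-8, where byte sequences that are not valid UTF-8 are replaced during decoding and cannot be recovered.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class LossyDecoding {
    // Hypothetical helper: bytes -> String -> bytes via UTF-8.
    static byte[] viaUtf8(byte[] input) {
        String asString = new String(input, StandardCharsets.UTF_8);
        return asString.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // 0x80 and 0xFF are not valid UTF-8 on their own,
        // so the decoder substitutes a replacement character for them.
        byte[] original = {0x46, 0x4D, 0x52, (byte) 0x80, (byte) 0xFF};
        byte[] restored = viaUtf8(original);

        // Prints false: the original bytes cannot be recovered
        System.out.println(Arrays.equals(original, restored));
    }
}
```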

If you really need to convert a byte array to a string (and treat that string as opaque), use Base64 encoding:

String stringRepresentation = Base64.getEncoder().encodeToString(bytes);
byte[] decodedBytes = Base64.getDecoder().decode(stringRepresentation);

Base64 takes more space (4 characters for every 3 bytes), but the resulting string is safe to print.
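Applied to the map from the question, the Base64 approach might look like the sketch below. The class name Base64RoundTrip, the helpers encode and decode, and the sample bytes standing in for Lf1 are all assumptions made for illustration. java.util.Base64 requires Java 8 or later.

```java
import java.util.Arrays;
import java.util.Base64;
import java.util.HashMap;
import java.util.Map;

public class Base64RoundTrip {
    // Hypothetical helpers wrapping the standard Base64 encoder and decoder.
    static String encode(byte[] data) {
        return Base64.getEncoder().encodeToString(data);
    }

    static byte[] decode(String text) {
        return Base64.getDecoder().decode(text);
    }

    public static void main(String[] args) {
        // Stand-in for the real Lf1 data, including unprintable byte values
        byte[] lf1 = {0x46, 0x4D, 0x52, 0x00, (byte) 0x80, (byte) 0xFF};

        Map<String, String> biomap = new HashMap<>();
        biomap.put("L1", encode(lf1));

        // The stored value contains only printable ASCII characters
        System.out.println(biomap.get("L1"));

        byte[] storedL1 = decode(biomap.get("L1"));
        // Prints true: the decoded bytes match the original
        System.out.println(Arrays.equals(lf1, storedL1));
    }
}
```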
