Problems converting byte array to string and back to byte array

Question

There are a lot of questions on this topic, all with the same solution, but it doesn't work for me. I have a simple encryption test. The encryption/decryption itself works (as long as I handle this test with the byte array itself and not as Strings). The problem is that I don't want to handle it as a byte array but as a String, and when I encode the byte array to a String and back, the resulting byte array differs from the original byte array, so the decryption no longer works. I tried the following parameters in the corresponding String methods: UTF-8, UTF8, UTF-16, UTF8. None of them work. The resulting byte array differs from the original. Any ideas why this is so?

The encrypter:

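import java.security.Key;
import java.security.NoSuchAlgorithmException;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.NoSuchPaddingException;

public class NewEncrypter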
{
    private String algorithm = "DESede";
    private Key key = null;
    private Cipher cipher = null;

    public NewEncrypter() throws NoSuchAlgorithmException, NoSuchPaddingException
    {
         key = KeyGenerator.getInstance(algorithm).generateKey();
         cipher = Cipher.getInstance(algorithm);
    }

    public byte[] encrypt(String input) throws Exception
    {
        cipher.init(Cipher.ENCRYPT_MODE, key);
        byte[] inputBytes = input.getBytes("UTF-16");

        return cipher.doFinal(inputBytes);
    }

    public String decrypt(byte[] encryptionBytes) throws Exception
    {
        cipher.init(Cipher.DECRYPT_MODE, key);
        byte[] recoveredBytes = cipher.doFinal(encryptionBytes);
        String recovered = new String(recoveredBytes, "UTF-16");

        return recovered;
    }
}

This is the test I tried:

public class NewEncrypterTest
{
    @Test
    public void canEncryptAndDecrypt() throws Exception
    {
        String toEncrypt = "FOOBAR";

        NewEncrypter encrypter = new NewEncrypter();

        byte[] encryptedByteArray = encrypter.encrypt(toEncrypt);
        System.out.println("encryptedByteArray:" + encryptedByteArray);

        String decoded = new String(encryptedByteArray, "UTF-16");
        System.out.println("decoded:" + decoded);

        byte[] encoded = decoded.getBytes("UTF-16");
        System.out.println("encoded:" + encoded);

        String decryptedText = encrypter.decrypt(encoded); //Exception here
        System.out.println("decryptedText:" + decryptedText);

        assertEquals(toEncrypt, decryptedText);
    }
}

Recommended answer

It is not a good idea to store encrypted data in Strings because they are for human-readable text, not for arbitrary binary data. For binary data it's best to use byte[].

However, if you must do it you should use an encoding that has a 1-to-1 mapping between bytes and characters, that is, where every byte sequence can be mapped to a unique sequence of characters, and back. One such encoding is ISO-8859-1, that is:

    String decoded = new String(encryptedByteArray, "ISO-8859-1");
    System.out.println("decoded:" + decoded);

    byte[] encoded = decoded.getBytes("ISO-8859-1"); 
    System.out.println("encoded:" + java.util.Arrays.toString(encoded));

    String decryptedText = encrypter.decrypt(encoded);
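
To see why ISO-8859-1 is safe for this purpose, the following minimal sketch round-trips all 256 possible byte values through a String and compares the result with the input. The class name `CharsetRoundTrip` and its helper methods are hypothetical names introduced here for illustration; they are not part of the original question or answer.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class CharsetRoundTrip {
    // Decode the bytes into a String with the given charset, then encode them back.
    static byte[] roundTrip(byte[] data, Charset charset) {
        return new String(data, charset).getBytes(charset);
    }

    // Every possible byte value once, standing in for arbitrary ciphertext.
    static byte[] allByteValues() {
        byte[] data = new byte[256];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) i;
        }
        return data;
    }

    public static void main(String[] args) {
        byte[] data = allByteValues();

        // ISO-8859-1 maps each byte to exactly one char (U+0000..U+00FF), so nothing is lost.
        boolean latin1Ok = Arrays.equals(data, roundTrip(data, StandardCharsets.ISO_8859_1));

        // UTF-8 treats many byte sequences as malformed and substitutes U+FFFD when decoding.
        boolean utf8Ok = Arrays.equals(data, roundTrip(data, StandardCharsets.UTF_8));

        System.out.println("ISO-8859-1 lossless: " + latin1Ok);
        System.out.println("UTF-8 lossless: " + utf8Ok);
    }
}
```

Using the `StandardCharsets` constants instead of charset names as strings also avoids the checked `UnsupportedEncodingException`.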

Other common encodings that don't lose data are hexadecimal and Base64. Since Java 8 the standard API includes java.util.Base64 (and java.util.HexFormat since Java 17); on older versions you need a helper library for them.
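
As a rough sketch of the Base64 and hexadecimal options, assuming Java 8 or later (where `java.util.Base64` ships with the standard library): the class `BinaryToText` and its method names are hypothetical and only serve the illustration. The hexadecimal conversion is written by hand so that it does not depend on a newer Java version.

```java
import java.util.Arrays;
import java.util.Base64;

class BinaryToText {
    static String toBase64(byte[] data) {
        return Base64.getEncoder().encodeToString(data);
    }

    static byte[] fromBase64(String text) {
        return Base64.getDecoder().decode(text);
    }

    // Two lowercase hex digits per byte; the mask avoids sign extension of negative bytes.
    static String toHex(byte[] data) {
        StringBuilder sb = new StringBuilder(data.length * 2);
        for (byte b : data) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    // Assumes an even-length string that contains only hex digits.
    static byte[] fromHex(String hex) {
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    public static void main(String[] args) {
        // Sample bytes standing in for ciphertext, including negative values.
        byte[] data = {0, -1, 127, -128, 65};

        String b64 = toBase64(data);
        String hex = toHex(data);

        System.out.println("base64: " + b64);
        System.out.println("hex:    " + hex);
        System.out.println("base64 round trip ok: " + Arrays.equals(data, fromBase64(b64)));
        System.out.println("hex round trip ok:    " + Arrays.equals(data, fromHex(hex)));
    }
}
```

In the test from the question, `toBase64(encryptedByteArray)` would take the place of `new String(encryptedByteArray, "UTF-16")`, and `fromBase64(decoded)` the place of `decoded.getBytes("UTF-16")`.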

With UTF-16 the program would fail for two reasons:

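  1. String.getBytes("UTF-16") adds a byte-order-mark character to the output to identify the order of the bytes. You should use UTF-16LE or UTF-16BE for this not to happen.
  2. Not all sequences of bytes can be mapped to characters in UTF-16. First, text encoded in UTF-16 must have an even number of bytes. Second, UTF-16 has a mechanism for encoding Unicode characters beyond U+FFFF. This means that, for example, there are sequences of 4 bytes that map to only one Unicode character. For this to be possible, the first 2 bytes of the 4 don't encode any character in UTF-16 on their own.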
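
Both UTF-16 problems can be reproduced in a few lines. This is a small sketch with hypothetical names (`Utf16Pitfalls`, `roundTrip`); the byte values are hand-picked examples rather than real ciphertext.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class Utf16Pitfalls {
    // Decode the bytes into a String with the given charset, then encode them back.
    static byte[] roundTrip(byte[] data, Charset charset) {
        return new String(data, charset).getBytes(charset);
    }

    public static void main(String[] args) {
        // Reason 1: encoding with "UTF-16" prepends a byte-order mark (0xFE 0xFF).
        byte[] withBom = "A".getBytes(StandardCharsets.UTF_16);
        byte[] withoutBom = "A".getBytes(StandardCharsets.UTF_16BE);
        System.out.println("UTF-16:   " + Arrays.toString(withBom));
        System.out.println("UTF-16BE: " + Arrays.toString(withoutBom));

        // Two valid bytes gain a BOM on the way back, so the round trip changes the array.
        byte[] plain = {0x00, 0x41};
        System.out.println("UTF-16 round trip equal: "
                + Arrays.equals(plain, roundTrip(plain, StandardCharsets.UTF_16)));

        // Reason 2: 0xD800 is a high surrogate and is not a character on its own.
        // The decoder substitutes a replacement character, so the original bytes are lost
        // even with UTF-16BE, which adds no BOM.
        byte[] loneSurrogate = {(byte) 0xD8, 0x00, 0x00, 0x41};
        System.out.println("lone surrogate round trip equal: "
                + Arrays.equals(loneSurrogate, roundTrip(loneSurrogate, StandardCharsets.UTF_16BE)));
    }
}
```

Because ciphertext is effectively random, unpaired surrogates and odd lengths are likely to occur in practice, so switching to UTF-16LE or UTF-16BE removes the BOM problem but not the data loss.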
