Why does Base64.decode produce the same byte array for different strings?

Question

I'm using URL-safe Base64 encoding to encode my randomly generated byte arrays, but I have a problem with decoding. When I decode two different strings that are identical except for the last character, I get the same byte array. For example, both "dGVzdCBzdHJpbmr" and "dGVzdCBzdHJpbmq" decode to the same result:

Array(116, 101, 115, 116, 32, 115, 116, 114, 105, 110, 106)

For encoding and decoding I use java.util.Base64 like this:

// encoding...
Base64.getUrlEncoder().withoutPadding().encodeToString(myString.getBytes())
// decoding...
Base64.getUrlDecoder().decode(base64String)

What is the reason for this collision? Can it also happen with characters other than the last one? And how can I fix it so that decoding returns a different byte array for each different string?

Recommended answer

The issue you are seeing is caused by the fact that the number of bytes in your result (11 bytes) doesn't completely fill the last character of the Base64-encoded string.

Remember that Base64 encodes 8-bit bytes as 6-bit characters. Encoding 11 bytes therefore needs 11 * 8 / 6 = 14 2/3 characters. Since you can't write a partial character, the encoder emits 15 characters, and only the first 4 bits (2/3) of the last character are significant. The last two bits of that character are not decoded. Thus all of:

dGVzdCBzdHJpbmo
dGVzdCBzdHJpbmp
dGVzdCBzdHJpbmq
dGVzdCBzdHJpbmr

all decode to the same 11 bytes (116, 101, 115, 116, 32, 115, 116, 114, 105, 110, 106).
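As a quick check of the arithmetic above, here is a minimal sketch that decodes the four strings with the JDK's URL-safe decoder and prints the bit count. The class name Base64TrailingBits and the helper decode are made up for this illustration; only java.util.Base64 and java.util.Arrays come from the JDK.

```java
import java.util.Arrays;
import java.util.Base64;

public class Base64TrailingBits {
    // Hypothetical helper: decodes an unpadded URL-safe Base64 string.
    static byte[] decode(String s) {
        return Base64.getUrlDecoder().decode(s);
    }

    public static void main(String[] args) {
        // The four strings differ only in the last character (o, p, q, r),
        // whose 6-bit values are 101000, 101001, 101010 and 101011.
        String[] inputs = {
            "dGVzdCBzdHJpbmo", "dGVzdCBzdHJpbmp",
            "dGVzdCBzdHJpbmq", "dGVzdCBzdHJpbmr"
        };
        for (String s : inputs) {
            System.out.println(s + " -> " + Arrays.toString(decode(s)));
        }

        // 11 bytes = 88 bits: 14 full 6-bit characters plus 4 leftover bits,
        // so the 15th character carries only 4 significant bits.
        int bits = 11 * 8;
        System.out.println(bits / 6 + " full chars, " + bits % 6 + " leftover bits");
    }
}
```

The last characters share their top 4 bits (1010) and differ only in the 2 low bits, which is why the decoder treats them alike.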

PS: Without padding, some decoders will try to decode the "last" byte as well, and you'll get a 12-byte result (with a different last byte). This is the reason for my comment asking whether the withoutPadding() option is a good idea. But your decoder seems to handle this.
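The answer does not prescribe a fix for the question's last point. One possible approach, offered here only as a sketch and not as the answerer's method, is to accept a string only when re-encoding its decoded bytes reproduces it exactly. The class name CanonicalBase64 and the method isCanonical are hypothetical; the encoder settings mirror the ones in the question.

```java
import java.util.Base64;

public class CanonicalBase64 {
    // Hypothetical helper: true only if s is exactly the string the
    // unpadded URL-safe encoder would produce for the bytes it decodes to.
    // Note: decode() throws IllegalArgumentException for malformed input.
    static boolean isCanonical(String s) {
        byte[] decoded = Base64.getUrlDecoder().decode(s);
        String reencoded = Base64.getUrlEncoder().withoutPadding().encodeToString(decoded);
        return reencoded.equals(s);
    }

    public static void main(String[] args) {
        // Ends in 'o' (101000): the unused low bits are zero.
        System.out.println(isCanonical("dGVzdCBzdHJpbmo"));
        // Ends in 'r' (101011): the unused low bits are non-zero.
        System.out.println(isCanonical("dGVzdCBzdHJpbmr"));
    }
}
```

With such a check, each accepted string maps to a distinct byte array, because only one spelling per byte array passes.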
