Why does base64 encoding require padding if the input length is not divisible by 3?


Question

What is the purpose of padding in base64 encoding? The following is an extract from Wikipedia:

"An additional pad character is allocated which may be used to force the encoded output into an integer multiple of 4 characters (or equivalently when the unencoded binary text is not a multiple of 3 bytes); these padding characters must then be discarded when decoding but still allow the calculation of the effective length of the unencoded text, when its input binary length would not be a multiple of 3 bytes (the last non-pad character is normally encoded so that the last 6-bit block it represents will be zero-padded on its least significant bits, at most two pad characters may occur at the end of the encoded stream)."

I wrote a program that can base64-encode any string and decode any base64-encoded string. What problem does padding solve?

Recommended answer

Your conclusion that padding is unnecessary is right. For a single encoded sequence, it's always possible to determine the length of the input unambiguously from the length of the encoded sequence.
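To make that concrete, here is a minimal Python sketch that restores the padding from the length alone before handing the text to the standard library decoder. The function name `decode_unpadded` is made up for this illustration; only `base64.b64decode` is a real library call.

```python
import base64


def decode_unpadded(encoded: str) -> bytes:
    """Decode a single base64 sequence whose trailing '=' characters were stripped.

    Hypothetical helper for illustration only.
    """
    remainder = len(encoded) % 4
    if remainder == 1:
        # A lone leftover character carries only 6 bits, which is less than
        # one byte, so no valid encoding has this length.
        raise ValueError("invalid length for unpadded base64")
    # remainder 2 -> one trailing byte  -> add "=="
    # remainder 3 -> two trailing bytes -> add "="
    # remainder 0 -> input was a multiple of 3 bytes -> add nothing
    padding = "=" * (-len(encoded) % 4)
    return base64.b64decode(encoded + padding)


print(decode_unpadded("SQ"))    # b'I'
print(decode_unpadded("QU0"))   # b'AM'
print(decode_unpadded("VEpN"))  # b'TJM'
```

This only works because the decoder knows where the sequence ends, which is exactly the information that concatenation destroys.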

However, padding is useful in situations where base64 encoded strings are concatenated in such a way that the lengths of the individual sequences are lost, as might happen, for example, in a very simple network protocol.

If unpadded strings are concatenated, it's impossible to recover the original data because information about the number of odd bytes at the end of each individual sequence is lost. However, if padded sequences are used, there's no ambiguity, and the sequence as a whole can be decoded correctly.

Suppose we have a program that base64-encodes words, concatenates them and sends them over a network. It encodes "I", "AM" and "TJM", sandwiches the results together without padding and transmits them.


  • I encodes to SQ (SQ== with padding)
  • AM encodes to QU0 (QU0= with padding)
  • TJM encodes to VEpN (also VEpN with padding, since three bytes fill the four characters exactly)

So the transmitted data is SQQU0VEpN. The receiver base64-decodes this as I\x04\x14\xd1Q) instead of the intended IAMTJM (a lenient decoder gets those six bytes from the first eight characters and is left with a dangling N, since nine characters is not a valid base64 length). The result is nonsense because the sender has destroyed the information about where each word ends in the encoded sequence. If the sender had sent SQ==QU0=VEpN instead, the receiver could have decoded this as three separate base64 sequences, which would concatenate to give IAMTJM.
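The example can be reproduced with a short Python sketch. The helper `decode_quanta` is a hypothetical name introduced here; it assumes every sequence in the stream was padded, so that each group of four characters can be decoded on its own.

```python
import base64

words = [b"I", b"AM", b"TJM"]

# Sender side: encode each word, then concatenate with and without padding.
encoded = [base64.b64encode(w).decode("ascii") for w in words]
padded_stream = "".join(encoded)                            # "SQ==QU0=VEpN"
unpadded_stream = "".join(e.rstrip("=") for e in encoded)   # "SQQU0VEpN"


def decode_quanta(stream: str) -> bytes:
    """Decode concatenated *padded* base64 sequences, four characters at a time.

    Hypothetical helper for illustration only.
    """
    if len(stream) % 4 != 0:
        raise ValueError("padded base64 must be a multiple of 4 characters")
    return b"".join(
        base64.b64decode(stream[i:i + 4]) for i in range(0, len(stream), 4)
    )


print(decode_quanta(padded_stream))  # b'IAMTJM'

# Without padding the word boundaries are gone. The first eight characters
# decode to garbage and the ninth ("N") cannot be decoded at all.
print(base64.b64decode(unpadded_stream[:8]))  # b'I\x04\x14\xd1Q)'
```

Decoding group by group is a deliberate choice in this sketch: it avoids relying on how any particular library treats `=` characters in the middle of its input.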

Why not just design the protocol to prefix each word with an integer length? Then the receiver could decode the stream correctly and there would be no need for padding.

That's a great idea, as long as we know the length of the data we're encoding before we start encoding it. But what if, instead of words, we were encoding chunks of video from a live camera? We might not know the length of each chunk in advance.

If the protocol used padding, there would be no need to transmit a length at all. The data could be encoded as it came in from the camera, each chunk terminated with padding, and the receiver would be able to decode the stream correctly.
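A receiver for such a stream might look like the following Python sketch. It assumes the network delivers the text in arbitrary fragments that need not line up with chunk boundaries; `decode_stream` is a hypothetical name, not part of any library.

```python
import base64
from typing import Iterable, Iterator


def decode_stream(fragments: Iterable[str]) -> Iterator[bytes]:
    """Incrementally decode concatenated, padded base64 sequences.

    Hypothetical helper for illustration only. Fragments may be split
    anywhere, so characters are buffered until a full group of four arrives.
    """
    buffer = ""
    for fragment in fragments:
        buffer += fragment
        usable = len(buffer) - len(buffer) % 4
        ready, buffer = buffer[:usable], buffer[usable:]
        for i in range(0, len(ready), 4):
            # Each padded four-character group is decodable on its own.
            yield base64.b64decode(ready[i:i + 4])
    if buffer:
        raise ValueError("stream ended in the middle of a 4-character group")


# The padded stream from the example, cut at awkward places.
received = ["SQ=", "=QU0=VE", "pN"]
print(b"".join(decode_stream(received)))  # b'IAMTJM'
```

Note that this recovers the original bytes without any length prefix, but not necessarily where each chunk ended: a chunk whose length is a multiple of 3 carries no padding and so leaves no marker.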

Obviously that's a very contrived example, but perhaps it illustrates why padding might conceivably be helpful in some situations.
