Relation between input and ciphertext length in AES


Question

Having recently started using cryptography in my application, I find myself puzzled by the relationship between the input text length and the ciphertext it results in. Before applying crypto, it was easy to determine the database column size. Now, however, the column size varies slightly.

Two questions:

  1. Am I correct in assuming this is due to the padding of my input, so that it fits the cipher's requirements?
  2. Is there a way to accurately predict the maximum length of the ciphertext based on the maximum length of the input?

And for bonus points: should I be storing the ciphertext base64-encoded in a varchar, or keeping it as raw bytes and storing them in a varbinary? Are there risks involved with storing the bytes in my database (I'm using parameterized queries, so in theory accidental breaking of the escaping should not be an issue)?

TIA!

Supplemental: The cipher I'm using is AES/Rijndael-256 - does this relation vary between the algorithms available?

Answer

The relation depends on the padding and the chaining modes you are using, and the algorithm block size (if it is a block cipher).

Some encryption algorithms are stream ciphers which encrypt data "bit by bit" (or "byte by byte"). Most of them produce a key-dependent stream of pseudo-random bytes, and encryption is performed by XORing that stream with the data (decryption is identical). With a stream cipher, the encrypted length is equal to the plain data length.
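To make the length claim concrete, here is a minimal sketch of the XOR step. The keystream comes from `os.urandom` purely as a stand-in (a real stream cipher would derive it from the key), and `xor_bytes` is a hypothetical helper name, not part of any library.

```python
import os

def xor_bytes(data: bytes, keystream: bytes) -> bytes:
    """XOR each data byte with the keystream byte at the same position."""
    if len(keystream) < len(data):
        raise ValueError("keystream must be at least as long as the data")
    return bytes(d ^ k for d, k in zip(data, keystream))

plaintext = b"attack at dawn"
# Stand-in keystream (assumption): a real stream cipher derives this from the key.
keystream = os.urandom(len(plaintext))

ciphertext = xor_bytes(plaintext, keystream)
recovered = xor_bytes(ciphertext, keystream)  # decryption is the same operation

print(len(plaintext), len(ciphertext))  # the two lengths are equal
print(recovered == plaintext)
```

Since XOR works byte for byte, nothing is ever added to the output, which is why no padding is involved.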

Other encryption algorithms are block ciphers. A block cipher, nominally, encrypts a single block of data of a fixed length. AES is a block cipher with 128-bit blocks (16 bytes). Note that AES-256 also uses 128-bit blocks; the "256" is about the key length, not the block length. The chaining mode is about how the data is to be split into several such blocks (this is not easy to do securely, but CBC mode is fine). Depending on the chaining mode, the data may require some padding, i.e. a few extra bytes added at the end so that the length is appropriate for the chaining mode. The padding must be such that it can be unambiguously removed when decrypting.

With CBC mode, the input data must have a length that is a multiple of the block length, so it is customary to add PKCS#5 padding: if the block length is n, then at least 1 byte is added, at most n, such that the total size is a multiple of n, and the added bytes all have numerical value k, where k is the number of added bytes. Upon decryption, it suffices to look at the last decrypted byte to recover k and thus know how many padding bytes must ultimately be removed.
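The padding rule just described can be sketched in a few lines. `pkcs_pad` and `pkcs_unpad` are hypothetical helper names chosen for illustration; in practice a crypto library normally applies the padding for you.

```python
BLOCK_SIZE = 16  # AES block length in bytes

def pkcs_pad(data: bytes, block_size: int = BLOCK_SIZE) -> bytes:
    """Append k bytes of value k, with 1 <= k <= block_size."""
    k = block_size - (len(data) % block_size)
    return data + bytes([k]) * k

def pkcs_unpad(padded: bytes, block_size: int = BLOCK_SIZE) -> bytes:
    """Read k from the last byte and strip k bytes after checking them."""
    if not padded or len(padded) % block_size != 0:
        raise ValueError("padded length must be a positive multiple of the block size")
    k = padded[-1]
    if k < 1 or k > block_size or padded[-k:] != bytes([k]) * k:
        raise ValueError("invalid padding")
    return padded[:-k]

for d in (0, 5, 15, 16, 17):
    padded = pkcs_pad(b"x" * d)
    print(d, len(padded), padded[-1])  # input length, padded length, value of k
```

Note the case d = 16: an input that is already block-aligned still gets a full extra block of padding, otherwise the last data byte could be mistaken for k.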

Hence, with CBC mode and AES, assuming PKCS#5 padding, if the input data has length d then the encrypted length is (d + 16) & ~15. I am using C-like notation here; in plain words, the length is between d+1 and d+16, and is a multiple of 16.
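As a quick check of that formula, the sketch below computes the length by integer division and compares it with the bit-mask form. `cbc_ciphertext_length` is a hypothetical name, and the result excludes any IV stored with the message.

```python
def cbc_ciphertext_length(d: int, block_size: int = 16) -> int:
    """Length after PKCS#5 padding: the next multiple of block_size above d."""
    return (d // block_size + 1) * block_size

for d in (0, 1, 15, 16, 17, 255):
    length = cbc_ciphertext_length(d)
    # For 16-byte blocks this agrees with the C-like expression from the text.
    assert length == (d + 16) & ~15
    print(d, length)
```

The function never decreases as d grows, so the longest ciphertext comes from the longest allowed input, which is what a column-size calculation needs.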

There is a mode called CTR (as "counter") in which the block cipher encrypts successive values of a counter, yielding a stream of pseudo-random bytes. This effectively turns the block cipher into a stream cipher, and thus a message of length d is encrypted into d bytes.
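The counter construction can be sketched as follows. The standard library has no AES, so `toy_block_encrypt` is a deliberately labeled stand-in built on HMAC-SHA256; it is an assumption made only to show the structure of the mode and must not be taken for real AES-CTR.

```python
import hashlib
import hmac
import os

BLOCK_SIZE = 16  # same block length as AES

def toy_block_encrypt(key: bytes, block: bytes) -> bytes:
    """Stand-in for AES (assumption): a keyed function with 16-byte output."""
    return hmac.new(key, block, hashlib.sha256).digest()[:BLOCK_SIZE]

def ctr_xor(key: bytes, iv: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt by XORing data with the encrypted counter values."""
    if len(iv) != BLOCK_SIZE:
        raise ValueError("IV must be one block long")
    counter = int.from_bytes(iv, "big")
    out = bytearray()
    for offset in range(0, len(data), BLOCK_SIZE):
        stream_block = toy_block_encrypt(key, counter.to_bytes(BLOCK_SIZE, "big"))
        chunk = data[offset:offset + BLOCK_SIZE]
        # zip stops at the shorter input, so a partial last chunk needs no padding
        out.extend(c ^ s for c, s in zip(chunk, stream_block))
        counter = (counter + 1) % (1 << (8 * BLOCK_SIZE))
    return bytes(out)

key = os.urandom(32)
iv = os.urandom(BLOCK_SIZE)
message = b"x" * 37  # not a multiple of 16 on purpose

ciphertext = ctr_xor(key, iv, message)
print(len(message), len(ciphertext))  # the two lengths are equal
print(ctr_xor(key, iv, ciphertext) == message)
```

The unused tail of the final keystream block is simply discarded, which is why the output length matches the input exactly.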

Warning: almost all encryption systems (including stream ciphers) and modes require an extra value called the IV (Initial Value, also known as the initialization vector). Each message shall have its own IV, and no two messages encrypted with the same key shall use the same IV. Some modes have extra requirements; in particular, for both CBC and CTR, the IV shall be selected randomly and uniformly with a cryptographically strong pseudo-random number generator. The IV is not secret, but must be known by the decrypter. Since each message gets its own IV, it is often necessary to encode the IV along with the encrypted message. With CBC or CTR, the IV has length n, so, for AES, that's an extra 16 bytes. I do not know what mcrypt does with the IV, but, cryptographically speaking, the IV must be managed at some point.
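For sizing a column, the IV has to be counted too. The sketch below assumes the IV is stored in front of the ciphertext in the same field; that layout and the function names are illustrative assumptions, not something mcrypt or any other library is known to do.

```python
BLOCK_SIZE = 16  # AES block length, which is also the IV length for CBC and CTR

def stored_length_cbc(max_input: int) -> int:
    """IV plus PKCS#5-padded ciphertext, assuming the IV is stored alongside."""
    return BLOCK_SIZE + (max_input // BLOCK_SIZE + 1) * BLOCK_SIZE

def stored_length_ctr(max_input: int) -> int:
    """IV plus unpadded ciphertext, assuming the IV is stored alongside."""
    return BLOCK_SIZE + max_input

for max_input in (50, 100, 255):
    print(max_input, stored_length_cbc(max_input), stored_length_ctr(max_input))
```

For example, a field that holds at most 100 bytes of plaintext would need 128 bytes under CBC and 116 bytes under CTR with this layout.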

As for Base64, it is good for transferring binary data over text-only media, but this should not be necessary for a proper database. Also, Base64 enlarges data by about 33%, so it should not be applied blindly. I think you are best avoiding Base64 here.
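If Base64 is used anyway, the encoded size is easy to predict: every 3 input bytes become 4 output characters, with the last group padded. A small check with the standard `base64` module (`base64_length` is a hypothetical helper):

```python
import base64
import os

def base64_length(n: int) -> int:
    """Length of standard padded Base64 output for n input bytes."""
    return 4 * ((n + 2) // 3)

for n in (16, 32, 48, 128):
    encoded = base64.b64encode(os.urandom(n))
    assert len(encoded) == base64_length(n)
    print(n, len(encoded))
```

So a 128-byte value would occupy 172 characters in a varchar, against 128 bytes in a varbinary.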
