A more compact representation than BASE64 for byte arrays?

Question

For debugging I often find it useful to visualize byte arrays (for example hashed passwords) as BASE64 strings.

        public override string ToString()
        {
            return Convert.ToBase64String(this.Hash);      
        }

But for large hashes (say more than 32 bytes) BASE64 encoding produces a string that is quite long. This makes it hard to compare them quickly by just looking at them.

BASE64 only uses 64 printable characters. I wonder if there are other encoding techniques that use more than 64 characters (but still only printable characters) to reduce the length needed to represent 32 bytes. It seems to me that we can greatly improve since on my keyboard I already see 94 easily distinguishable printable keys.

Granted, making byte arrays easy for humans to compare is not what BASE64 was originally intended for. But whatever works, right? ;)

Recommended answer

You can use Ascii85. Wikipedia states:


Ascii85, also called Base85, is a form of binary-to-text encoding developed by Paul E. Rutter for the btoa utility. By using five ASCII characters to represent four bytes of binary data (making the encoded size ¹⁄₄ larger than the original, assuming eight bits per ASCII character), it is more efficient than uuencode or Base64, which use four characters to represent three bytes of data (¹⁄₃ increase, assuming eight bits per ASCII character).
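
To make the ratios in the quote concrete for the 32-byte case from the question, here is a small sketch. The class and method names (`EncodedLength`, `Base64Length`, `Ascii85Length`, `LowerBound`) are made up for illustration and are not part of any library.

```csharp
using System;

static class EncodedLength
{
    // Base64 emits 4 characters per 3-byte group and pads the last group with '='.
    public static int Base64Length(int byteCount) => (byteCount + 2) / 3 * 4;

    // Ascii85 emits 5 characters per full 4-byte group; a trailing partial
    // group of n bytes (1..3) takes n + 1 characters. The 'z' shortcut for
    // all-zero groups is ignored, so this is an upper bound.
    public static int Ascii85Length(int byteCount)
    {
        int rest = byteCount % 4;
        return byteCount / 4 * 5 + (rest == 0 ? 0 : rest + 1);
    }

    // Fewest characters that any alphabet of the given size needs
    // to hold byteCount * 8 bits of arbitrary data.
    public static int LowerBound(int byteCount, int alphabetSize) =>
        (int)Math.Ceiling(byteCount * 8 / Math.Log(alphabetSize, 2));

    static void Main()
    {
        const int n = 32;
        Console.WriteLine($"Base64:               {Base64Length(n)}");
        Console.WriteLine($"Ascii85:              {Ascii85Length(n)}");
        Console.WriteLine($"Bound for 94 symbols: {LowerBound(n, 94)}");
    }
}
```

For 32 bytes this gives 44, 40 and 40: Ascii85 saves four characters over Base64, and even an alphabet using all 94 printable ASCII characters could not go below 40 characters for arbitrary 32-byte input.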

You'll find a C# implementation on GitHub, written by Jeff Atwood, who accompanied the code with a post on his blog.

As you only need the encoder part, I used Jeff's code as a start and created an implementation with only the encoding part:

using System.Text; // StringBuilder

class Ascii85
{

    private const int _asciiOffset = 33;
    private const int decodedBlockLength = 4;

    private byte[] _encodedBlock = new byte[5];
    private uint _tuple;

    /// <summary>
    /// Encodes binary data into a plaintext ASCII85 format string
    /// </summary>
    /// <param name="ba">binary data to encode</param>
    /// <returns>ASCII85 encoded string</returns>
    public string Encode(byte[] ba)
    {
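        // Multiply before dividing: (5 / 4) is 1 under integer division, which undersized the buffer.
        StringBuilder sb = new StringBuilder(ba.Length * _encodedBlock.Length / decodedBlockLength + _encodedBlock.Length);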

        int count = 0;
        _tuple = 0;
        foreach (byte b in ba)
        {
            if (count >= decodedBlockLength - 1)
            {
                _tuple |= b;
                if (_tuple == 0)
                {
                    sb.Append('z');
                }
                else
                {
                    EncodeBlock(_encodedBlock.Length, sb);
                }
                _tuple = 0;
                count = 0;
            }
            else
            {
                // Widen to uint before shifting so bytes >= 0x80 never pass through a negative int.
                _tuple |= (uint)b << (24 - (count * 8));
                count++;
            }
        }

        // if we have some bytes left over at the end..
        if (count > 0)
        {
            EncodeBlock(count + 1, sb);
        }

        return sb.ToString();
    }

    private void EncodeBlock(int count, StringBuilder sb)
    {
        for (int i = _encodedBlock.Length - 1; i >= 0; i--)
        {
            _encodedBlock[i] = (byte)((_tuple % 85) + _asciiOffset);
            _tuple /= 85;
        }

        for (int i = 0; i < count; i++)
        {
            sb.Append((char)_encodedBlock[i]);
        }

    }
}

The required attribution:

/// <summary>
/// adapted from the Jeff Atwood code to only have the encoder
/// 
/// C# implementation of ASCII85 encoding. 
/// Based on C code from http://www.stillhq.com/cgi-bin/cvsweb/ascii85/
/// </summary>
/// <remarks>
/// Jeff Atwood
/// http://www.codinghorror.com/blog/archives/000410.html
/// </remarks>
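
To try the comparison from the question, a self-contained sketch might look like the following. `Ascii85Sketch.Encode` is a hypothetical stateless rewrite of the same encoding steps (it is not Jeff Atwood's code), and SHA-256 of a sample string is used only as stand-in input for a 32-byte hash.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

static class Ascii85Sketch
{
    // Hypothetical stateless encoder: no framing markers, 'z' for all-zero groups.
    public static string Encode(byte[] data)
    {
        var sb = new StringBuilder(data.Length * 5 / 4 + 5);
        var block = new char[5];

        for (int offset = 0; offset < data.Length; offset += 4)
        {
            // Pack up to four bytes big-endian; missing bytes stay zero.
            int n = Math.Min(4, data.Length - offset);
            uint tuple = 0;
            for (int i = 0; i < n; i++)
                tuple |= (uint)data[offset + i] << (24 - 8 * i);

            // A full group of zeros is abbreviated to a single 'z'.
            if (n == 4 && tuple == 0)
            {
                sb.Append('z');
                continue;
            }

            // Five base-85 digits, most significant first, offset by '!' (33).
            for (int i = 4; i >= 0; i--)
            {
                block[i] = (char)(tuple % 85 + 33);
                tuple /= 85;
            }

            // A partial group of n bytes keeps only its first n + 1 characters.
            sb.Append(block, 0, n + 1);
        }
        return sb.ToString();
    }

    static void Main()
    {
        byte[] hash;
        using (var sha = SHA256.Create())
            hash = sha.ComputeHash(Encoding.UTF8.GetBytes("password"));

        Console.WriteLine(Convert.ToBase64String(hash)); // 44 characters
        Console.WriteLine(Encode(hash));                 // at most 40 characters
    }
}
```

Keeping the working state in locals rather than instance fields means the method can be called from several threads at once, which the field-based class above does not allow.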
