霍夫曼编码/解码 [英] Huffman Encoding / Decoding

查看:90
本文介绍了霍夫曼编码/解码的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

另一个线程太长了所以决定开始一个新的。

让我的huffman树工作,我现在正在编码和解码我给定的文本文件。

我现在的想法是将所有叶节点打印成一个字符串并使用标记将它们拆分以获取我的ascii值和霍夫曼代码并存储到集合中以供编码时参考。


刚刚对此提出了一些问题:


1.无论如何都要查看huffman树中的ascii值,因为它们现在都被卡在一个节点中?


2.我正在考虑将代码放入地图中,将霍夫曼代码作为关键字,将ascii值作为地图值,因为它将按霍夫曼代码进行排序。也由出现的ascii值的频率决定。因此,需要更少的时间来遍历整个地图。这会有效还是有更好的方法?


谢谢:)

The other thread got too long so decided to start a new one.
Got my huffman tree working and am now moving onto encoding and decoding my given text file.

My idea now is to print out all the leaf nodes into a string and split them up using tokens to get my ascii value and the huffman code and store into a collection for reference when encoding.

Just got some questions on this:

1. Is there anyway to look through the huffman tree for the ascii value since they are all now stuck in one node?

2. I was thinking of putting the codes into a map with the huffman code as the key and the ascii value as the map value since it will be sorted by the huffman code which is also determined by the frequency of the ascii value appearing. Thus less time is need to traverse the whole map. Would this be efficient or is there a better way?

Thanks :)

推荐答案


另一个线程太长了所以决定开始一个新的。

让我的huffman树工作,我现在正在编码和解码我给定的文本文件。


我现在的想法是将所有叶节点打印成一个字符串并使用标记将它们拆分以获取我的ascii值和霍夫曼代码并存储到集合中以供编码时参考。


刚刚对此提出了一些问题:


1.无论如何都要通过huffman树查看ascii值,因为它们现在都被卡在了一个节点?


2.我正在考虑将代码放入地图中,将霍夫曼代码作为关键字,将ascii值作为地图值,因为它将按霍夫曼代码进行排序这也取决于出现的ascii值的频率。因此,需要更少的时间来遍历整个地图。这会有效还是有更好的方法?


谢谢:)
The other thread got too long so decided to start a new one.
Got my huffman tree working and am now moving onto encoding and decoding my given text file.

My idea now is to print out all the leaf nodes into a string and split them up using tokens to get my ascii value and the huffman code and store into a collection for reference when encoding.

Just got some questions on this:

1. Is there anyway to look through the huffman tree for the ascii value since they are all now stuck in one node?

2. I was thinking of putting the codes into a map with the huffman code as the key and the ascii value as the map value since it will be sorted by the huffman code which is also determined by the frequency of the ascii value appearing. Thus less time is need to traverse the whole map. Would this be efficient or is there a better way?

Thanks :)



我不知道为什么你''谈论ASCII值。叶节点表示与霍夫曼代码相关联的

字符(* any * Unicode字符)。


字符的霍夫曼代码只是一个比特序列。不打印输出的字符,而是打印该位序列。这就是

霍夫曼编码的本质;它是关于数据压缩的全部。


您需要从这些Unicode字符映射到压缩或编码部分的位序列。

压缩或编码部分。 />

解压缩这样的文件时,你应该读取位序列中的位,并检查你是否仍然拥有与Unicode相关的位序列
字符。


对于零位读取,你采取左孩子的路线;当1位是

读取时向右走。到达叶子节点的那一刻,你可以发出相关的角色。


亲切的问候,


Jos

I don''t know why you''re talking about ASCII values. The leaf nodes represent a
character (*any* Unicode character) which is associated with a Huffman code.

A Huffman code for a character simply is a sequence of bits. Instead of printing
the character to the output, you print that bit sequence. That''s the essence of
Huffman encoding; it''s all about data compression.

You need a mapping from those Unicode characters to the bit sequences for
the compression or encoding part.

When you decompress such a file you should read bits from a bit sequence and
check whether or not you still have that bit sequence associated with a Unicode
character.

For a zero bit read, you take the route of a left child; go right when a 1 bit is
read. The moment you reach a leaf node you can emit the associated character.

kind regards,

Jos


关于ascii部分,赋值只需要编码一个文本文件,这就是为什么我使用ascii值。另外要求是将霍夫曼代码打印为字符串而不是位(不太清楚为什么也是这样)。


仍然不太确定你的意思
About the ascii part, the assignment just requires the encoding of a text file thats why I use ascii value. Also the requirement is to print the huffman code as string instead of bits(not too sure why also).

Still not too sure about what you mean by

对于零位读取,您采用左孩子的路线;当1位是

读取时向右走。到达叶节点的那一刻,您可以发出相关的角色。
For a zero bit read, you take the route of a left child; go right when a 1 bit is
read. The moment you reach a leaf node you can emit the associated character.



现在我刚刚创建了一个带有ascii值和huffman代码表示的huffmancode对象,并将它们全部放入一个首先使用最短代码进行索引的数组中。因此,对于压缩,我只是搜索ascii值并获得它的霍夫曼代码。


考虑以同样的方式进行,但我猜它会非常低效...

For now I just created a huffmancode object with the ascii value and its huffman code representation and put them all into an array sorrting with the shortest code first. Thus for compression I just search the ascii value and get the huffman code for it.

Was thinking of doing it the same way but I guess its gonna be very inefficient...



关于ascii部分,赋值只需要编码一个文本文件,这就是为什么我使用ascii值。另外要求是将霍夫曼代码打印为字符串而不是位(不太确定为什么)。


仍然不太确定你的意思

对于零位读取,您采用左子的路线;当1位是

读取时向右走。到达叶子节点的那一刻,你可以发出相关的角色。

现在我只是用ascii值和它的霍夫曼代码表示创建了一个huffmancode对象,并将它们全部放入一个最短的数组中。代码第一。因此,对于压缩,我只搜索ascii值并获取它的霍夫曼代码。
For a zero bit read, you take the route of a left child; go right when a 1 bit is
read. The moment you reach a leaf node you can emit the associated character.
For now I just created a huffmancode object with the ascii value and its huffman code representation and put them all into an array sorrting with the shortest code first. Thus for compression I just search the ascii value and get the huffman code for it.



考虑以同样的方式做这件事,但我猜它会效率很低......

Was thinking of doing it the same way but I guess its gonna be very inefficient...



假设你有以下小霍夫曼树:

Suppose you have the following little Huffman tree:

展开 | 选择 | Wrap | 行号


这篇关于霍夫曼编码/解码的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆