如何为霍夫曼编码和解码创建树? [英] How can I create a tree for Huffman encoding and decoding?

查看:111
本文介绍了如何为霍夫曼编码和解码创建树?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

对于我的任务,我要对霍夫曼树进行编码和解码。我在创建树时遇到问题,并且卡住了。

For my assignment, I am to do a encode and decode for huffman trees. I have a problem creating my tree, and I am stuck.

不要介意打印语句-它们只是供我测试,看看何时输出我的函数运行。

Don't mind the print statements - they are just for me to test and see what the output is when my function runs.

对于第一个for循环,我从用于测试的主块中的文本文件中获取了所有值和索引。

For the first for loop, I got all the values and index from the text file I used in my main block for testing.

在第二个for循环中,我将所有内容都插入了优先级队列。

In the second for loop I inserted all the stuff into the priority queue.

我对下一步该怎么做感到很困惑-我正在尝试制作节点,但是我对如何前进感到困惑。有人可以告诉我我是否这样做正确?

I am so stuck about where to go next - I'm trying to make nodes, but I am confused about how to progress. Can someone tell me if I'm doing this right?

def _create_code(self, frequencies):
    '''(HuffmanCoder, sequence(int)) -> NoneType
    iterate over index into the sequence keeping it 256 elements long, '''
    #fix docstring
    p = PriorityQueue()
    print frequencies

    index = 0 
    for value in frequencies:
        if value != 0:
            print value #priority
            print index #elm
            print '-----------'       
        index = index + 1


    for i in range(len(frequencies)):
        if frequencies[i] != 0:
            p.insert(i, frequencies[i])  
            print i,frequencies[i]
            if p.is_empty():
                a = p.get_min()
                b = p.get_min()
                n1 = self.HuffmanNode(None, None, a)
                n2 = self.HuffmanNode(None, None, b)
                print a, b, n1, n2
    while not p.is_empty():
        p.get_min()

我手动插入前两个以启动我的树,是正确吗?

I manually inserted the first two to start my tree, is that correct?

我如何继续前进?我知道它的想法,只是代码方面,我非常受困扰。

How do I keep going? I know the idea of it, just code-wise I am very stuck.

这是使用python的方式。我尝试查看Wikipedia,我知道这些步骤,我只需要代码方面的帮助以及我应该如何继续前进,谢谢!

This is using python by the way. I tried looking at Wikipedia, I know the steps, I just need help on code and how I should keep going, thanks!

HuffmanNode来自此嵌套类:

The HuffmanNode comes from this nested class:

class HuffmanNode(object):

    def __init__(self, left=None, right=None, root=None):
        self.left = left
        self.right = right
        self.root = root


推荐答案

另一个解决方案,返回字典 {label:code} 和递归字典 tree 包含结果图。输入 vals 采用字典 {label:freq} 的形式:

One more solution returning a dictionary {label:code} and a recursive dictionary tree containing the resulting graph. The input vals is in form of dictionary {label:freq}:

def assign_code(nodes, label, result, prefix = ''):    
    childs = nodes[label]     
    tree = {}
    if len(childs) == 2:
        tree['0'] = assign_code(nodes, childs[0], result, prefix+'0')
        tree['1'] = assign_code(nodes, childs[1], result, prefix+'1')     
        return tree
    else:
        result[label] = prefix
        return label

def Huffman_code(_vals):    
    vals = _vals.copy()
    nodes = {}
    for n in vals.keys(): # leafs initialization
        nodes[n] = []

    while len(vals) > 1: # binary tree creation
        s_vals = sorted(vals.items(), key=lambda x:x[1]) 
        a1 = s_vals[0][0]
        a2 = s_vals[1][0]
        vals[a1+a2] = vals.pop(a1) + vals.pop(a2)
        nodes[a1+a2] = [a1, a2]        
    code = {}
    root = a1+a2
    tree = {}
    tree = assign_code(nodes, root, code)   # assignment of the code for the given binary tree      
    return code, tree

这可以用作:

freq = [
(8.167, 'a'), (1.492, 'b'), (2.782, 'c'), (4.253, 'd'),
(12.702, 'e'),(2.228, 'f'), (2.015, 'g'), (6.094, 'h'),
(6.966, 'i'), (0.153, 'j'), (0.747, 'k'), (4.025, 'l'),
(2.406, 'm'), (6.749, 'n'), (7.507, 'o'), (1.929, 'p'), 
(0.095, 'q'), (5.987, 'r'), (6.327, 's'), (9.056, 't'), 
(2.758, 'u'), (1.037, 'v'), (2.365, 'w'), (0.150, 'x'),
(1.974, 'y'), (0.074, 'z') ]    
vals = {l:v for (v,l) in freq}
code, tree = Huffman_code(vals)

text = 'hello' # text to encode
encoded = ''.join([code[t] for t in text])
print('Encoded text:',encoded)

decoded = []
i = 0
while i < len(encoded): # decoding using the binary graph
    ch = encoded[i]  
    act = tree[ch]
    while not isinstance(act, str):
        i += 1
        ch = encoded[i]  
        act = act[ch]        
    decoded.append(act)          
    i += 1

print('Decoded text:',''.join(decoded))

一个人可以使用Graphviz将树可视化为:

One can visualize the tree with Graphviz as:

该图形是由以下脚本生成的(需要Graphviz):

The figure was generated by the following script as (Graphviz is needed):

def draw_tree(tree, prefix = ''):    
    if isinstance(tree, str):            
        descr = 'N%s [label="%s:%s", fontcolor=blue, fontsize=16, width=2, shape=box];\n'%(prefix, tree, prefix)
    else: # Node description
        descr = 'N%s [label="%s"];\n'%(prefix, prefix)
        for child in tree.keys():
            descr += draw_tree(tree[child], prefix = prefix+child)
            descr += 'N%s -> N%s;\n'%(prefix,prefix+child)
    return descr

import subprocess
with open('graph.dot','w') as f:
    f.write('digraph G {\n')
    f.write(draw_tree(tree))
    f.write('}') 
subprocess.call('dot -Tpng graph.dot -o graph.png', shell=True)

这篇关于如何为霍夫曼编码和解码创建树?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆