如何在jpeg文件中从FFC4(DHT)头创建霍夫曼树? [英] How to create Huffman tree from FFC4 (DHT) header in jpeg file?

查看:387
本文介绍了如何在jpeg文件中从FFC4(DHT)头创建霍夫曼树?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我以为我可以自己解决这个问题,但我似乎根本没有前进。

I thought I could work this one out myself but I don't seem to be moving forward at all.

好的,背景:

我需要根据jpg文件中FFC4,DHT(Define Huffman Table)标头提供的信息创建一个Huffman代码树。 DHT标头以这种方式定义Huffman表:

I need to create a Huffman tree of codes from the information provided by the FFC4, DHT (Define Huffman Table) header in a jpg file. The DHT header defines the Huffman table in this way:

1)一系列16字节。每个字节定义有多少符号具有n位的霍夫曼码,其中n是系列中字节的位置。 (这有什么意义吗?!!)例如十六进制的原始数据是:

1) A series of 16 bytes. Each byte defines how many symbols have a Huffman code of n amount of bits where n is the position of the byte in the series. (did that make any sense?!!) For example the raw data in hex is:


00 01 05 01 01 01 ... 00

00 01 05 01 01 01 ... 00

这意味着:

Num of bits:    1   2   3   4   5   6   7 ...  16  
Num of codes:   00  01  05  01  01  01  01 ... 00

2)16个字节的列表后面是实际符号本身。例如:

2) After the list of 16 bytes come the actual symbols themselves. For example:


00 01 02 03 04 05 06 07 08 09 0A

00 01 02 03 04 05 06 07 08 09 0A 0B

3)结合这两部分我们看到它们是:

00代码1位。

01代码2位:所以拿列表中的第一个符号:00

05代码3位:所以从列表中取下一个5个符号:01 02 03 04 05

..等等

3) Combining the two parts we see that their are:
00 codes with 1 bit.
01 codes with 2 bits: so take the first symbol from the list: 00
05 codes with 3 bits: so take the next 5 symbols from the list: 01 02 03 04 05
.. and so on

4)最后,我们必须根据上述信息计算出实际的霍夫曼代码。如果你是一个数学天才,你可能已经发现这些代码可以通过简单地为某个位长度的每个新代码增加一个具有适当位数的二进制数来计算出来。当位长度增加时,只需增加二进制数,然后将其加倍并继续。当你手工绘制出一些二叉树时,对其他人来说很明显...

4) Finally we have to work out the actual Huffman codes from the information above. If you're a mathematical genius, you may have spotted already that these codes can be worked out by simply incrementing a binary number with the appropriate number of bits for every new code at a certain bit length. When the bit length increases, simply increment the binary number and then double it and carry on. It becomes obvious to everyone else when you've drawn out a number of these binary trees by hand...

5)所以这就是我用来解决的代码霍夫曼编码并将它们存储在一个数组中:(首先我将数据读入1)并将其放入数组中:dhtBitnum)

5) So this is the code I used to work out the Huffman codes and store them in an array: (first I read the data at 1) and put it in an array: dhtBitnum)

            int binaryCode = 0;
            count = 0;
            StringBuffer codeString = new StringBuffer();               

            //populate array with code strings
            for (int numBits=1; numBits<=16; numBits++) {

                //dhtBitnum contains the number of codes that have a certain number of bits
                for (int i=1; i<=dhtBitnum[(numBits-1)]; i++) {

                    //turn the binary number into a string
                    codeString.append(Integer.toBinaryString(binaryCode)); 
                    //prepend 0s until the code string is the right length
                    for(int n=codeString.length(); n<numBits; n++) {
                        codeString.insert(0, "0");
                    }
                    //put the created Huffman code in an array
                    dhtCodes[count]=codeString.toString();
                    binaryCode++;
                    count ++;
                    codeString.delete(0, codeString.length());
                }
                binaryCode = binaryCode << 1;

            }

一旦我生成了霍夫曼代码并按顺序存储它们,我可以按顺序添加它们引用的符号,因为它们在2)中出现。这可能不是非常优雅,但它似乎至少起作用并创建了正确的表格。

Once I have generated the Huffman codes and stored them in order, I can just add the symbols that they refer to in order as they come along in 2). This may not be terribly elegant but it seems to work at least and creates the correct tables.

6)如果有人仍然关注这一点,你应该获得一枚奖牌。

6) If anyone is still following this, you deserve a medal.

7)现在问题是我想将这些信息存储在二叉树中,以便我可以在以后有效地解码jpg图像数据,而不是每次都搜索数组。不幸的是,我无法从上面jpg头文件中提供的信息中直接找到一个干净而有效的方法。

我能想到的唯一方法是将霍夫曼代码设计为首先,然后实现一些根据需要创建节点的方法,并使用代码作为排序地址将符号保存在适当的位置。然而,这似乎是一个圆形的方式,也是重复我需要的信息,我敢肯定必须有一个更好,更简单的方法。

7) Now the problem is I'd like to store this information in a binary tree so that I can efficiently decode the jpg image data later on, rather than searching through arrays everytime. Unfortunately I can't figure out a nice clean and efficient way to do this directly from the information provided in the jpg headers as above.
The only way I can think of is by working out the Huffman codes as above first, then implementing some method that creates nodes as needed and saves the symbols in the appropriate place, using the codes as an address of sorts. However this seems such a round about way that is also duplicating the information I need, I'm sure there must be a much better and simpler method.

8)所以如果任何人都理解我的谣言,我会非常感谢你的一些建议。我意识到这是一个非常具体的问题,但如果没有别的,上面的信息可能对某人有帮助。我仍然很新,所以借口我的无知,易于理解的代码特别受欢迎!

8) So if anyone understood my ramblings, I'd be very grateful for some suggestions. I realise this is a very specific problem, but if nothing else the info above might prove helpful to someone. I am still very new at this so excuse my ignorance, easy to understand code is especially welcome!

推荐答案

关于如何实施这直接我不完全确定,因为处理信息需要一段时间,但如果你知道尝试。从第7点看,你不是吗?

As to how to implement this directly I'm not entirely sure, as it takes a while to process the information, but the algorithm should be pretty straight forward if you know about tries. It seems from point 7 that you do not?

我将添加一个例子:

         °
        / \
       /   \
      °     °
     / \   / \
    a   b  c  d 

在这个简单的线索中,如果我们向左走,我们假设左边是0,右边是1.所以如果遇到模式' 00'这对应于a。模式'10'对应于c。通常使用霍夫曼树,trie将是不平衡的,即

In this simple trie, if we go left we'll assume left is a 0, right is a 1. So If you encounter the pattern '00' this corresponds with an a. The pattern '10' corresponds with a c. Usually with huffman trees the trie will be unbalanced, i.e.

     °
    / \
   a   °
      / \
     b   °
        / \
       c   d

在这种情况下,前缀代码0对应于a。代码111到'd'。

In this case, the prefix code '0' corresponds to an a. The code 111 to a 'd'.

这篇关于如何在jpeg文件中从FFC4(DHT)头创建霍夫曼树?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆