Java - My Huffman decompression refuses to decompress non-text files (returns empty file)

Problem description

I am able to compress all kinds of files (.jpg, .mp4 etc.) but when I try to decompress these non-text files the program just returns an empty decompressed file... the weird part is that I am able to decompress plain text files just fine.

When I compress my original file I put both the data needed to reconstruct the tree and the encoded bits in the same file. The format looks something like this:

<n><value 1><frequency 1>...<value n><frequency n>[the compressed bytes]

Here n is the total number of unique bytes (that is, the number of leaves in my tree), value is a leaf value in byte form, and frequency is the frequency of that byte/"character" (frequency is an int, so each one takes 4 bytes).

BitFileReader and BitFileWriter in my code are just wrapper classes around a BufferedInputStream/BufferedOutputStream, with the added functionality of reading/writing bit by bit.
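Since the wrapper classes themselves are not shown, here is a minimal sketch of what bit-by-bit I/O over a byte stream might look like. The names BitIoSketch, BitWriter, BitReader and roundTrip are hypothetical, and the MSB-first bit order and zero padding are assumptions; the author's BitFileReader/BitFileWriter may well differ.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

public class BitIoSketch {

    /** Hypothetical bit writer: packs bits MSB-first into whole bytes. */
    static final class BitWriter implements AutoCloseable {
        private final OutputStream out;
        private int buffer; // bits collected so far
        private int count;  // number of bits currently in buffer (0..7)

        BitWriter(OutputStream out) { this.out = out; }

        void writeBit(int bit) throws IOException {
            buffer = (buffer << 1) | (bit & 1);
            if (++count == 8) {
                out.write(buffer);
                buffer = 0;
                count = 0;
            }
        }

        @Override
        public void close() throws IOException {
            // Pad the final partial byte with zero bits (an assumption).
            if (count > 0) {
                out.write(buffer << (8 - count));
            }
            out.close();
        }
    }

    /** Hypothetical bit reader: hands back bits in the order they were written. */
    static final class BitReader {
        private final InputStream in;
        private int buffer;
        private int remaining; // unread bits left in buffer

        BitReader(InputStream in) { this.in = in; }

        /** Returns the next bit (0 or 1), or -1 at end of stream. */
        int readBit() throws IOException {
            if (remaining == 0) {
                buffer = in.read();
                if (buffer < 0) return -1;
                remaining = 8;
            }
            remaining--;
            return (buffer >> remaining) & 1;
        }
    }

    /** Writes the bits to memory, reads them all back, and returns them as a string. */
    static String roundTrip(int[] bits) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (BitWriter w = new BitWriter(bytes)) {
                for (int b : bits) w.writeBit(b);
            }
            BitReader r = new BitReader(new ByteArrayInputStream(bytes.toByteArray()));
            StringBuilder sb = new StringBuilder();
            for (int bit = r.readBit(); bit != -1; bit = r.readBit()) sb.append(bit);
            return sb.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // 10 bits go in, 16 come out: the last byte is padded with zeros.
        System.out.println(roundTrip(new int[] {1, 0, 1, 1, 0, 0, 1, 0, 1, 1})); // prints 1011001011000000
    }
}
```

The padding bits in the last byte are indistinguishable from real data, which is why a decoder needs to know how many symbols to expect (decompress() below derives that count from the frequency table).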

I added my whole Huffman code below but the main focus is the compress() and decompress() methods at the bottom. At the very least I want to know if my logic for these methods is fine, and if so what is causing it to return empty decompressed files when decompressing other file types (that aren't plain text files)?

Huffman code:

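import java.io.BufferedInputStream;
import java.io.DataInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.PriorityQueue;
import java.util.TreeMap;

// Tree, Leaf, Node, BitFileReader and BitFileWriter are the author's own classes (not shown).
public class HuffmanCode {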


    public static Tree buildTree(int[] charFreqs) {
        PriorityQueue<Tree> trees = new PriorityQueue<Tree>();

        for (int i = 0; i < charFreqs.length; i++){
            if (charFreqs[i] > 0){
                trees.offer(new Leaf(charFreqs[i], i));
            }
        }

        //assert trees.size() > 0;

        while (trees.size() > 1) {
            Tree a = trees.poll();
            Tree b = trees.poll();

            trees.offer(new Node(a, b));
        }
        return trees.poll();
    }

    public static void printStruct(Tree tree) {
        //assert tree != null;
        if (tree instanceof Leaf) {
            Leaf leaf = (Leaf)tree;

            System.out.println(leaf.value + " " + leaf.frequency);

        } else if (tree instanceof Node) {
            Node node = (Node)tree;

            // traverse left
            printStruct(node.left);

            // traverse right
            printStruct(node.right);
        }
    }


    public static void printStruct(Tree tree, StringBuffer prefix) {
        //assert tree != null;
        if (tree instanceof Leaf) {
            Leaf leaf = (Leaf)tree;

            System.out.println(leaf.value + "\t" + leaf.frequency + "\t" + prefix);

        } else if (tree instanceof Node) {
            Node node = (Node)tree;

            // traverse left
            prefix.append('0');
            printStruct(node.left, prefix);
            prefix.deleteCharAt(prefix.length()-1);

            // traverse right
            prefix.append('1');
            printStruct(node.right, prefix);
            prefix.deleteCharAt(prefix.length()-1);
        }
    }

    public static void fillEncodeMap(Tree tree, StringBuffer prefix, TreeMap<Integer, String> treeMap) {
        //assert tree != null;
        if (tree instanceof Leaf) {
            Leaf leaf = (Leaf)tree;

            treeMap.put(leaf.value, prefix.toString());

        } else if (tree instanceof Node) {
            Node node = (Node)tree;

            // traverse left
            prefix.append('0');
            fillEncodeMap(node.left, prefix, treeMap);
            prefix.deleteCharAt(prefix.length()-1);

            // traverse right
            prefix.append('1');
            fillEncodeMap(node.right, prefix, treeMap);
            prefix.deleteCharAt(prefix.length()-1);
        }
    }

    public static void fillDecodeMap(Tree tree, StringBuffer prefix, TreeMap<String, Integer> treeMap) {
        //assert tree != null;
        if (tree instanceof Leaf) {
            Leaf leaf = (Leaf)tree;

            treeMap.put(prefix.toString(), leaf.value);

        } else if (tree instanceof Node) {
            Node node = (Node)tree;

            // traverse left
            prefix.append('0');
            fillDecodeMap(node.left, prefix, treeMap);
            prefix.deleteCharAt(prefix.length()-1);

            // traverse right
            prefix.append('1');
            fillDecodeMap(node.right, prefix, treeMap);
            prefix.deleteCharAt(prefix.length()-1);
        }
    }



    public static void compress(File file){
        try {
            Path path = Paths.get(file.getAbsolutePath());
            byte[] content = Files.readAllBytes(path);
            TreeMap<Integer, String> encodeMap = new TreeMap<Integer, String>();
            File nF = new File(file.getName()+"_comp");
            nF.createNewFile();
            BitFileWriter bfw = new BitFileWriter(nF);

            int[] charFreqs = new int[256];

            // read each byte and record the frequencies
            for (byte b : content){
                charFreqs[b&0xFF]++;
            }

            // build tree
            Tree tree = buildTree(charFreqs);

            // build TreeMap
            fillEncodeMap(tree, new StringBuffer(), encodeMap);

            // Writes tree structure in binary form to nF (new file)
            bfw.writeByte(encodeMap.size());
            for(int i=0; i<charFreqs.length; i++){
                if(charFreqs[i] != 0){
                    ByteBuffer b = ByteBuffer.allocate(4);
                    b.putInt(charFreqs[i]);
                    byte[] result = b.array();

                    bfw.writeByte(i);
                    for(int j=0; j<4;j++){
                        bfw.writeByte(result[j]&0xFF);
                    }
                }
            }

            // Write compressed data
            for(byte b : content){
                String code = encodeMap.get(b&0xFF);
                for(char c : code.toCharArray()){
                    if(c == '1'){
                        bfw.write(1);
                    }
                    else{
                        bfw.write(0);
                    }
                }
            }
            bfw.close();
            System.out.println("Compression successful!");

        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    public static void decompress(File file){
        try {
            BitFileReader bfr = new BitFileReader(file);
            int[] charFreqs = new int[256];
            TreeMap<String, Integer> decodeMap = new TreeMap<String, Integer>();
            File nF = new File(file.getName()+"_decomp");
            nF.createNewFile();
            BitFileWriter bfw = new BitFileWriter(nF);
            DataInputStream data = new DataInputStream(new BufferedInputStream(new FileInputStream(file)));

            int uniqueBytes;
            int counter = 0;
            int byteCount = 0;
            uniqueBytes = data.readUnsignedByte();

            // Read frequency table
            while (counter < uniqueBytes){
              int index = data.readUnsignedByte();
              int freq = data.readInt();
              charFreqs[index] = freq;
              counter++;
            }

            // build tree
            Tree tree = buildTree(charFreqs);

            // build TreeMap
            fillDecodeMap(tree, new StringBuffer(), decodeMap);

            // Skip BitFileReader position to actual compressed code
            bfr.skip(uniqueBytes*5);

            // Get total number of compressed bytes
            for(int i=0; i<charFreqs.length; i++){
                if(charFreqs[i] > 0){
                    byteCount += charFreqs[i];
                }
            }

            // Decompress data and write
            counter = 0;
            StringBuffer code = new StringBuffer();

            while(bfr.hasNextBit() && counter < byteCount){
                code.append(""+bfr.nextBit());

                if(decodeMap.containsKey(code.toString())){
                    bfw.writeByte(decodeMap.get(code.toString()));
                    code.setLength(0);
                    counter++;
                }
            }
            bfw.close();
            bfr.close();
            data.close();

            System.out.println("Decompression successful!");

        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    public static void main(String[] args) {
        File f = new File("test");
        compress(f);
        f = new File("test_comp");
        decompress(f);
    }
}

Edit: I found the cause, but I don't know how to fix it or why it happens. The problem is that the charFreqs[] array in my decompression method never gets filled (all its values stay at zero, meaning every byte has zero frequency according to the array).

Recommended answer

I solved it! The problem was the bfw.writeByte(encodeMap.size()) line in my compress() method. It writes a single byte to the file, but encodeMap.size() can return 256 when every possible byte value occurs in the input. 256 does not fit in a byte: bfw.writeByte() takes an int as its argument but writes only the lowest 8 bits, so it effectively has the range of an unsigned byte, 0-255. A count of 256 was therefore stored as 0, which is why decompress() never filled charFreqs[].
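The truncation is easy to reproduce in isolation. The sketch below assumes bfw.writeByte() behaves like the standard OutputStream.write(int), which keeps only the low 8 bits, as the answer describes; ByteTruncationDemo and writeThenRead are hypothetical names used for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;

public class ByteTruncationDemo {

    /** Writes value with OutputStream.write(int), then reads it back as an unsigned byte. */
    static int writeThenRead(int value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(value); // only the low 8 bits of the int are stored
        ByteArrayInputStream in = new ByteArrayInputStream(out.toByteArray());
        return in.read(); // always in the range 0..255
    }

    public static void main(String[] args) {
        System.out.println(writeThenRead(255)); // prints 255
        System.out.println(writeThenRead(256)); // prints 0
    }
}
```

With a stored count of 0, the frequency-table loop in decompress() runs zero times, so the tree is empty and nothing is written to the output file.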

I fixed this by changing two lines of code. In compress(), bfw.writeByte(encodeMap.size()) became bfw.writeByte(encodeMap.size()-1), and in decompress(), uniqueBytes = data.readUnsignedByte() became uniqueBytes = data.readUnsignedByte() + 1.
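As a rough check of the fix, the header format from the question can be round-tripped in memory with the count stored as n-1. This is a sketch under assumptions, not the author's code: HeaderRoundTrip, writeHeader and readHeader are hypothetical names, and DataOutputStream stands in for BitFileWriter. Its writeInt is big-endian, the same byte order as the ByteBuffer.putInt call in compress().

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

public class HeaderRoundTrip {

    /** Serializes the table as <n-1><value 1><frequency 1>...<value n><frequency n>. */
    static byte[] writeHeader(int[] freqs) {
        int unique = 0;
        for (int f : freqs) {
            if (f > 0) unique++;
        }
        // n-1 cannot represent n = 0, so empty input needs separate handling.
        if (unique == 0) throw new IllegalArgumentException("no symbols to write");

        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeByte(unique - 1); // 1..256 maps onto 0..255
            for (int i = 0; i < freqs.length; i++) {
                if (freqs[i] > 0) {
                    out.writeByte(i);       // the byte value
                    out.writeInt(freqs[i]); // its frequency, 4 bytes big-endian
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Reads the table back into a 256-entry frequency array. */
    static int[] readHeader(byte[] header) {
        int[] freqs = new int[256];
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(header))) {
            int unique = in.readUnsignedByte() + 1; // undo the -1 from writeHeader
            for (int k = 0; k < unique; k++) {
                int value = in.readUnsignedByte();
                freqs[value] = in.readInt();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return freqs;
    }

    public static void main(String[] args) {
        int[] freqs = new int[256];
        Arrays.fill(freqs, 3); // every byte value occurs: the case that used to fail
        byte[] header = writeHeader(freqs);
        System.out.println(header.length);                            // prints 1281
        System.out.println(Arrays.equals(freqs, readHeader(header))); // prints true
    }
}
```

One trade-off of the n-1 encoding is that a count of zero unique bytes (an empty input file) can no longer be expressed; storing the count in two bytes would be another way to cover the full 0-256 range.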
