ASCII compressor works for short test file, not on long


Problem description

The current project in Systems Programming is to come up with an ASCII compressor that removes the top zero bit and writes the contents to the file.

In order to facilitate decompression, the original file size is written to the file first, then the compressed char bytes. There are two files to run tests on: one that is 63 bytes long, and another that is 5344213 bytes long. My code below works as expected for the first test file, as it writes 56 bytes of compressed text plus 4 bytes of file header.

However, when I try it on the long test file, the compressed version is only 3 bytes shorter than the original, when it should be roughly 749KiB smaller, or 14% of the original size. I've worked out the binary bit shift values for the first two write loops of the long test file, and they match what is being recorded on my test printout.

while ( (characters= read(openReadFile, unpacked, BUFFER)) >0 ){
   unsigned char packed[7]; //compression storage
   int i, j, k, writeCount, endLength, endLoop;

    //loop through the buffer array
    for (i=0; i< characters-1; i++){
        j= i%7; 

        //fill up the compressed array
        packed[j]= packer(unpacked[i], unpacked[i+1], j);

        if (j == 6){
            writeCalls++; //track how many calls made

            writeCount= write(openWriteFile, packed, sizeof (packed));
            int packedSize= writeCount;
            for (k=0; k<7 && writeCalls < 10; k++)
                printf("%X ", (int)packed[k]);      

            totalWrittenBytes+= packedSize;
            printf(" %d\n", packedSize);
            memset(&packed[0], 0, sizeof(packed)); //clear array

            if (writeCount < 0)
                printOpenErrors(writeCount);
        }
        //end of buffer array loop
        endLength= characters-i;
        if (endLength < 7){

            for (endLoop=0; endLoop < endLength-1; endLoop++){
                packed[endLoop]= packer(unpacked[endLoop], unpacked[endLoop+1], endLoop);
            }

            packed[endLength]= calcEndBits(endLength, unpacked[endLength]);
        }
    } //end buffer array loop
} //end file read loop

The packer function:

//calculates the compressed byte value for the array
char packer(char i, char j, int k){
    char packStyle;

    switch(k){
        //shift bits based on mod value with 8
        case 0:
                packStyle= ((i & 0x7F) << 1) | ((j & 0x40) >> 6);
            break;
        case 1:
            packStyle= ((i & 0x3F) << 2) | ((j & 0x60) >> 5);
            break;
        case 2:
            packStyle= ((i & 0x1F) << 3) | ((j & 0x70) >> 4);
            break;
        case 3:
            packStyle= ((i & 0x0F) << 4) | ((j & 0x78) >> 3);
            break;
        case 4:
            packStyle= ((i & 0x07) << 5) | ((j & 0x7C) >> 2);
            break;
        case 5:
            packStyle= ((i & 0x03) << 6) | ((j & 0x7E) >> 1);
            break;
        case 6:
            packStyle= ( (i & 0x01 << 7) | (j & 0x7F));
            break;
    }

    return packStyle;
}

I've verified that 7 bytes are written out every time the packed buffer is flushed, and that 763458 write calls are made for the long file, which matches the 5344206 bytes written.

I'm getting the same hex codes from the printout that I worked out in binary beforehand, and I can see the top bit of every byte removed. So why aren't the bit shifts being reflected in the results?

Recommended answer

Ok, since this is homework I'll just give you a few hints without giving out a solution.

First, are you sure that the 56 bytes you get on the first file are the right bytes? Sure, the count looks good, but you got lucky on the count (the second test file is the proof). I can immediately see at least two key mistakes in the code.

To make sure you have the right output, the byte count is not enough. You need to dig deeper. How about checking the bytes themselves one by one? 63 characters is not that many to go through, heh? There are many ways you can do this. You could use od (a pretty good Linux/Unix tool for looking at the binary contents of files; if you're on Windows, use a hex editor). Or you could print out debug information from within your program.

Good luck.
