Which datatype is better for calculating the CRC16 of any type of file

Problem description

I am using two different functions to calculate the CRC16 of files of any type (.txt, .tar, .tar.gz, .bin, .scr, .sh, etc.), with sizes ranging from 1 KB to 5 GB.

I want to achieve the following:

   `cross-platform

   less time consuming

   has to work properly for any type of file and any size`

I get the same CRC value from both functions, but can anyone tell me which one is better for calculating the CRC16 of any type of file, of any size, on different platforms?

Here we have to consider all byte values from 0 to 255.

Can anybody suggest which one better fits my requirements?

Code of both functions:

The first one uses the int datatype for readChar:

int CRC16_int(const char* filePath) {

    //Declare variable to store CRC result.
    unsigned short result;
    //Declare loop variables.
    int intInnerLoopIndex;
    result = 0xffff; //initialize result variable to perform CRC checksum calculation.

    //Store message which read from file.
    //char content[2000000];

    //Create file pointer to open and read file.
    FILE *readFile;

    //Use to read character from file.
    int readChar;

    //open a file for Reading
    readFile = fopen(filePath, "rb");

    //Check whether the file could be opened.
    if (!readFile) {
        fprintf(stderr, "Unable to open file %s\n", filePath);
        return -1; //Cannot continue without a valid FILE pointer.
    }
    /*
     Here reading file and store into variable.
     */
    int chCnt = 0;
    while ((readChar = getc(readFile)) != EOF) {

        //printf("character is %c\n",readChar);
        //printf("character is %c and int is %d \n",readChar,readChar);
        result ^= (short) (readChar);
        for (intInnerLoopIndex = 0; intInnerLoopIndex < 8; intInnerLoopIndex++) {
            if ((result & 0x0001) == 0x0001) {
                result = result >> 1; //Perform bit shifting.
                result = result ^ 0xa001; //Perform XOR operation on result.
            } else {
                result = result >> 1; //Perform bit shifting.
            }
        }

        //content[chCnt] = readChar;
        chCnt++;
    }
    fclose(readFile);
    printf("\nCRC data length in file: %d", chCnt);
    //This is the final CRC value for the provided message.
    return (result);
}

The second one uses the unsigned char datatype for readChar:

int CRC16_unchar(const char* filePath) {

    unsigned int filesize;
    //Declare variable to store CRC result.
    unsigned short result;
    //Declare loop variables.
    unsigned int intOuterLoopIndex, intInnerLoopIndex;
    result = 0xffff; //initialize result variable to perform CRC checksum calculation.
    FILE *readFile;
    //Use to read character from file.
    //The problem is if you read a byte from a file with the hex value (for example) 0xfe, 
    //then the char value will be -2 while the unsigned char value will be 254. 
    //This will significantly affect your CRC 
    unsigned char readChar;
    //open a file for Reading
    readFile = fopen(filePath, "rb");
    //Check whether the file could be opened.
    if (!readFile) {
        fprintf(stderr, "Unable to open file %s\n", filePath);
        return -1; //Cannot continue without a valid FILE pointer.
    }
    fseek(readFile, 0, SEEK_END); // seek to end of file
    filesize = ftell(readFile); // get current file pointer
    fseek(readFile, 0, SEEK_SET); // seek back to beginning of file
    /*
     Here reading file and store into variable.
     */
    int chCnt = 0;

    for (intOuterLoopIndex = 0; intOuterLoopIndex < filesize; intOuterLoopIndex++) {
        readChar = getc(readFile);
        printf("character is %c and int is %d\n", readChar, readChar);

        result ^= (short) (readChar);
        for (intInnerLoopIndex = 0; intInnerLoopIndex < 8; intInnerLoopIndex++) {
            if ((result & 0x0001) == 0x0001) {
                result = result >> 1; //Perform bit shifting.
                result = result ^ 0xa001; //Perform XOR operation on result.
            } else {
                result = result >> 1; //Perform bit shifting.
            }
        }
        chCnt++;
    }
    fclose(readFile);
    printf("\nCRC data length in file: %d", chCnt);
    return (result);
}

Please help me figure this out.

Thanks

Solution

First things first. Don't do the file reading (or whatever the source is) and the CRC calculation in the same function. This is bad design. File reading is typically not completely platform-independent (although POSIX is your best friend), but the CRC calculation can be done in a very platform-independent way. Also, you might want to reuse your CRC algorithm for other kinds of data sources that aren't accessed with fopen().

To give you a hint, the CRC function I always drop into my projects has this prototype:

uint16_t Crc16(const uint8_t* buffer, size_t size, 
                            uint16_t polynomial, uint16_t crc);

You don't have to call the function once and feed it the complete contents of the file. Instead, you can loop through the file in blocks and call the function for each block. The polynomial argument in your case is 0xA001 (which, BTW, is the polynomial in 'reversed' form), and the crc argument is set to 0xFFFF the first time. Each subsequent time you call the function, you pass its previous return value as the crc argument.
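The answer only gives the prototype, so the body below is an illustration rather than the author's actual code: a minimal sketch of a function with that prototype, assuming a plain bit-by-bit update with the least significant bit processed first (which is what a reversed polynomial such as 0xA001 implies).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-16 update, least significant bit first.
 * `polynomial` must be given in reversed form (0xA001 in this thread).
 * `crc` is the running value: 0xFFFF for the first block, then the
 * previous return value for every following block. */
uint16_t Crc16(const uint8_t *buffer, size_t size,
               uint16_t polynomial, uint16_t crc)
{
    assert(buffer != NULL || size == 0);

    for (size_t i = 0; i < size; i++) {
        crc ^= buffer[i];                 /* fold the next byte into the low 8 bits */
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 0x0001) {
                crc = (uint16_t)((crc >> 1) ^ polynomial);
            } else {
                crc >>= 1;
            }
        }
    }
    return crc;
}
```

With these parameters (initial value 0xFFFF, polynomial 0xA001, no final XOR), the ASCII string "123456789" yields 0x4B37, the published check value of the CRC-16/MODBUS variant. Feeding the same bytes in two calls, passing the first return value into the second call, gives the same result.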

In your second code fragment (CRC16_unchar) you first determine the file size and then read that number of bytes. Don't do that; it unnecessarily limits you to files of at most 4 GB (in most cases). Just reading until EOF is cleaner IMHO.

Furthermore, I see that you are struggling with signed/unsigned bytes. Keep in mind that

  • printf doesn't know whether you pass a signed or an unsigned integer. You tell printf with '%d' or '%u' how to interpret the integer.
  • Even in C itself there is hardly a difference between a signed and an unsigned integer. C won't magically change the stored bits if you do int8_t x = 255: on typical implementations the byte is still 0xFF, and only its interpretation (-1 instead of 255) differs.

See this answer for more details about when C uses the signedness of an integer: When does the signedness of an integer really matter? Rule of thumb: always use uint8_t for handling raw bytes.
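One place where signedness does matter is widening a byte to a larger type, which is exactly what happens when a byte is XORed into a 16-bit CRC. The sketch below is only an illustration; the helper names widen_via_signed, widen_via_unsigned and widening_self_check are hypothetical and do not appear in the question or the answer.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Widen a raw byte to 16 bits by way of a signed 8-bit type. */
uint16_t widen_via_signed(uint8_t raw)
{
    int8_t s;
    memcpy(&s, &raw, sizeof s);   /* same bit pattern, signed view */
    return (uint16_t)s;           /* sign-extends: 0xFE becomes 0xFFFE */
}

/* Widen a raw byte to 16 bits by way of an unsigned 8-bit type. */
uint16_t widen_via_unsigned(uint8_t raw)
{
    return (uint16_t)raw;         /* zero-extends: 0xFE stays 0x00FE */
}

/* Bytes below 0x80 widen identically; bytes from 0x80 upward do not. */
void widening_self_check(void)
{
    assert(widen_via_unsigned(0xFE) == 0x00FE);
    assert(widen_via_signed(0xFE) == 0xFFFE);
    assert(widen_via_signed(0x7F) == widen_via_unsigned(0x7F));
}
```

Both functions in the question zero-extend: getc() returns the byte as an unsigned char converted to int, and the second function stores it in an unsigned char. A sign-extended byte would also flip the upper eight bits of the CRC, which is the failure the uint8_t rule of thumb avoids.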

So both functions are fine regarding signedness/integer size.

EDIT: As other users indicated in their answers, read the file in blocks instead of byte by byte:

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

uint16_t CRC16_int(const char* filePath) {
    FILE *readFile;
    uint8_t buf[1024];   /* Must not be const: fread() writes into it. */
    size_t len;
    uint16_t result = 0xffff;

    /* Open a file for reading. */
    readFile = fopen(filePath, "rb");
    if (readFile == NULL) {
        exit(1);
    }

    /* Read until EOF. With an item size of 1, fread() returns the number of bytes read. */
    while ( (len = fread(buf, 1, sizeof(buf), readFile)) > 0 ) {
        result = Crc16(buf, len, 0xA001, result);
    }

    /* readFile could be in error state, check it with ferror() or feof() functions. */
    fclose(readFile);

    return result;
}

Also, you should alter your function prototype to make it possible to return an error, e.g.:

// Return true when successful, false on error. CRC is stored in result.
bool CRC16_int(const char* filePath, uint16_t *result)
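A minimal sketch of how a function with that error-returning prototype could look, assuming a 4096-byte read buffer (an arbitrary choice) and a plain bitwise Crc16 helper. The answer shows neither body, so both are illustrations only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Bitwise CRC-16 update with a reversed polynomial (assumed implementation). */
static uint16_t Crc16(const uint8_t *buffer, size_t size,
                      uint16_t polynomial, uint16_t crc)
{
    for (size_t i = 0; i < size; i++) {
        crc ^= buffer[i];
        for (int bit = 0; bit < 8; bit++) {
            crc = (crc & 1) ? (uint16_t)((crc >> 1) ^ polynomial)
                            : (uint16_t)(crc >> 1);
        }
    }
    return crc;
}

/* Returns true on success and stores the CRC in *result.
 * Returns false if the file cannot be opened or a read error occurs;
 * *result is left untouched in that case. */
bool CRC16_int(const char *filePath, uint16_t *result)
{
    assert(filePath != NULL && result != NULL);

    FILE *readFile = fopen(filePath, "rb");
    if (readFile == NULL) {
        return false;
    }

    uint8_t buf[4096];          /* block size is an arbitrary assumption */
    uint16_t crc = 0xFFFF;
    size_t len;

    /* Read until fread() delivers no more bytes (EOF or error). */
    while ((len = fread(buf, 1, sizeof buf, readFile)) > 0) {
        crc = Crc16(buf, len, 0xA001, crc);
    }

    bool ok = !ferror(readFile);  /* distinguish a read error from EOF */
    fclose(readFile);

    if (ok) {
        *result = crc;
    }
    return ok;
}
```

Because the loop never asks for the file size, it is not tied to the range of unsigned int or long, and the caller can tell a failed read apart from a valid CRC value.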
