Compare two files

Question

So, I'm trying to write a function that compares the contents of two files, but it doesn't work.

I want it to return 1 if the files are the same, and 0 if they are different.

ch1 and ch2 work as buffers, and I used fgets to read the contents of my files.

I think there is something wrong with the eof pointer, but I'm not sure. The FILE variables are given on the command line.

P.S. It works with small files under 64KB, but fails on larger files (a 700MB movie, for example, or a 5MB mp3).

Any ideas on how to make it work?

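#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Returns 1 if both streams have identical contents, 0 otherwise.
   Both streams should be opened in binary mode ("rb"); in text mode
   on Windows a 0x1A byte is treated as end of file. */
int compareFile(FILE* file_compared, FILE* file_checked)
{
    bool diff = false;
    const size_t N = 65536;
    char* b1 = (char*) malloc(N);
    char* b2 = (char*) malloc(N);
    size_t s1, s2;

    if (b1 == NULL || b2 == NULL) {
        free(b1);
        free(b2);
        return 0;   /* out of memory: reported here as "different" */
    }

    do {
        s1 = fread(b1, 1, N, file_compared);
        s2 = fread(b2, 1, N, file_checked);

        /* Compare only the bytes that were actually read. */
        if (s1 != s2 || memcmp(b1, b2, s1) != 0) {
            diff = true;
            break;
        }
    } while (s1 == N);   /* a short read means EOF or a read error */

    /* A read error makes the result unreliable; treat it as a mismatch. */
    if (ferror(file_compared) || ferror(file_checked))
        diff = true;

    free(b1);
    free(b2);

    return diff ? 0 : 1;
}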

EDIT: I've improved this function using your answers. But it now compares only the first buffer, with one exception: I figured out that it stops reading the file as soon as it reaches a 1A character (attached file). How can we make it work?

EDIT2: Task solved (working code attached). Thanks to everyone for the help!
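
The 1A symptom described in the first edit is consistent with the files having been opened in text mode on Windows, where the C runtime treats a 0x1A (Ctrl-Z) byte as end of file. How the files were opened is not shown in the question, so this is an assumption. A minimal sketch of reading a file in binary mode follows; the helper name `count_bytes_binary` is made up for this example.

```c
#include <stdio.h>

/* Hypothetical helper: counts the bytes of a file opened in binary mode.
   Returns -1 if the file cannot be opened or a read error occurs. */
long count_bytes_binary(const char *path)
{
    FILE *f = fopen(path, "rb");   /* "rb": no text-mode translation */
    char buf[4096];
    size_t n;
    long total = 0;

    if (f == NULL)
        return -1;

    /* fread returns 0 only at end of file or on error. */
    while ((n = fread(buf, 1, sizeof buf, f)) > 0)
        total += (long)n;

    if (ferror(f))
        total = -1;

    fclose(f);
    return total;
}
```

With "rb", a file that contains 0x1A bytes should be read to its real end, so the count matches the size on disk.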

Recommended answer

Since you've allocated your arrays on the stack, they are filled with random values; they aren't zeroed out.

Secondly, strcmp only compares up to the first NUL byte, which, in a binary file, won't necessarily be at the end of the file. You should therefore use memcmp on your buffers. But again, this will give unpredictable results because your buffers were allocated on the stack: even if you compare two identical files, the buffer contents past the EOF may differ, so memcmp will still report false results (most likely that the files differ when they are in fact the same, because of the random values left in the buffers past each file's EOF).
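
To illustrate the difference, here is a small sketch. The helper name `buffers_equal` and the sample data are made up for this example; they are not part of the original question.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 when the first n bytes of a and b match.
   memcmp looks at exactly n bytes, including any zero bytes. */
int buffers_equal(const void *a, const void *b, size_t n)
{
    return memcmp(a, b, n) == 0;
}

/* Two 4-byte chunks that differ only after an embedded zero byte. */
static const char chunk_a[4] = { 'A', '\0', 'B', 'C' };
static const char chunk_b[4] = { 'A', '\0', 'X', 'Y' };
```

Here `strcmp(chunk_a, chunk_b)` stops at the zero byte and reports the chunks as equal, while `buffers_equal(chunk_a, chunk_b, 4)` sees the difference. Passing the number of bytes actually read as `n` also keeps any leftover bytes past the end of the data out of the comparison.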

To get around this, first measure the length of each file in bytes, then use malloc or calloc to allocate the buffers you're going to compare, and fill those buffers with the actual file contents. You can then make a valid comparison of the binary contents of each file. You'll also be able to work with files larger than 64K, since the buffers are allocated dynamically at run time.
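
The approach described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the answerer's code: the names `stream_size` and `files_equal` are hypothetical, the length is measured with fseek/ftell rather than by iterating through the file, and the -1 error return is an addition to the 1/0 convention from the question.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: size of an open binary stream in bytes, or -1 on
   error. long may be 32 bits, which limits this sketch to files under 2 GB. */
static long stream_size(FILE *f)
{
    long size;
    if (fseek(f, 0, SEEK_END) != 0) return -1;
    size = ftell(f);
    if (fseek(f, 0, SEEK_SET) != 0) return -1;
    return size;
}

/* Hypothetical helper: 1 if the files match, 0 if they differ, -1 on error. */
int files_equal(const char *path_a, const char *path_b)
{
    FILE *fa = fopen(path_a, "rb");
    FILE *fb = fopen(path_b, "rb");
    char *ba = NULL, *bb = NULL;
    long na, nb;
    int result = -1;

    if (fa == NULL || fb == NULL) goto done;

    na = stream_size(fa);
    nb = stream_size(fb);
    if (na < 0 || nb < 0) goto done;
    if (na != nb) { result = 0; goto done; }   /* different lengths */
    if (na == 0)  { result = 1; goto done; }   /* both files are empty */

    /* Buffers sized to the files, so no stale bytes take part in memcmp. */
    ba = malloc((size_t)na);
    bb = malloc((size_t)nb);
    if (ba == NULL || bb == NULL) goto done;

    if (fread(ba, 1, (size_t)na, fa) != (size_t)na) goto done;
    if (fread(bb, 1, (size_t)nb, fb) != (size_t)nb) goto done;

    result = (memcmp(ba, bb, (size_t)na) == 0);

done:
    free(ba);
    free(bb);
    if (fa) fclose(fa);
    if (fb) fclose(fb);
    return result;
}
```

A call such as `files_equal("a.bin", "b.bin")` holds both files in memory at once, so comparing two 700MB files needs roughly 1.4GB of buffers. Reading in fixed-size chunks and comparing only the bytes returned by each read avoids that cost.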
