Correctly getting SHA-1 for files using OpenSSL

Question

I am trying to get a SHA-1 hash for a number of files. What I currently do is loop over the files in a given path, open and read each file separately, load its contents into a buffer, and then pass that buffer to OpenSSL's SHA1() function to get the hash. The code looks something like this:

void ReadHashFile(LPCTSTR name)
{
 FILE * pFile;
 long lSize;
 char * buffer;
 size_t result;

 pFile = _tfopen ( name , L"rb" );
 if (pFile==NULL) {fputs ("File error",stderr); return;}

 // obtain file size:
 fseek (pFile , 0 , SEEK_END);
 lSize = ftell (pFile);
 rewind (pFile);

 if(lSize == -1){fputs ("Read Error",stderr);return;}

 // allocate memory to contain the whole file:
 buffer = (char*) malloc (sizeof(char)*lSize);
 if (buffer == NULL) {fputs ("Memory error",stderr); return;}

 // copy the file into the buffer:
 result = fread (buffer,1,lSize,pFile);
 if (result != lSize) {fputs ("Reading error",stderr); return;}

 /* the whole file is now loaded in the memory buffer. */

 // terminate
 fclose (pFile);

 //Do what ever with buffer
 unsigned char ibuf[] = "compute sha1";
 unsigned char obuf[20];

 SHA1((const unsigned char*)buffer, strlen((const char*)buffer), obuf);
 fwprintf(stderr, L"file %s\n", name);
 int i;
 for (i = 0; i < 20; i++) {
  printf("%02x ", obuf[i]);
 }
 printf("\n");


 free(buffer);
}

Some files seem to be unreadable: some give me a size of -1, and for others I can only read the first 2-3 bytes, which gives a lot of files the same SHA-1 even though they are different.

I would appreciate it if someone could help me with this, or if anyone has experience in file hashing. Also, is there a way of getting a file's SHA-1 without loading the entire file into memory first? For large files, this solution won't work.

Recommended answer

If you have trouble reading the file contents before the hash function is even invoked, then your problem is not related to hashing.

You should use the standard fopen() function, rather than _tfopen(). In C, things which begin with an underscore character are often best avoided. This is especially so because _tfopen() seems to map to either fopen() or the Windows-specific _wfopen(), depending on whether so-called "Unicode support" is activated. Alternatively, in a purely Windows application, you may rely on Win32 functions such as CreateFile().

Reading the whole file in memory and then hashing it is crude. It will fail to process files which are larger than available RAM, for instance. Also, in order to know the file size, you have to seek into it, which is not reliable (there may be pseudo-files which are actually pipes into some data-generating process, for which seeking is not possible). Hash functions can process data by chunks; you should use a small buffer (8 kB is the traditional size) and employ the SHA1_Init(), SHA1_Update() and SHA1_Final() functions.

fread() does not necessarily read as much data as you requested. And this is not an error.
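As a small illustration of trusting only the count that fread() returns, here is a minimal sketch that measures a stream's length by reading it in chunks instead of seeking. The helper name count_bytes() is hypothetical and is not part of the answer; it assumes nothing beyond the standard C library.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper, not from the original answer: count the bytes in a
 * stream by reading it in chunks. Each fread() call may return fewer bytes
 * than requested, so only the returned count is trusted. Returns 0 on
 * success and -1 if an I/O error occurred.
 */
static int count_bytes(FILE *f, unsigned long long *total)
{
    unsigned char buf[8192];
    size_t len;

    assert(f != NULL && total != NULL);
    *total = 0;
    while ((len = fread(buf, 1, sizeof buf, f)) > 0)
        *total += len;      /* use what was actually read */
    return ferror(f) ? -1 : 0;
}
```

A return of 0 from fread() ends the loop; ferror() then tells a normal end of file apart from a real I/O error. No fseek() or ftell() call is needed, so the same loop should also work on streams that cannot seek.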

When you call SHA1(), you use strlen() on your buffer, which is bogus. strlen() returns the length of a character string; in plain words, the number of bytes until the next byte of value zero. Many files contain bytes of value 0. And if the file does not, then there is no guarantee that your buffer contains any byte of value 0, so that the call to strlen() may end up reading memory outside of the allocated buffer (this is bad). Since you went to the trouble of obtaining the file length and allocating a buffer that big, you should at least use that length instead of trying to recompute it with a function which does not do that.
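To make the strlen() problem concrete, the following sketch reports how many bytes strlen() would effectively "see" in a binary buffer of known size. The function visible_to_strlen() is a hypothetical name introduced only for illustration; it uses memchr() so that the scan stays inside the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical illustration: the number of bytes before the first zero
 * byte in a buffer of known size, which is what strlen() would report.
 * memchr() is bounded by `size`, so unlike strlen() it never reads past
 * the end of the buffer when no zero byte is present.
 */
static size_t visible_to_strlen(const unsigned char *buf, size_t size)
{
    const unsigned char *zero;

    assert(buf != NULL || size == 0);
    zero = size ? memchr(buf, 0, size) : NULL;
    return zero ? (size_t)(zero - buf) : size;
}
```

For sample bytes such as { 'M', 'Z', 0x90, 0x00, 0x03, 0x00 }, this returns 3 although the buffer holds 6 bytes. That would be consistent with the symptom in the question, where only the first 2-3 bytes of many files were hashed and different files produced the same digest. Passing the byte count obtained when reading the file avoids the issue.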

To sum up: your code should look like this (untested):

#include <stdio.h>
#include <openssl/sha.h>

/*
 * Hash the file whose name is given. The hash output is written to
 * the buffer "out[]" and consists of exactly 20 bytes.
 * On success, 0 is returned; on error, -1 is returned and
 * out[] is unaltered.
 */
int
do_sha1_file(const char *name, unsigned char *out)
{
    FILE *f;
    unsigned char buf[8192];
    SHA_CTX sc;
    int err;

    f = fopen(name, "rb");
    if (f == NULL) {
        /* do something smart here: the file could not be opened */
        return -1;
    }
    SHA1_Init(&sc);
    for (;;) {
        size_t len;

        len = fread(buf, 1, sizeof buf, f);
        if (len == 0)
            break;
        SHA1_Update(&sc, buf, len);
    }
    err = ferror(f);
    fclose(f);
    if (err) {
        /* some I/O error was encountered; report the error */
        return -1;
    }
    SHA1_Final(out, &sc);
    return 0;
}

And do not forget to include the relevant headers: <stdio.h>, and <openssl/sha.h> from OpenSSL.
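Since the function above leaves the digest as 20 raw bytes, a caller will usually want to print it as hex, as the question's code does. Here is a minimal sketch of such a conversion; bytes_to_hex() is a hypothetical helper, not something provided by OpenSSL or by the answer.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: write `len` bytes as lowercase hex into `hex`,
 * which must have room for 2 * len + 1 characters (including the
 * terminating NUL).
 */
static void bytes_to_hex(const unsigned char *in, size_t len, char *hex)
{
    static const char digits[] = "0123456789abcdef";
    size_t i;

    assert(in != NULL && hex != NULL);
    for (i = 0; i < len; i++) {
        hex[2 * i]     = digits[in[i] >> 4];    /* high nibble */
        hex[2 * i + 1] = digits[in[i] & 0x0f];  /* low nibble */
    }
    hex[2 * len] = '\0';
}
```

After a successful call to do_sha1_file(name, out), a caller could declare char hex[41], call bytes_to_hex(out, 20, hex), and print the result with puts(hex).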
