File Checksum Performance Comparison using .NET Framework 4.0


Problem description

Hi guys. I'm a newbie. I came across this on the Web. Basically, you need to compare and test the general performance of hash algorithms, i.e. MD5 and SHA1, using .NET Framework 4.0.
Here's how it goes:
1. Write a method that accepts the path to a file as a parameter.
2. Open a stream to read the file and compute the hash.
3. Convert the result of the hash to a string via BitConverter.
4. Clone the method for each algorithm being tested: MD5 and SHA1.
5. Test each algorithm against 50K, 100K, 500K, 1MB, and 5MB files.
6. Calculate the checksum for the given file 1000 times to obtain a reasonable measurement.
7. Compile the test in Release mode on .NET Framework 4.0.
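
Steps 1 to 4 above could be sketched roughly as follows. This is only an illustration under assumptions: the class name FileChecksum and the method names GetMd5 and GetSha1 are made up for this example and do not come from the original post. It sticks to APIs available in .NET Framework 4.0.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

// Hypothetical helper class; the names are illustrative only.
public static class FileChecksum
{
    // Steps 1-3: accept a file path, open a read stream, compute the hash,
    // and convert the hash bytes to a string with BitConverter.
    public static string GetMd5(string path)
    {
        using (FileStream stream = File.OpenRead(path))
        using (MD5 md5 = MD5.Create())
        {
            byte[] hash = md5.ComputeHash(stream);
            return BitConverter.ToString(hash); // dash-separated hex pairs
        }
    }

    // Step 4: the same method cloned for SHA1.
    public static string GetSha1(string path)
    {
        using (FileStream stream = File.OpenRead(path))
        using (SHA1 sha1 = SHA1.Create())
        {
            byte[] hash = sha1.ComputeHash(stream);
            return BitConverter.ToString(hash);
        }
    }
}
```

A call such as FileChecksum.GetMd5(path) returns the digest as dash-separated hexadecimal pairs; calling Replace("-", "") on the result gives the compact form most checksum tools print.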

So, I'm wondering whether the above test/program is feasible, practical and realistic, and how to go about executing it.

Appreciate it if anyone can shed some light on this.

Many thanks.
Giggsey

Recommended answer

If you are new to programming, this is not the place to start. Beyond that, I'm not sure I really see the point. However, if you want to do something like this, there is an open source application that does it. You could look at the code to see how it works and what else it does:

http://hashlib.codeplex.com/

If you want to attempt this yourself, yes, it is feasible. I really don't see a high level of practicality, but it is mostly realistic. As for how to actually write this code, I think you have a good start already. You've got the application broken down into logical pieces. Break these pieces down further into individual methods. Then, implement each method. That way, when you get stuck, it will be on a small piece, not an entire application. That will make getting help easier.


This is not just feasible, it is trivial with .NET.

You did not explain how you want to use MD5 and SHA1 other than for performance testing, so I want to start with the following warning: both MD5 and SHA1 were developed as cryptographic hash functions, but both have been found to be broken. Neither of these functions should be used for cryptographic or security-related purposes. Please see:
http://en.wikipedia.org/wiki/Cryptographic_hash_function,
http://en.wikipedia.org/wiki/MD5,
http://en.wikipedia.org/wiki/SHA-1.

I would advise using the cryptographic hash functions of the SHA-2 family. Please see:
http://en.wikipedia.org/wiki/SHA2.

Now, the implementations are readily available in .NET, as they are bundled with each version of the Framework and placed in the GAC. Please see:
http://msdn.microsoft.com/en-us/library/system.security.cryptography.hashalgorithm.aspx.
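
Because MD5, SHA1 and the SHA-2 classes all derive from HashAlgorithm, a single method can serve every algorithm instead of one clone per algorithm. The following is a minimal sketch of that idea; the class name GenericChecksum and the method name Compute are assumptions made for illustration.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

// Hypothetical helper; only HashAlgorithm and its subclasses are real Framework types.
public static class GenericChecksum
{
    // Works with any HashAlgorithm instance: MD5, SHA1, SHA256, and so on.
    public static byte[] Compute(string path, HashAlgorithm algorithm)
    {
        using (FileStream stream = File.OpenRead(path))
        {
            return algorithm.ComputeHash(stream);
        }
    }
}
```

For example, passing an instance obtained from SHA256.Create() yields a 32-byte digest, while MD5.Create() yields 16 bytes and SHA1.Create() yields 20. The caller owns the algorithm instance and should dispose of it when done.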

So, the main part of your problem is solved. It is not clear why you need to use BitConverter. You really never need it for this task. All algorithms work with arrays of bytes, on input and on output. To produce human-readable text, you can output each byte using byte.ToString and read it back using byte.Parse or byte.TryParse. You can also store or load any data structure (actually, any arbitrary object graph with objects of arbitrary structure) in an XML file, preferably using Data Contract. Please see:
http://msdn.microsoft.com/en-us/library/ms733127.aspx.
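
The byte.ToString / byte.Parse approach might look like the sketch below. The class name HexText and the methods ToHex and FromHex are hypothetical, and the choice of a lowercase two-digit hexadecimal format ("x2") is an assumption; the answer does not say which format to use.

```csharp
using System;
using System.Globalization;
using System.Text;

// Hypothetical helper for turning hash bytes into text and back.
public static class HexText
{
    // Writes each byte as two lowercase hex digits.
    public static string ToHex(byte[] bytes)
    {
        StringBuilder sb = new StringBuilder(bytes.Length * 2);
        foreach (byte b in bytes)
        {
            sb.Append(b.ToString("x2", CultureInfo.InvariantCulture));
        }
        return sb.ToString();
    }

    // Reads the text back, two hex digits per byte.
    public static byte[] FromHex(string hex)
    {
        if (hex.Length % 2 != 0)
        {
            throw new ArgumentException("Expected an even number of hex digits.", "hex");
        }
        byte[] result = new byte[hex.Length / 2];
        for (int i = 0; i < result.Length; i++)
        {
            result[i] = byte.Parse(hex.Substring(i * 2, 2),
                                   NumberStyles.HexNumber,
                                   CultureInfo.InvariantCulture);
        }
        return result;
    }
}
```

The invariant culture is passed explicitly so that the output does not depend on the machine's regional settings.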

Now, for reading arbitrary data files to be hashed (probably, you only need reading), you need to read them as binary data, which you can do with the class System.IO.BinaryReader:
http://msdn.microsoft.com/en-us/library/system.io.binaryreader.aspx,
http://msdn.microsoft.com/en-us/library/system.io.binaryreader.readbytes.aspx (this is the main method you should use).

To read the whole file at once with this method, you need to know the file size. You get it by taking the reader's base stream from the property BaseStream and reading its property System.IO.Stream.Length. Not every stream supports this property, but for a file stream obtained from the reader it returns the correct value. Please see:
http://msdn.microsoft.com/en-us/library/system.io.binaryreader.basestream.aspx,
http://msdn.microsoft.com/en-us/library/system.io.stream.aspx.
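
A rough sketch of reading a whole file through BinaryReader, as described above. The class name BinaryFileReader and the method name ReadAll are made up for this example. Note the limitation that ReadBytes takes an int count, so this sketch only handles files up to int.MaxValue bytes.

```csharp
using System;
using System.IO;

// Hypothetical helper; illustrates BaseStream.Length plus ReadBytes.
public static class BinaryFileReader
{
    public static byte[] ReadAll(string path)
    {
        using (BinaryReader reader = new BinaryReader(File.OpenRead(path)))
        {
            // FileStream is seekable, so Length is supported here.
            long length = reader.BaseStream.Length;

            // ReadBytes expects an int; checked() throws OverflowException
            // for files larger than int.MaxValue instead of truncating.
            return reader.ReadBytes(checked((int)length));
        }
    }
}
```

Disposing the reader also closes the underlying file stream, so no separate cleanup is needed.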



Another, easier way of reading all bytes from any file is the method System.IO.File.ReadAllBytes:
http://msdn.microsoft.com/en-us/library/system.io.file.readallbytes.aspx.
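
With File.ReadAllBytes the read-then-hash step shrinks to a few lines. The sketch below is illustrative only; the class name InMemoryChecksum and the method name Sha1OfFile are assumptions. It loads the whole file into memory, which is fine for the 50K to 5MB files in the question but may not suit very large files.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

// Hypothetical helper; hashes a file after loading it fully into memory.
public static class InMemoryChecksum
{
    public static byte[] Sha1OfFile(string path)
    {
        byte[] data = File.ReadAllBytes(path);
        using (SHA1 sha1 = SHA1.Create())
        {
            // The byte[] overload hashes the buffer without a stream.
            return sha1.ComputeHash(data);
        }
    }
}
```

Separating the read from the hash this way also makes it possible to time the hash computation alone, without file I/O.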

[END EDIT]

And, finally, to get the most precise timing, you need to use the class System.Diagnostics.Stopwatch:
http://msdn.microsoft.com/en-us/library/system.diagnostics.stopwatch.aspx.

This is how you can assess what accuracy you can rely upon:
http://msdn.microsoft.com/en-us/library/system.diagnostics.stopwatch.ishighresolution.aspx,
http://msdn.microsoft.com/en-us/library/system.diagnostics.stopwatch.frequency.aspx.

See also on timing:
http://msdn.microsoft.com/en-us/library/system.diagnostics.stopwatch.elapsed.aspx,
http://msdn.microsoft.com/en-us/library/system.timespan.aspx.
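
Putting the timing pieces together, a benchmark loop could look something like this. It is a sketch under assumptions: the class name ChecksumBenchmark and the methods Measure and DescribeTimer are hypothetical, and the single warm-up pass before timing is a design choice of this example, not something the answer prescribes.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Security.Cryptography;

// Hypothetical benchmark helper built on Stopwatch.
public static class ChecksumBenchmark
{
    // Hashes the file 'iterations' times and returns the total elapsed time.
    public static TimeSpan Measure(string path, HashAlgorithm algorithm, int iterations)
    {
        HashOnce(path, algorithm); // warm-up: JIT compilation and OS file cache

        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            HashOnce(path, algorithm);
        }
        watch.Stop();
        return watch.Elapsed;
    }

    private static void HashOnce(string path, HashAlgorithm algorithm)
    {
        using (FileStream stream = File.OpenRead(path))
        {
            algorithm.ComputeHash(stream);
        }
    }

    // Reports what timer resolution the measurements rely on.
    public static string DescribeTimer()
    {
        return string.Format("High resolution: {0}, ticks per second: {1}",
                             Stopwatch.IsHighResolution, Stopwatch.Frequency);
    }
}
```

Calling Measure(path, md5, 1000) and dividing Elapsed.TotalMilliseconds by 1000 gives an average per checksum. Keep in mind that the measured time includes opening and reading the file, so after the first pass it mostly reflects reads from the operating system's cache rather than from disk.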

This is all you need to know. Everything else is too trivial to discuss, but, if you face any problems, you are welcome to ask a follow-up question.

—SA

