Generate hash of object consistently

Question

I'm trying to get a hash (MD5 or SHA) of an object.

I've implemented this: http://alexmg.com/post/2009/04/16/Compute-any-hash-for-any-object-in-C.aspx

I'm using NHibernate to retrieve my POCOs from a database.
When running GetHash on one of them, the result is different each time the object is selected and hydrated from the database. I guess this is expected, as the underlying proxies will change.

Anyway,

Is there a way to get a hash of all the properties on an object, consistently each time?

I've toyed with the idea of using a StringBuilder over this.GetType().GetProperties..... and creating a hash on that, but that seems inefficient?

As a side note, this is for change-tracking these entities from one database (RDBMS) to a NoSQL store (comparing hash values to see if objects changed between the RDBMS and the NoSQL store).

Recommended answer

If you're not overriding GetHashCode you just inherit Object.GetHashCode. For a reference type, Object.GetHashCode returns a value tied to the identity of the instance, not to its contents. It is often described as the memory address of the instance, although the runtime actually assigns an arbitrary per-instance value, since the garbage collector can move objects around. Either way, each time an object is loaded from the database a new instance is created, and thus you will very likely get a different hash code.

It's debatable whether that's the correct thing to do; but that's what was implemented "back in the day" so it can't change now.

If you want something consistent then you have to override GetHashCode and create a code based on the "value" of the object (i.e. the properties and/or fields). This can be as simple as combining the hash codes of all the properties/fields in a way that keeps the result well distributed. Or it could be as complicated as you need it to be. If all you're looking for is something to differentiate two different objects, then using a unique key on the object might work for you. If you're looking for change tracking, using the unique key for the hash probably isn't going to work, because the key stays the same when the other values change.

I simply use all the hash codes of the fields to create a reasonably distributed hash code for the parent object. For example:

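public override int GetHashCode()
{
    // unchecked: let the multiplication overflow silently instead of throwing
    unchecked
    {
        // Start from the first field's hash (0 when the field is null)
        int result = (Name != null ? Name.GetHashCode() : 0);
        // Fold in each remaining field: multiply by a prime, then XOR
        result = (result*397) ^ (Street != null ? Street.GetHashCode() : 0);
        result = (result*397) ^ Age;
        return result;
    }
}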

Multiplying by the prime number 397 at each step mixes the fields' hash codes together so that the resulting codes are better distributed; it doesn't make them unique. See http://computinglife.wordpress.com/2008/11/20/why-do-hash-functions-use-prime-numbers/ for more details on the use of primes in hash code calculations.
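To show the method above in context, here is a minimal sketch of a complete class. The Person type and its three properties are assumptions inferred from the field names in the snippet, and the Equals override is added only because GetHashCode and Equals should agree with each other.

```csharp
using System;

// Hypothetical entity; the property names mirror the fields used in the answer's snippet.
public sealed class Person : IEquatable<Person>
{
    public string Name { get; set; }
    public string Street { get; set; }
    public int Age { get; set; }

    // Two Person objects are equal when all of their values are equal.
    public bool Equals(Person other)
    {
        return !ReferenceEquals(other, null)
            && Name == other.Name
            && Street == other.Street
            && Age == other.Age;
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as Person);
    }

    // Value-based hash: depends only on the property values, not on the instance.
    public override int GetHashCode()
    {
        unchecked
        {
            int result = (Name != null ? Name.GetHashCode() : 0);
            result = (result * 397) ^ (Street != null ? Street.GetHashCode() : 0);
            result = (result * 397) ^ Age;
            return result;
        }
    }
}
```

With this in place, two separately created instances holding the same values return the same hash code within one running process. Note that string.GetHashCode is not guaranteed to be stable across processes or runtime versions, so a code built this way is probably not suitable for storing and comparing later, which is what the change-tracking scenario in the question needs.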

You could, of course, use reflection to get at all the properties to do this, but that would be slower. Alternatively, you could use the CodeDOM to generate the hashing code dynamically, based on reflecting over the properties, and cache that code (i.e. generate it once and reload it next time). But this is, of course, very complex and might not be worth the effort.
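As a rough illustration of the reflection route, the sketch below caches the property list per type instead of generating code with the CodeDOM, which is a simpler technique than the one described above. The class and method names (ReflectionHasher, GetValueHashCode) are made up for this example. It also assumes obj.GetType() is the type whose properties you want hashed; with a proxied entity that may be the proxy's type, so treat this as a starting point only.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Reflection;

// Hypothetical helper, not part of any library.
public static class ReflectionHasher
{
    // Cache the readable public properties per type so the reflection
    // lookup runs once per type rather than on every call.
    private static readonly ConcurrentDictionary<Type, PropertyInfo[]> Cache =
        new ConcurrentDictionary<Type, PropertyInfo[]>();

    public static int GetValueHashCode(object obj)
    {
        if (obj == null) throw new ArgumentNullException(nameof(obj));

        PropertyInfo[] props = Cache.GetOrAdd(obj.GetType(), t =>
            t.GetProperties(BindingFlags.Public | BindingFlags.Instance)
             .Where(p => p.CanRead && p.GetIndexParameters().Length == 0)
             // GetProperties does not guarantee an order, so sort by name.
             .OrderBy(p => p.Name, StringComparer.Ordinal)
             .ToArray());

        unchecked
        {
            int result = 17;
            foreach (PropertyInfo p in props)
            {
                object value = p.GetValue(obj);
                // Same multiply-and-XOR combination as the hand-written version.
                result = (result * 397) ^ (value != null ? value.GetHashCode() : 0);
            }
            return result;
        }
    }
}
```

Reading the values through PropertyInfo.GetValue is still slower than hand-written code; the cache only removes the repeated property lookup.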

An MD5 or SHA hash or CRC is generally based on a block of data. If you want that, then using the hash code of each property doesn't make sense. Possibly serializing the data to memory and calculating the hash that way would be more applicable, as Henk describes.
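One way to read "serializing the data to memory" is sketched below: write the values into a string in a fixed order, encode it as UTF-8 bytes, and hash those bytes with SHA-256. The Person type, the ContentHasher class and the length-prefixed text format are all assumptions made for this example; a real serializer could take the place of the hand-built string, as long as its output is deterministic.

```csharp
using System;
using System.Globalization;
using System.Security.Cryptography;
using System.Text;

// Hypothetical entity used only for this example.
public sealed class Person
{
    public string Name { get; set; }
    public string Street { get; set; }
    public int Age { get; set; }
}

// Hypothetical helper, not part of any library.
public static class ContentHasher
{
    // Returns a lowercase hex SHA-256 digest of the object's values.
    public static string ComputeSha256(Person p)
    {
        if (p == null) throw new ArgumentNullException(nameof(p));

        // Fixed field order and invariant culture keep the bytes identical
        // on every machine and in every process.
        var sb = new StringBuilder();
        Append(sb, "Name", p.Name);
        Append(sb, "Street", p.Street);
        Append(sb, "Age", p.Age.ToString(CultureInfo.InvariantCulture));

        byte[] data = Encoding.UTF8.GetBytes(sb.ToString());
        using (SHA256 sha = SHA256.Create())
        {
            byte[] hash = sha.ComputeHash(data);
            return BitConverter.ToString(hash).Replace("-", "").ToLowerInvariant();
        }
    }

    // Length-prefix each value so that adjacent fields cannot run together,
    // and keep null distinct from the empty string.
    private static void Append(StringBuilder sb, string name, string value)
    {
        sb.Append(name).Append('=');
        if (value == null)
        {
            sb.Append("null;");
        }
        else
        {
            sb.Append(value.Length.ToString(CultureInfo.InvariantCulture))
              .Append(':').Append(value).Append(';');
        }
    }
}
```

Because the digest depends only on the bytes produced from the values, it can be stored next to the copy in the NoSQL store and compared later, which is not safe to do with GetHashCode results.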
