Comment on how to uniquely track your objects in C# / hash table / get hash code


Problem description


A quick sanity check, and I think I am correct, but just to make
sure: if you have a bunch of objects that are very much like one
another you can uniquely track them simply by using an ArrayList or
Array, correct? An example: create the object, create an array, then
stuff the object into the array. Later on, assume the object is
mutable, the object changes, but you can find it, if you have enough
state information to uniquely identify it (which ideally is being
stored on the object itself), delete it, etc. using the state
information for that object. But to find the object in the array or
array list, you must traverse the entire list, foreach, worst case,
O(n) or something like that in big O notation, right?

Now this method is foolproof but slow, no? Especially if you're going
to be sorting and doing lots of finding. So for this reason, we use
hash tables to speed up searching and sorting of objects, and to
"find" or "contains" objects, yes?

But, as pointed out in the links below, hash functions can be
non-unique, or give collisions with two different objects having the same
hash (rare, but happens), and further you have to write your own
GetHashCode, such as if you override the Equals implementation for
objects in an ArrayList for example.

I've seen examples on overriding GetHashCode and understand how it
works (kind of, at least it made sense), but my question/comment is
that if you're not concerned with performance, then a simple
index-based retrieval scheme is best.

Correct me if I'm wrong.

RL

http://www.interact-sw.co.uk/iangblo...21/gethashcode (good
rant on how hash is not unique)

http://msdn.microsoft.com/en-us/libr...thashcode.aspx
(how to overload gethash and other stuff for your classes, since you
cannot rely on the default implementations)

Solution

raylopez99 wrote:

> A quick sanity check, and I think I am correct, but just to make
> sure: if you have a bunch of objects that are very much like one
> another you can uniquely track them simply by using an ArrayList or
> Array, correct? An example: create the object, create an array, then
> stuff the object into the array. Later on, assume the object is
> mutable, the object changes, but you can find it, if you have enough
> state information to uniquely identify it (which ideally is being
> stored on the object itself), delete it, etc. using the state
> information for that object. But to find the object in the array or
> array list, you must traverse the entire list, foreach, worst case,
> O(n) or something like that in big O notation, right?
>
> Now this method is foolproof but slow, no? Especially if you're going
> to be sorting and doing lots of finding. So for this reason, we use
> hash tables to speed up searching and sorting of objects, and to
> "find" or "contains" objects, yes?

Yes.

> But, as pointed out in the links below, hash functions can be
> non-unique, or give collisions with two different objects having the same
> hash (rare, but happens), and further you have to write your own
> GetHashCode, such as if you override the Equals implementation for
> objects in an ArrayList for example.
>
> I've seen examples on overriding GetHashCode and understand how it
> works (kind of, at least it made sense), but my question/comment is
> that if you're not concerned with performance, then a simple
> index-based retrieval scheme is best.
>
> Correct me if I'm wrong.

If you had to write your own Hashtable or Dictionary<> class, then
there would be a point.

But writing a GetHashCode method is not that difficult, so in case
you need to look up by something, then I will recommend Dictionary<>
anyway.

Arne
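
To make the trade-off in this exchange concrete, here is a minimal sketch of both approaches. The Widget type, its Id and Label members, and the LookupDemo helper are hypothetical names invented for illustration; nothing here comes from the original posts beyond the idea of overriding Equals and GetHashCode and comparing a list scan with a Dictionary<> lookup.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical example type: equality and hashing depend only on an immutable Id.
public sealed class Widget : IEquatable<Widget>
{
    public int Id { get; }              // immutable, used for Equals/GetHashCode
    public string Label { get; set; }   // mutable state, deliberately not hashed

    public Widget(int id, string label)
    {
        Id = id;
        Label = label;
    }

    public bool Equals(Widget other) => other != null && other.Id == Id;
    public override bool Equals(object obj) => Equals(obj as Widget);
    public override int GetHashCode() => Id.GetHashCode();
}

public static class LookupDemo
{
    // O(n): scans the list until a matching Id is found; returns null if none.
    public static Widget FindInList(List<Widget> items, int id)
        => items.Find(w => w.Id == id);

    // O(1) on average: the dictionary hashes the key to locate its bucket.
    public static Widget FindInDictionary(Dictionary<int, Widget> byId, int id)
        => byId.TryGetValue(id, out var w) ? w : null;
}
```

Both helpers return the same object for the same Id; the difference is only in how much work the lookup does as the collection grows.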


Also be aware that the fact you want to mutate your object can cause
problems in the dictionary/whatever container, if the fields used to
calculate the hash are mutable themselves.

For instance, say your object is a class with only a string in it, and
the hashcode returned by the object is the string's hashcode. Store
that object in the dictionary, change the string, and your object is
lost in the dictionary (that is, you cannot find it anymore, as the
lookup is based on the new hashcode).

In short, all the fields involved in calculating the hashcode must be
immutable (that's why the .NET framework uses the internal reference
number of the object: it's not ideal in terms of spreading across the
integer range, but it's immutable).

If all of your fields can mutate, then you're pretty much stuck with
the ArrayList solution, and the manual lookups, and the O(n). But if
you're not concerned with performance, then go ahead!

Michel
On 5 août, 01:18, Arne Vajhøj <a...@vajhoej.dk> wrote:

> raylopez99 wrote:
>
>> A quick sanity check, and I think I am correct, but just to make
>> sure: if you have a bunch of objects that are very much like one
>> another you can uniquely track them simply by using an ArrayList or
>> Array, correct? An example: create the object, create an array, then
>> stuff the object into the array. Later on, assume the object is
>> mutable, the object changes, but you can find it, if you have enough
>> state information to uniquely identify it (which ideally is being
>> stored on the object itself), delete it, etc. using the state
>> information for that object. But to find the object in the array or
>> array list, you must traverse the entire list, foreach, worst case,
>> O(n) or something like that in big O notation, right?
>>
>> Now this method is foolproof but slow, no? Especially if you're going
>> to be sorting and doing lots of finding. So for this reason, we use
>> hash tables to speed up searching and sorting of objects, and to
>> "find" or "contains" objects, yes?
>
> Yes.
>
>> But, as pointed out in the links below, hash functions can be
>> non-unique, or give collisions with two different objects having the same
>> hash (rare, but happens), and further you have to write your own
>> GetHashCode, such as if you override the Equals implementation for
>> objects in an ArrayList for example.
>>
>> I've seen examples on overriding GetHashCode and understand how it
>> works (kind of, at least it made sense), but my question/comment is
>> that if you're not concerned with performance, then a simple
>> index-based retrieval scheme is best.
>>
>> Correct me if I'm wrong.
>
> If you had to write your own Hashtable or Dictionary<> class, then
> there would be a point.
>
> But writing a GetHashCode method is not that difficult, so in case
> you need to look up by something, then I will recommend Dictionary<>
> anyway.
>
> Arne
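
The lost-key problem Michel describes can be reproduced in a few lines. This is only a sketch: the Named class and the MutationDemo helper are hypothetical, and the class mirrors his example of an object whose hash code is simply the hash code of a mutable string.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical class whose hash code depends on a mutable string property.
public sealed class Named
{
    public string Name { get; set; }
    public Named(string name) { Name = name; }

    public override bool Equals(object obj) => obj is Named n && n.Name == Name;
    public override int GetHashCode() => Name.GetHashCode();
}

public static class MutationDemo
{
    // Reports whether the dictionary can still find a key after the key's
    // hashed field has been changed.
    public static bool StillFoundAfterMutation()
    {
        var item = new Named("alpha");
        var map = new Dictionary<Named, int> { [item] = 1 };

        item.Name = "beta";            // hash code no longer matches the one stored at insertion
        return map.ContainsKey(item);  // the lookup uses the new hash code
    }
}
```

StillFoundAfterMutation() should return false (barring an accidental hash collision between the two strings), even though the dictionary still holds one entry and the very same instance is passed to ContainsKey. Keeping the hashed fields read-only, as in an immutable Id, avoids this.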


On Aug 5, 2:25 am, raylopez99 <raylope...@yahoo.com> wrote:

> A quick sanity check, and I think I am correct, but just to make
> sure: if you have a bunch of objects that are very much like one
> another you can uniquely track them simply by using an ArrayList or
> Array, correct?

You can uniquely track an instance of any reference type, no matter
what data is inside, because it has identity, which can be compared
by using Object.ReferenceEquals(). It is this identity that
serves as the basis for the default implementations of GetHashCode() and
Equals(), and more often than not it is good enough.
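
A small sketch of what identity-based equality means in practice. Point and IdentityDemo are hypothetical names; the only assumption is a class that does not override Equals or GetHashCode, so it uses the defaults inherited from Object.

```csharp
using System;

// Hypothetical type with no Equals/GetHashCode overrides:
// both fall back to the defaults inherited from Object.
public sealed class Point
{
    public int X;
    public int Y;
}

public static class IdentityDemo
{
    // True only when both references point at the very same instance.
    public static bool SameInstance(object a, object b)
        => Object.ReferenceEquals(a, b);

    // The default hash code is tied to the instance, not to its fields,
    // so mutating the object leaves it unchanged.
    public static bool HashSurvivesMutation()
    {
        var p = new Point { X = 1, Y = 2 };
        int before = p.GetHashCode();
        p.X = 99;
        return before == p.GetHashCode();
    }
}
```

Two Point objects holding the same X and Y are still distinct under these defaults, which is why mutable objects without overrides can be stored in a hash-based collection without getting lost.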

> But, as pointed out in the links below, hash functions can be
> non-unique, or give collisions with two different objects having the same
> hash (rare, but happens),

This is largely irrelevant to the issue of performance, since hash
tables work correctly even with collisions - just slower. You're still
guaranteed to have O(1) on average on random input, though
pathological cases can become O(n).
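
To illustrate that collisions affect speed rather than correctness, here is a sketch with a deliberately pathological hash. BadKey and CollisionDemo are hypothetical; the constant 42 is arbitrary.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical key with the worst possible hash: every instance collides.
public sealed class BadKey
{
    public int Value { get; }
    public BadKey(int value) { Value = value; }

    public override bool Equals(object obj) => obj is BadKey k && k.Value == Value;
    public override int GetHashCode() => 42; // legal, but all keys land in one bucket
}

public static class CollisionDemo
{
    // Fills a dictionary with 'count' colliding keys.
    public static Dictionary<BadKey, string> Build(int count)
    {
        var map = new Dictionary<BadKey, string>();
        for (int i = 0; i < count; i++)
            map[new BadKey(i)] = "item" + i;
        return map;
    }
}
```

Every lookup in such a dictionary still returns the right value because Equals resolves the collision; the cost is that each lookup degrades to a linear walk through the single bucket, which is the O(n) pathological case mentioned above.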

> I've seen examples on overriding GetHashCode and understand how it
> works (kind of, at least it made sense), but my question/comment is
> that if you're not concerned with performance, then a simple
> index-based retrieval scheme is best.

Maintaining a hash table also has some additional overhead, both on
insertion and on retrieval (computing hashes, balancing buckets, etc.).
In practice, particularly for small collections (roughly ~100 items),
where both insertions and lookups happen frequently, a plain list with
linear search can be more efficient than a hash table both in terms of
speed and memory. For larger collections, a hash table is worth
considering - though even then, if inserts are frequent and lookups are
very rare, a list is probably still better (but that is not the usual case).

On the other hand, if you fill the collection once, and then just do
lookups, never modifying it, then the most efficient implementation is
pre-sorting the collection and doing binary search on it
(Array.BinarySearch or List<T>.BinarySearch, as needed - but if a
collection is immutable, you might as well make it an array for the
slight performance increase that gives).
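
The fill-once, look-up-many pattern suggested here might look like the following sketch. SortedLookupDemo and its method names are hypothetical; Array.Sort and Array.BinarySearch are the standard framework calls.

```csharp
using System;

public static class SortedLookupDemo
{
    // Sorts a copy once so the caller's array is left untouched;
    // the result can then be searched repeatedly.
    public static int[] Prepare(int[] values)
    {
        var sorted = (int[])values.Clone();
        Array.Sort(sorted);
        return sorted;
    }

    // Array.BinarySearch returns the index when the value is found
    // and a negative number otherwise. Each lookup is O(log n).
    public static bool Contains(int[] sorted, int value)
        => Array.BinarySearch(sorted, value) >= 0;
}
```

The array must stay sorted for the search to be valid, which is why this only suits collections that are not modified after the initial fill.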

