C# Binary Trees and Dictionaries


Question

I'm struggling with the concept of when to use binary search trees and when to use dictionaries.

In my application I did a little experiment which used the C5 library TreeDictionary (which I believe is a red-black binary search tree), and the C# dictionary. The dictionary was always faster at add/find operations and also always used less memory space. For example, at 16809 <int, float> entries, the dictionary used 342 KiB whilst the tree used 723 KiB.

I thought that BSTs were supposed to be more memory efficient, but it seems that one node of the tree requires more bytes than one entry in a dictionary. What gives? Is there a point at which BSTs are better than dictionaries?

Also, as a side question, does anyone know if there is a faster + more memory efficient data structure for storing <int, float> pairs for dictionary type access than either of the mentioned structures?

Answer

I thought that BSTs were supposed to be more memory efficient, but it seems that one node of the tree requires more bytes than one entry in a dictionary. What gives? Is there a point at which BSTs are better than dictionaries?

I've personally never heard of such a principle. Even if it exists, it's only a general rule of thumb, not a categorical fact etched in the fabric of the universe.

Generally, dictionaries are really just a fancy wrapper around an array of linked lists (a hash table with chaining). Inserting into the dictionary looks something like:

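// Sketch of insertion into a chained hash table (illustrative, not the actual .NET source).
int bucket = (key.GetHashCode() & 0x7FFFFFFF) % internalArray.Length; // non-negative index
LinkedList<Tuple<TKey, TValue>> list = internalArray[bucket];
if (list.Any(x => EqualityComparer<TKey>.Default.Equals(x.Item1, key))) // needs System.Linq
    throw new ArgumentException("Key already exists");
list.AddLast(Tuple.Create(key, value));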

So it's a nearly O(1) operation, as long as the hash function spreads keys evenly across the buckets. The dictionary uses O(internalArray.Length + n) memory, where n is the number of items in the collection.
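To make the insertion logic above concrete, here is a minimal, self-contained sketch of a chained hash table. `ChainedDictionary` is a hypothetical teaching class, not the real `Dictionary<TKey, TValue>`, and it assumes a fixed bucket count with no resizing.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical teaching class: a minimal chained hash table.
// It is NOT how Dictionary<TKey, TValue> is actually implemented.
public class ChainedDictionary<TKey, TValue>
{
    private readonly LinkedList<KeyValuePair<TKey, TValue>>[] buckets;
    private readonly IEqualityComparer<TKey> comparer = EqualityComparer<TKey>.Default;

    public int Count { get; private set; }

    // Assumption: the bucket count is fixed; a real table would grow as it fills.
    public ChainedDictionary(int bucketCount = 16)
    {
        buckets = new LinkedList<KeyValuePair<TKey, TValue>>[bucketCount];
    }

    // Map a hash code to a non-negative bucket index.
    private int BucketIndex(TKey key)
    {
        return (comparer.GetHashCode(key) & 0x7FFFFFFF) % buckets.Length;
    }

    public void Add(TKey key, TValue value)
    {
        int index = BucketIndex(key);
        if (buckets[index] == null)
            buckets[index] = new LinkedList<KeyValuePair<TKey, TValue>>();

        // Walk the chain to reject duplicate keys.
        foreach (var pair in buckets[index])
            if (comparer.Equals(pair.Key, key))
                throw new ArgumentException("Key already exists");

        buckets[index].AddLast(new KeyValuePair<TKey, TValue>(key, value));
        Count++;
    }

    public bool TryGetValue(TKey key, out TValue value)
    {
        var list = buckets[BucketIndex(key)];
        if (list != null)
        {
            foreach (var pair in list)
            {
                if (comparer.Equals(pair.Key, key))
                {
                    value = pair.Value;
                    return true;
                }
            }
        }
        value = default(TValue);
        return false;
    }
}
```

Each operation hashes once and then walks a single chain, so the cost stays close to constant while the chains stay short.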

In general BSTs can be implemented as:

  • linked nodes (each node holds references to its children), which use O(n) space, where n is the number of items in the collection.
  • arrays, which need about 2^h slots, where h is the height of the tree, so roughly 2^h - n slots sit empty when the collection holds n items.
    • A red-black tree with n items has height at most 2 * log2(n + 1), so in the worst case 2^h grows to about (n + 1)^2, and an array implementation can need on the order of n^2 slots for n items.
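The space figures for the array layout can be checked with a little arithmetic. The sketch below assumes a heap-style layout (root at index 0, children of index i at 2i + 1 and 2i + 2); the answer does not say which array layout it has in mind, so treat that and the `ImplicitTreeLayout` name as assumptions.

```csharp
using System;

// Hypothetical helper for reasoning about an array-backed (implicit) binary tree.
public static class ImplicitTreeLayout
{
    // Children of the node stored at index i in a heap-style array layout.
    public static int LeftChild(int i) { return 2 * i + 1; }
    public static int RightChild(int i) { return 2 * i + 2; }

    // Slots needed to hold any tree with the given number of levels
    // (the root alone counts as 1 level).
    public static long SlotsForLevels(int levels) { return (1L << levels) - 1; }

    // Slots left empty when only n nodes are actually stored.
    public static long WastedSlots(int levels, long n)
    {
        return SlotsForLevels(levels) - n;
    }
}
```

For example, a tree with 4 levels that holds only 5 items still reserves 15 slots, leaving 10 unused.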

Odds are, the C5 TreeDictionary is implemented using arrays, which is probably responsible for the wasted space.

What gives? Is there a point at which BSTs are better than dictionaries?

Dictionaries have some undesirable properties:

  • There may not be a large enough contiguous block of memory to hold your dictionary's internal array, even if its memory requirements are much less than the total available RAM.

  • Evaluating the hash function can take an arbitrarily long time. Take strings, for example: use Reflector to examine the System.String.GetHashCode method and you'll notice that hashing a string always takes O(n) time in the length of the string, which means it can take considerable time for very long strings. On the other hand, comparing two strings for inequality is almost always faster than hashing, since it may require looking at just the first few chars. It's entirely possible for tree inserts to be faster than dictionary inserts if hash code evaluation takes too long.

    • Int32's GetHashCode method is literally just "return this", so you'd be hard-pressed to find a case where a hashtable with int keys is slower than a tree dictionary.
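The hashing-versus-comparison point can be illustrated by counting how many characters each operation has to read. This is a rough sketch: the hash below is a made-up multiplicative hash, not the algorithm System.String.GetHashCode uses, and `StringCostDemo` is a hypothetical name.

```csharp
using System;

public static class StringCostDemo
{
    // Hypothetical hash: like any hash over the full string, it reads every character.
    public static int CharsReadByHash(string s, out int hash)
    {
        hash = 17;
        int read = 0;
        foreach (char c in s)
        {
            hash = unchecked(hash * 31 + c);
            read++;
        }
        return read;
    }

    // Ordinal comparison that stops at the first differing character.
    public static int CharsReadByCompare(string a, string b, out int result)
    {
        int n = Math.Min(a.Length, b.Length);
        int read = 0;
        for (int i = 0; i < n; i++)
        {
            read++;
            if (a[i] != b[i])
            {
                result = a[i].CompareTo(b[i]);
                return read;
            }
        }
        // One string is a prefix of the other (or they are equal).
        result = a.Length.CompareTo(b.Length);
        return read;
    }
}
```

For two 1000-character strings that differ in their first character, the hash reads all 1000 characters of a string while the comparison stops after one position. Strings that share a long common prefix narrow that gap.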

RB Trees have some desirable properties:

  • You can find/remove the Min and Max elements in O(log n) time, compared to O(n) time using a dictionary.

  • If a tree is implemented with linked nodes rather than an array, the tree is usually more space efficient than a dictionary.

  • Likewise, it's ridiculously easy to write immutable versions of trees which support insert/lookup/delete in O(log n) time. Dictionaries do not adapt well to immutability, since you need to copy the entire internal array for every operation (actually, I have seen some array-based implementations of immutable finger trees, a kind of general purpose dictionary data structure, but the implementation is very complex).

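  • You can traverse all the elements in a tree in sorted order in O(n) time with very little extra space (constant if nodes have parent pointers, O(log n) for a stack-based in-order walk of a balanced tree), whereas you'd need to dump a hash table into an array and sort it to get the same effect.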
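The ordering properties in the list above can be tried out with the tree-based collections that ship with .NET. This sketch uses `SortedDictionary` and `SortedSet` from System.Collections.Generic rather than the C5 `TreeDictionary` from the question, since I'd rather not guess at the C5 API; `OrderedAccessDemo` is a hypothetical name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class OrderedAccessDemo
{
    // Tree-backed: SortedDictionary enumerates its keys in ascending order directly.
    public static List<int> KeysInOrder(SortedDictionary<int, float> tree)
    {
        return new List<int>(tree.Keys);
    }

    // Hash-backed: enumeration order is not guaranteed, so copy the keys and sort them.
    public static List<int> KeysInOrder(Dictionary<int, float> table)
    {
        var keys = new List<int>(table.Keys);
        keys.Sort(); // extra O(n log n) step the tree does not need
        return keys;
    }

    // SortedSet exposes its smallest element through the Min property.
    public static int MinKey(SortedSet<int> keys)
    {
        return keys.Min;
    }

    // A Dictionary has no notion of order, so finding the minimum scans every key.
    public static int MinKey(Dictionary<int, float> table)
    {
        return table.Keys.Min(); // LINQ scan over all keys
    }
}
```

Both overloads of each method return the same answer; the difference is how much work the hash-based version has to do to get it.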

So, the choice of data structure really depends on what properties you need. If you just want an unordered bag and can guarantee that your hash function evaluates quickly, go with a .NET Dictionary. If you need an ordered bag or have a slow-running hash function, go with TreeDictionary.
