什么是C ++组和unordered_set之间的区别? [英] what is the difference between set and unordered_set in C++?

查看:1125
本文介绍了什么是C ++组和unordered_set之间的区别?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

碰到这个问题很好,这是类似的,但并不相同的,因为它谈论Java,它有不同的实现哈希表,通过具有同步存取器/变化器的 <一href="http://stackoverflow.com/questions/40471/differences-between-hashmap-and-hashtable">Differences HashMap和Hashtable的?

Came across this good question, which is similar but not at all same since it talks about Java, which has different implementation of hash-tables, by virtue of having synchronized accessor /mutators Differences between HashMap and Hashtable?

那么,什么是用C ++实现集和unordered_set的区别? 这个问题可以ofcourse扩展映射VS unordered_map等其他C ++的容器。

So what is the difference in C++ implementation of set and unordered_set ? This question can be ofcourse extended to map vs unordered_map and so on for other C++ containers.

下面是我的初步评估

设置:在标准犯规明确要求它实现为树,时间复杂度的限制要求其业务的查找/插入,意味着它会永远被实现为树。 一般作为铷树(如在GCC 4.8所示),这是高度平衡。 由于它们的高度平衡,他们有predictable的时间复杂度为查找()

set : While standard doesnt explicitly asks it to be implemented as trees, the time-complexity constraint asked for its operations for find/insert, means it will always be implemented as tree. Usually as RB tree (as seen in GCC 4.8), which is height-balanced. Since they are height balanced, they have predictable time-complexity for find()

优点:紧凑型(相比比较其他DS)

Pros : Compact (compared to other DS in comparison)

缺点:访问时间复杂度为O(LG N)

Con : Access time complexity is O(lg n)

unordered_set :在标准犯规明确要求它实现为树,时间复杂度的限制要求其业务的查找/插入,意味着它会永远被实现为哈希表

unordered_set : While standard doesnt explicitly asks it to be implemented as trees, the time-complexity constraint asked for its operations for find/insert, means it will always be implemented as hash-table.

优点:

  1. 更快(承诺摊销O(1)搜索)
  2. 易于相比,树DS
  3. 转换基本原语线程安全的,
  1. Faster (promises amortized O(1) for search)
  2. Easy to convert basic primitives to thread-safe, as compared to tree-DS

缺点:

  1. 查找,但不保证O(1)Therotical最坏情况下为O(n)
  2. 不一样紧凑树。 (出于实用目的客座率从来都不是1)

请注意: 在O(1),对哈希表来自于假设不存在任何冲突。即使.5负荷因子,每一秒变量插入导致冲突。 它可以观察到哈希表的负载因数成反比所需在它访问元件操作的​​次数。更多地降低#operations,稀疏的哈希表。当所存储的元是大小相当于指针,则开销是相当显著

Note : The O(1), for hashtable comes from the assumption that there are no collision. Even with load-factor of .5, every second variable insertion is leading to collision. It could be observed that the load-factor of hash-table is inversely proportional to the number of operations required for accessing a element in it. More we reduce #operations, sparser hash-table. When the element stored are of size comparable to pointer, then overhead is quite significant.

编辑:由于大部分都在说的问题包含足够的答案的话,我改变的问题  我错过图/性能分析的人应该知道的设置有什么区别?

Edit : Since most are saying question contains sufficient answer in it, I am changing the question to "Did I miss any difference between map/set for performance analysis that one should know ??"

推荐答案

我觉得你通常回答了你自己的问题,但是,这样的:

I think you've generally answered your own question, however, this:

不一样紧凑树。 (出于实用目的客座率从来都不是1)

Not as compact as tree. (for practical purposes load factors is never 1)

未必是真实的。树中的每个节点(我们假设它是一个红黑树)的一种类型 T 利用空间,至少等于 2 * pointer_size +的sizeof(T)+的sizeof(布尔)。这可能是 3 *指针大小根据树是否包含指针每个树节点。

is not necessarily true. Each node of a tree (we'll assume it's a red-black tree) for a type T utilizes space that is equal to at least 2 * pointer_size + sizeof(T) + sizeof(bool). This may be 3 * pointer size depending on whether the tree contains a parent pointer for each tree node.

与此相比,散列的图:对于每个散列映射有将被浪费阵列空间由于这一事实,即负载因子其中; 1 如你所说。但是,假定哈希地图使用单链表的链接(真的,有没有真正的理由不),插入的每个元素只取的sizeof(T)+指针大小

Compare this to a hash-map: there will be wasted array space for each hash map due to the fact that load factor < 1 as you've said. However, assuming the hash map uses singly linked lists for chaining (and really, there's no real reason not to), each element inserted take only sizeof(T) + pointer size.

请注意,这种分析忽略任何开销,这可能来自使用对准额外的空间。

Note that this analysis ignores any overhead which may come from extra space used by alignment.

有关的任​​何元素 T 它具有体积小(所以,任何基本型),指针的大小和其他开销占主导地位。在&GT的负载系数; 0.5 (例如)的std :: unordered_set 可能确实占用更少的内存比同等的std ::设为

For any element T which has a small size (so, any basic type), the size of the pointers and other overhead dominates. At a load factor of > 0.5 (for example) the std::unordered_set may indeed use up less memory than the equivalent std::set.

另外一大缺失的一点是,通过迭代一个的std ::设为保证生产从小到大排序,根据给定的比较功能,而通过迭代的std :: unordered_set 将在随机顺序返回值。

The other big missing point is the fact that iterating through a std::set is guaranteed to produce an ordering from smallest to largest, based on the given comparison function, while iterating through an std::unordered_set will return the values in a "random" order.

这篇关于什么是C ++组和unordered_set之间的区别?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆