Unordered_set questions

Question

Could anyone explain how an unordered set works? I am also not sure how a set works. My main question is what is the efficiency of its find function.

For example, what is the total big O run time of this?

    #include <cstddef>
    #include <iostream>
    #include <unordered_set>
    #include <vector>

    using namespace std;

    int main()
    {
        vector<int> theFirst;
        vector<int> theSecond;
        vector<int> theMatch;

        theFirst.push_back(-2147483648);
        theFirst.push_back(2);
        theFirst.push_back(44);

        theSecond.push_back(2);
        theSecond.push_back(-2147483648);
        theSecond.push_back(33);

        // 1) Place the contents of theFirst (m elements) into an unordered set: O(m).
        // 2) Look up each element of theSecond (n elements): O(n), so O(m + n) so far.
        // 3) Add the t matches to a third structure: O(t).
        // 4) All together it becomes O(m + n + t).
        unordered_set<int> theUnorderedSet(theFirst.begin(), theFirst.end());

        // size_t matches the unsigned type returned by size().
        for (size_t i = 0; i < theSecond.size(); i++)
        {
            if (theUnorderedSet.find(theSecond[i]) != theUnorderedSet.end())
            {
                theMatch.push_back(theSecond[i]);
                cout << theSecond[i] << '\n';
            }
        }
    }


Recommended answer

unordered_set and all the other unordered_ data structures use hashing, as mentioned by @Sean. Hashing gives amortized constant time for insertion and close to constant time for lookup. A hash function essentially takes some information and produces a number from it. It is a function in the sense that the same input has to produce the same output. However, different inputs can produce the same output, which is called a collision. Lookup would be guaranteed to be constant time with a "perfect hash function", that is, one with no collisions.

In practice, the input to the hash function is the element you store in the structure (its value, if it is a primitive type), and the resulting number is mapped to a location in the data structure. Hence, for a given key, the function takes you to the place where the element is stored without any traversal or search (ignoring collisions here for simplicity), which is why lookup takes constant time. There are different implementations of these structures (open addressing, chaining, etc.). See hash table and hash function (http://en.wikipedia.org/wiki/Hash_function). I also recommend section 3.7 of The Algorithm Design Manual by Skiena.

Now, concerning big-O complexity, you are right that you have O(m) + O(n) + O(size of overlap). Since the overlap cannot be bigger than the smaller of m and n, the overall complexity can be expressed as O(kN) for some constant k, where N is the larger of m and n. So, O(N). Again, this assumes good hashing with few collisions. With many collisions a single lookup can degrade to linear in the number of stored elements, so the bound above is the expected case rather than a guarantee.
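To make the "hash, then jump to a location" idea concrete, here is a minimal sketch of a chained hash set for int keys. The class name `ToyHashSet` and its members are hypothetical, made up for this illustration; this is not how any particular standard library implements `unordered_set`, and it never grows its bucket array, so it leaves out the rehashing that the amortized-constant insertion claim relies on.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <list>
#include <vector>

// Toy chained hash set for int keys (hypothetical, for illustration only).
class ToyHashSet {
public:
    explicit ToyHashSet(std::size_t bucket_count) : buckets_(bucket_count) {
        assert(bucket_count > 0);  // at least one bucket is required
    }

    // Hash the key to a number, then reduce it to a bucket index.
    std::size_t bucket_of(int key) const {
        return std::hash<int>{}(key) % buckets_.size();
    }

    // Insert the key unless it is already stored. Returns true if inserted.
    bool insert(int key) {
        std::list<int>& chain = buckets_[bucket_of(key)];
        for (int stored : chain) {
            if (stored == key) return false;
        }
        chain.push_back(key);
        return true;
    }

    // Lookup scans only the one bucket the key hashes to. With few
    // collisions that chain is short; if every key lands in the same
    // bucket, the scan is linear in the number of stored elements.
    bool contains(int key) const {
        const std::list<int>& chain = buckets_[bucket_of(key)];
        for (int stored : chain) {
            if (stored == key) return true;
        }
        return false;
    }

    // Number of elements sharing a bucket with the given key.
    std::size_t chain_length(int key) const {
        return buckets_[bucket_of(key)].size();
    }

private:
    std::vector<std::list<int>> buckets_;
};
```

Constructing `ToyHashSet` with a single bucket forces every key to collide, which is an easy way to see the degraded, linear-time lookup described above: `chain_length` then equals the total number of stored keys.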

set and multiset, on the other hand, are typically implemented as balanced binary search trees, so insertions and look-ups are O(log N). The actual performance of a hashed structure vs. a tree-based one will depend on N, so it is best to try the two approaches and profile them in a realistic running scenario.
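As a starting point for that comparison, the two approaches can be written side by side. This is only a sketch: the function names `match_with_unordered_set`, `match_with_set` and `check_agreement` are made up for illustration, and the complexities in the comments are the usual expected/guaranteed bounds, not measurements.

```cpp
#include <cassert>
#include <set>
#include <unordered_set>
#include <vector>

// Hash-based: O(m) expected to build, O(1) expected per lookup,
// so O(m + n) expected overall for m = first.size(), n = second.size().
std::vector<int> match_with_unordered_set(const std::vector<int>& first,
                                          const std::vector<int>& second) {
    std::unordered_set<int> seen(first.begin(), first.end());
    std::vector<int> matches;
    for (int value : second) {
        if (seen.find(value) != seen.end()) matches.push_back(value);
    }
    return matches;
}

// Tree-based: O(m log m) to build, O(log m) per lookup,
// so O((m + n) log m) overall.
std::vector<int> match_with_set(const std::vector<int>& first,
                                const std::vector<int>& second) {
    std::set<int> seen(first.begin(), first.end());
    std::vector<int> matches;
    for (int value : second) {
        if (seen.find(value) != seen.end()) matches.push_back(value);
    }
    return matches;
}

// Sanity check: both containers must report the same matches, in the
// order the values appear in `second`.
void check_agreement(const std::vector<int>& first,
                     const std::vector<int>& second) {
    assert(match_with_unordered_set(first, second) ==
           match_with_set(first, second));
}
```

Both functions return the same result, so timing each one on realistically sized inputs shows which container wins for a given N.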
