Difference in performance between map and unordered_map in C++


Question



I have a simple requirement: I need a map (declared as map<int,int> in the code below). However, I need the fastest theoretically possible retrieval time.

I tried both map and the newly proposed unordered_map from TR1, and found that, at least while parsing a file and building the map by inserting one element at a time, map took only 2 minutes while unordered_map took 5 minutes.

As this is going to be part of code executed on a Hadoop cluster and will contain ~100 million entries, I need the smallest possible retrieval time.

Another helpful piece of information: the data (keys) currently being inserted are integers in the range 1, 2, ... up to ~10 million.

I can also require the user to specify a max value and to insert in the order above. Will that significantly affect my implementation? (I heard map is based on red-black trees and that inserting in increasing order leads to better (or worse?) performance.)

Here is the code:

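#include <cstdlib>   // atoi
#include <fstream>
#include <map>
#include <string>
using namespace std;

int main()
{
    map<int,int> Label;  // this is being changed to unordered_map
    ifstream LabelFile("Labels.txt");
    string inputLine, curnode, nodelabel;
    bool failed = false;

    // Creating the map from Labels.txt
    if (LabelFile.is_open())
    {
        // Test getline itself; looping on !eof() processes one bogus line at the end
        while (getline(LabelFile, inputLine))
        {
            // Each line is "<node>\t<label>"; a line without a tab is malformed
            string::size_type tab = inputLine.find('\t');
            if (tab == string::npos)
            {
                failed = true;
                break;
            }
            curnode = inputLine.substr(0, tab);
            nodelabel = inputLine.substr(tab + 1);
            Label[atoi(curnode.c_str())] = atoi(nodelabel.c_str());
        }
        LabelFile.close();
    }
    return failed ? 1 : 0;
}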

Tentative solution: after reviewing the comments and answers, I believe a dynamic C++ array would be the best option, since the implementation will use dense keys. Thanks.
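A minimal sketch of what that dense-array approach could look like, assuming the keys are integers in a known range [0, maxKey] and that -1 never occurs as a real label. The names DenseLabels and kMissing are hypothetical and do not come from the original code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed sentinel for "no label stored"; pick a value that is never a real label.
const int kMissing = -1;

// Hypothetical helper: labels stored by key in one contiguous array.
// Assumes keys are dense integers in [0, maxKey].
class DenseLabels {
public:
    explicit DenseLabels(std::size_t maxKey) : labels_(maxKey + 1, kMissing) {}

    // O(1): a plain indexed store, with no hashing and no tree rebalancing.
    void set(std::size_t key, int label) {
        assert(key < labels_.size());
        labels_[key] = label;
    }

    // O(1) lookup; returns kMissing for keys never set or out of range.
    int get(std::size_t key) const {
        if (key >= labels_.size()) {
            return kMissing;
        }
        return labels_[key];
    }

private:
    std::vector<int> labels_;
};
```

The trade-off is memory: the array holds one slot per possible key, so this only pays off while the keys stay dense.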

Answer

Insertion into unordered_map should be O(1) on average, and retrieval should be roughly O(1) as well (it's essentially a hash table).

So either your timings are way off, or there is something wrong with your implementation or usage of unordered_map.
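One way to check the timings is to measure the container inserts on their own, separately from file reading and string parsing. The following is a rough sketch under that assumption; timeSequentialInserts is a hypothetical helper, and the stored values are arbitrary test data.

```cpp
#include <cassert>
#include <chrono>
#include <map>
#include <unordered_map>

// Hypothetical helper: inserts keys 1..n (value = key * 2, arbitrary test data)
// into any map-like container and returns the elapsed wall-clock seconds.
template <class Map>
double timeSequentialInserts(Map& m, int n) {
    assert(n >= 0);
    const auto start = std::chrono::steady_clock::now();
    for (int key = 1; key <= n; ++key) {
        m[key] = key * 2;
    }
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count();
}
```

Calling it once with a std::map<int,int> and once with a std::unordered_map<int,int> for the same n, in an optimized build, gives two numbers that are not mixed up with I/O cost.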

You need to provide some more information, and possibly show how you are using the container.

The complexities for insertion/retrieval are given in section 6.3 of n1836.

One issue to consider is that your implementation may be continually rehashing the structure, since you say you have 100 million+ items. If you have a rough idea of how many unique elements will be inserted, you can pass that number to the constructor when instantiating the container, and it will be created with a bucket table of appropriate size.
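The pre-sizing described above can be sketched as follows, assuming the C++11 std::unordered_map rather than the TR1 version used in the question (reserve() is a C++11 addition). buildPresized is a hypothetical name, and the stored values are placeholders.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>

// Hypothetical helper: builds a table for keys 1..expectedCount with the
// bucket array sized once, up front.
std::unordered_map<int, int> buildPresized(std::size_t expectedCount) {
    // The constructor argument is a minimum number of buckets.
    std::unordered_map<int, int> labels(expectedCount);
    // reserve() also accounts for max_load_factor(), so inserting up to
    // expectedCount elements does not trigger a rehash.
    labels.reserve(expectedCount);

    const std::size_t bucketsBefore = labels.bucket_count();
    for (std::size_t key = 1; key <= expectedCount; ++key) {
        labels[static_cast<int>(key)] = 0;  // placeholder value
    }
    // An unchanged bucket count means no rehash happened during the inserts.
    assert(labels.bucket_count() == bucketsBefore);
    return labels;
}
```

If the estimate is too low, the container still works; it simply falls back to rehashing once the load factor limit is exceeded.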
