为什么std :: unordered_map速度慢,我可以更有效地使用它来缓解这种情况吗? [英] Why is std::unordered_map slow, and can I use it more effectively to alleviate that?

查看:584
本文介绍了为什么std :: unordered_map速度慢,我可以更有效地使用它来缓解这种情况吗?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我最近发现了一件奇怪的事情。似乎使用完全不缓存计算的Collat​​z序列长度是 faster 使用 std :: unordered_map 缓存所有元素

I’ve recently found out an odd thing. It seems that calculating Collatz sequence lengths with no caching at all is over 2 times faster than using std::unordered_map to cache all elements.

请注意,我确实从问题 gcc std :: unordered_map的执行速度是否较慢?如果是这样-为什么?,然后我尝试使用该知识使 std :: unordered_map 的性能达到最佳(我使用了g ++ 4.6,比最新版本的g ++更好,并且我尝试指定了一个合理的初始存储桶计数,我使其完全等于映射必须容纳的最大元素数。)

Note I did take hints from question Is gcc std::unordered_map implementation slow? If so - why? and I tried to used that knowledge to make std::unordered_map perform as well as I could (I used g++ 4.6, it did perform better than recent versions of g++, and I tried to specify a sound initial bucket count, I made it exactly equal to the maximum number of elements the map must hold).

相比之下,使用 std :: vector 缓存一些elements 比根本没有缓存快17倍,比使用 std :: unordered_map 快40倍。

In comparision, using std::vector to cache a few elements was almost 17 times faster than no caching at all and almost 40 times faster than using std::unordered_map.

我做错什么了吗,或者这个容器太慢了,为什么?可以使其表现更快吗?还是散列图本质上是无效的,应该在高性能代码中尽可能避免使用?

Am I doing something wrong or is this container THAT slow and why? Can it be made performing faster? Or maybe hashmaps are inherently ineffective and should be avoided whenever possible in high-performance code?

有问题的基准是:

#include <iostream>
#include <unordered_map>
#include <cstdint>
#include <ctime>

std::uint_fast16_t getCollatzLength(std::uint_fast64_t val) {
    static std::unordered_map <std::uint_fast64_t, std::uint_fast16_t> cache ({{1,1}}, 2168611);

    if(cache.count(val) == 0) {
        if(val%2 == 0)
            cache[val] = getCollatzLength(val/2) + 1;
        else
            cache[val] = getCollatzLength(3*val+1) + 1;
    }

    return cache[val];
}

int main()
{
    std::clock_t tStart = std::clock();

    std::uint_fast16_t largest = 0;
    for(int i = 1; i <= 999999; ++i) {
        auto cmax = getCollatzLength(i);
        if(cmax > largest)
            largest = cmax;
    }
    std::cout << largest << '\n';

    std::cout << "Time taken: " << (double)(std::clock() - tStart)/CLOCKS_PER_SEC << '\n';
}

输出:花费时间:0.761717

完全没有缓存的基准:

#include <iostream>
#include <unordered_map>
#include <cstdint>
#include <ctime>

std::uint_fast16_t getCollatzLength(std::uint_fast64_t val) {
    std::uint_fast16_t length = 1;
    while(val != 1) {
        if(val%2 == 0)
            val /= 2;
        else
            val = 3*val + 1;
        ++length;
    }
    return length;
}

int main()
{
    std::clock_t tStart = std::clock();

    std::uint_fast16_t largest = 0;
    for(int i = 1; i <= 999999; ++i) {
        auto cmax = getCollatzLength(i);
        if(cmax > largest)
            largest = cmax;
    }
    std::cout << largest << '\n';

    std::cout << "Time taken: " << (double)(std::clock() - tStart)/CLOCKS_PER_SEC << '\n';
}

输出花费的时间:0.324586

推荐答案

事实上,标准库的映射速度很慢( std :: map ,但 std :: unoredered_map 及其链接列表存储区也是如此)。 Google的钱德勒·卡鲁斯(Chandler Carruth)在他的 CppCon 2014演讲中对此进行了解释;简而言之: std :: unordered_map 是缓存不友好的,因为它使用链接列表作为存储桶。

The standard library's maps are, indeed, inherently slow (std::map especially but std::unoredered_map as well, with its linked-list bucket). Google's Chandler Carruth explains this in his CppCon 2014 talk; in a nutshell: std::unordered_map is cache-unfriendly because it uses linked lists as buckets.

这个SO问题提到了一些有效的哈希映射实现,请改用其中一种。

This SO question mentioned some efficient hash map implementations - use one of those instead.

这篇关于为什么std :: unordered_map速度慢,我可以更有效地使用它来缓解这种情况吗?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆