在C ++ std :: unordered_map中预分配存储桶 [英] Pre-allocating buckets in a C++ std::unordered_map

查看:183
本文介绍了在C ++ std :: unordered_map中预分配存储桶的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在使用gnu ++ 0x中的 std :: unordered_map 来存储大量数据.我想为大量元素预分配空间,因为我可以限制使用的总空间.

I'm using the std::unordered_map from gnu++0x to store a huge amount of data. I want to pre-allocate space for the large number of elements, since I can bound the total space used.

我想做的是打电话:

std::unordered_map m;
m.resize(pow(2,x));

其中x是已知的.

std :: unordered_map 不支持此功能.如果可能的话,我宁愿使用 std :: unordered_map ,因为它最终将成为标准的一部分.

std::unordered_map doesn't support this. I would rather use std::unordered_map if possible, since it will eventually be part of the standard.

其他一些限制:

需要可靠的O(1)访问和地图突变.所需的哈希和比较功能已经是非标准的,并且有些昂贵.O(log n)突变(与 std :: map 一样)太昂贵了.

Need reliable O(1) access and mutation of the map. The desired hash and comparison functions are already non-standard and somewhat expensive. O(log n) mutation (as with std::map) is too expensive.

->昂贵的哈希和比较也使基于摊销的增长方式变得过于昂贵.每个额外的插入操作都需要从这些函数进行O(n)运算,这会导致算法的运行时间增加一个二次项,因为指数存储需求需要O(n)增长.

-> The expensive hash and comparison also make amortization-based growth way too expensive. Each extra insert requires O(n) operations from those functions, which results in an extra quadratic term in the algorithm's run time, since the exponential storage requirements need O(n) growths.

推荐答案

m.rehash(pow(2,x));

如果 pow(2,x)是要预分配的存储桶数.您还可以:

if pow(2, x) is the number of buckets you want preallocated. You can also:

m.reserve(pow(2,x));

但现在 pow(2,x)是您计划插入的元素数.这两个函数除了预分配存储桶外什么都不做.他们不插入任何元素.而且它们都打算完全用于您的用例.

but now pow(2, x) is the number of elements you are planning on inserting. Both functions do nothing but preallocate buckets. They don't insert any elements. And they are both meant to be used exactly for your use case.

注意:您不能保证获得确切的 pow(2,x)存储桶.一些实现将仅使用数量为2的幂的存储桶,其他实现将仅使用素数的存储桶.还有一些将仅使用质数的子集作为存储桶的数量.但是无论如何,实现应在所需的存储桶数量处接受您的 hint ,然后在内部舍入到下一个可接受的存储桶数量.

Note: You aren't guaranteed to get exactly pow(2, x) buckets. Some implementations will use only a number of buckets which is a power of 2. Other implementations will use only a prime number of buckets. Still others will use only a subset of primes for the number of buckets. But in any case, the implementation should accept your hint at the number of buckets you desire, and then internally round up to its next acceptable number of buckets.

以下是最新的(N4660)用于指定 rehash 的参数的精确措辞:

Here is the precise wording that the latest (N4660) uses to specify the argument to rehash:

a.rehash(n):后置条件: a.bucket_count()> = a.size()/a.max_load_factor()和一个.bucket_count()> = n .

此后置条件确保 bucket()_ count()> = n ,并且 load_factor()保持小于或等于 max_load_factor().

This postcondition ensures that bucket()_count() >= n, and that load_factor() remains less than or equal to max_load_factor().

随后, reserve(n)是根据 rehash(n)定义的:

a.reserve(n):与 a.rehash(ceil(n/a.max_load_factor()))相同.

这篇关于在C ++ std :: unordered_map中预分配存储桶的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆