计算无序映射占用的内存空间 [英] Calculating memory space occupied by unordered maps

查看:118
本文介绍了计算无序映射占用的内存空间的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有两个无序的地图:(代码在linux中执行)



第一个无序映射



它包含至少65536个条目。每个条目包含

  int 
unsigned char
unsigned char

第二个无序图:



比65536例。每个条目由

  int 
int
int
向量< char>

现在我想根据上面两个无序映射占用的内存进行比较(以字节为单位)。之后,我想计算内存压缩。
请指导我如何找到两个无序映射占用的内存?



第二个无序映射的更多细节:

  typedef std :: tuple< int,int& key_t; 

struct KeyHasher
{
std :: size_t operator()(const key_t& k)const
{
using boost :: hash_value;
using boost :: hash_combine;

//以哈希值0开始。
std :: size_t seed = 0;

//通过在
中进行异或和位移来修改'seed'//另一个之后是Key的一个成员:
hash_combine(seed,hash_value(std :: get(0)(k)));
hash_combine(seed,hash_value(std :: get< 1(k))));

//返回结果。
return seed;
}
};

struct Ndata
{
int value;
矢量< char>接受
};

typedef boost :: unordered_map< const key_t,Ndata,KeyHasher>二级图;
}


解决方案

可以准确地回答你的问题,而不需要查看STL使用的精确 unordered_map 实现。



unordered_map 界面,您可以做出有教育意义的猜测:



unordered_map 需要存储:




  • 容器(可能是向量样结构)


  • max_bucket_count 桶(可能是单链接列表式结构)





快速查看Libc ++实现后,还需要存储空间:




  • 散列函数对象



  • b $ b



考虑到这一点,我的guesstimate会是这样:

  typedef unordered_map< K,V,...> tMyMap; 

size_t getMemoryUsage(const tMyMap& map){
auto entrySize = sizeof(K)+ sizeof(V)+ sizeof(void *);
auto bucketSize = sizeof(void *);
auto adminSize = 3 * sizeof(void *)+ sizeof(size_t);

auto totalSize = adminSize + map.size()* entrySize + map.max_bucket_count()* bucketSize();
return totalSize;
}

这只适用于第一种情况,因为在第二种情况下,每个条目可以具有基于每个向量有多大的完全不同的存储器使用。对于第二种情况,您必须添加如下:

  size_t getMemoryUsage(const tMyMap& map){
auto entrySize = sizeof(K)+ sizeof(V)+ sizeof(void *);
auto bucketSize = sizeof(void *);
auto adminSize = 3 * sizeof(void *)+ sizeof(size_t);
auto totalSize = adminSize + map.size()* entrySize + map.max_bucket_count()* bucketSize();

auto contentSize = 0;
for(const auto& kv:map){
//因为accept是一个向量< char>,
//它使用额外内存的capacity()字节
contentSize + = kv.second.accept.capacity();
}
totalSize + = contentSize;

return totalSize;但是,考虑到真实世界的分配逻辑,你的地图实际使用的内存可能会被记录在内存中。与此有很大不同外部碎片。如果你想100%确保unordered_map使用多少内存,你还需要考虑分配器行为。


I have two unordered maps: (code is executed in linux)

First unordered map:

It consists of more atleast 65536 entries. Each entry consists of

int
unsigned char
unsigned char

Second unordered map:

It consists of less than 65536 enteries. Each entry consist of

int
int
int
vector <char>

Now I want to make comparison of the two on the basis of memory occupied by the above two unordered maps (in bytes). After that I want to calculate memory compression achieved. Please guide me how can I find the memory occupied by the two unordered maps?

More details of the second unordered map:

typedef std::tuple<int, int> key_t;

struct KeyHasher
{
  std::size_t operator()(const key_t& k) const
  {
      using boost::hash_value;
      using boost::hash_combine;

      // Start with a hash value of 0    .
      std::size_t seed = 0;

      // Modify 'seed' by XORing and bit-shifting in
      // one member of 'Key' after the other:
      hash_combine(seed,hash_value(std::get<0>(k)));
      hash_combine(seed,hash_value(std::get<1>(k)));

      // Return the result.
      return seed;
  }
};

struct Ndata
{
int value;
vector<char> accept ;
};

typedef boost::unordered_map<const key_t,Ndata,KeyHasher> SecondMap;
}

解决方案

I don't think it's possible to answer your question precisely without looking at the precise unordered_map implementation your STL uses.

However, based on the unordered_map interface, you could make decent educated guesses:

An unordered_map needs to store:

  • one bucket container (probably a vector-like structure)

  • max_bucket_count buckets (probably singly-linked-list-like structures)

  • one complete entry for each item (not only the value, but also the key, to handle key hash collisions)

After a quick look at the Libc++ implementation, you also need space to store:

  • The hashing function object

  • The equality test function object

  • The allocator function object

With this in mind, my guesstimate would be something like:

typedef unordered_map<K, V, ...> tMyMap;

size_t getMemoryUsage(const tMyMap& map) {
  auto entrySize = sizeof(K) + sizeof(V) + sizeof(void*);
  auto bucketSize = sizeof(void*);
  auto adminSize = 3 * sizeof(void*) + sizeof(size_t);

  auto totalSize = adminSize + map.size() * entrySize + map.max_bucket_count() * bucketSize();
  return totalSize;
}

This would only work for the first of your cases, since in the second cases, each entry can have a completely different memory usage based on how big each vector is. For the second case, you would therefore have to add something like this:

size_t getMemoryUsage(const tMyMap& map) {
  auto entrySize = sizeof(K) + sizeof(V) + sizeof(void*);
  auto bucketSize = sizeof(void*);
  auto adminSize = 3 * sizeof(void*) + sizeof(size_t);
  auto totalSize = adminSize + map.size() * entrySize + map.max_bucket_count() * bucketSize();

  auto contentSize = 0;
  for (const auto& kv : map) {
    // since accept is a vector<char>, 
    // it uses capacity() bytes of additional memory
    contentSize += kv.second.accept.capacity();
  }
  totalSize += contentSize;

  return totalSize;
}

However, considering real-world allocation logic, the memory your map actually uses may differ a lot from this due to e.g. external fragmentation. If you want to be 100% sure how much memory an unordered_map uses, you need to also take allocator behavior into account.

这篇关于计算无序映射占用的内存空间的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆