Allocating a large memory block in C++


Question

I am trying to allocate a large memory block for a 3D matrix of floating-point values in C++. Its dimensions are 44100x22000x2. This should take exactly 44100x22000x2x4 bytes of memory, which is about 7.7 GB. I am compiling my code with g++ on a 64-bit x86 machine running Ubuntu. When I watch the process in htop, its memory usage grows to 32 GB and the process is promptly killed. Did I make a mistake in my memory calculation?

Here is my code:

#include <iostream>

using namespace std;
int main(int argc, char* argv[]) {
  int N = 22000;
  int M = 44100;
  float*** a = new float**[N];
  for (int m = 0; m<N; m+=1) {
    cout<<((float)m/(float)N)<<endl;
    a[m] = new float*[M - 1];
    for (int n = 0; n<M - 1; n+=1) {
      a[m][n] = new float[2];
    }
  }
}

My calculation was incorrect, and I was allocating closer to 38 GB. I have now fixed the code so that it allocates 15 GB.

#include <iostream>

using namespace std;
int main(int argc, char* argv[]) {
  unsigned long  N = 22000;
  unsigned long  M = 44100;
  unsigned long blk_dim = N*(M-1)*2;
  float* blk = new float[blk_dim];
  unsigned long b = (unsigned long) blk;

  float*** a = new float**[N];
  for (int m = 0; m<N; m+=1) {
    unsigned long offset1 = m*(M - 1)*2*sizeof(float);
    a[m] = new float*[M - 1];
    for (int n = 0; n<M - 1; n+=1) {
      unsigned long offset2 = n*2*sizeof(float);
      a[m][n] = (float*)(offset1 + offset2 + b);
    }
  }
}

Recommended answer

You forgot one dimension, and you forgot the overhead of allocating memory. The code shown allocates memory very inefficiently in the third dimension, resulting in far too much overhead.

float*** a = new float**[N];

This allocates roughly 22000 * sizeof(float **) bytes, which is about 176 KB. Negligible.

a[m] = new float*[M - 1];

A single allocation here is 44099 * sizeof(float *) bytes, but you make 22000 of them. That is 22000 * 44099 * sizeof(float *), or roughly 7.7 GB of additional memory. This is where you stopped counting, but your code isn't done yet. It has a long way to go.

a[m][n] = new float[2];

This is a single allocation of 8 bytes, but it is performed 22000 * 44099 times. That's another 7.7 GB flushed down the drain. You are now at roughly 15 GB of application-requested memory that has to be allocated.

But allocations are not free, and new float[2] consumes more than 8 bytes. Each individually allocated block must be tracked internally by your C++ library so that it can be recycled by delete. The most simplistic linked-list-based implementation of heap allocation requires a forward pointer, a backward pointer, and a count of how many bytes the allocated block holds. Assuming nothing needs to be padded for alignment, that is at least 24 bytes of overhead per allocation on a 64-bit platform.

Your third dimension makes 22000 * 44099 allocations, the second dimension makes 22000, and the first dimension makes one. If I count on my fingers, this requires (22000 * 44099 + 22000 + 1) * 24 bytes, roughly another 23 GB (about 21.7 GiB) of memory, just for the bookkeeping overhead of the simplest possible allocation scheme.

If I did my math right, we are now up to about 38 GB of RAM using the simplest possible heap allocation tracking. Your C++ implementation probably uses somewhat more sophisticated heap allocation logic, which may carry even more overhead.

Get rid of the new float[2]. Compute your matrix's total size, allocate it with new as a single 7.7 GB chunk, and then calculate where the rest of your pointers should point. Likewise, allocate a single chunk of memory for the second dimension of your matrix, and compute the first dimension's pointers into it.

Your allocation code should execute exactly three new statements: one for the first-dimension pointers, one for the second-dimension pointers, and one for the huge chunk of data that makes up your third dimension.
