Freeze in C++ program using huge vector


Problem description

I have an issue with a C++ program. I think it's a memory problem. In my program I create some enormous std::vectors (I use reserve to allocate memory up front). With a vector size of 1,000,000 it's OK, but if I increase this number (to about ten million), my program freezes my PC and I can do nothing except wait for a crash (or for the end of the program, if I'm lucky). My vector contains a structure called Point which contains a vector of doubles.
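For reference, a minimal sketch of the kind of setup described above; the name Point and the two-coordinate layout are assumptions based on the description:

#include <cstddef>
#include <vector>

struct Point {
    std::vector<double> coords;   // every Point owns a separately heap-allocated coordinate array
};

int main() {
    std::vector<Point> points;
    points.reserve(10000000);                   // reserve only sizes the outer vector
    for (std::size_t i = 0; i < 10000000; ++i) {
        points.push_back(Point{{0.0, 0.0}});    // each push_back still allocates a small block for coords
    }
}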

I used valgrind to check whether memory was running out, but according to it there is no problem. Maybe using a vector of objects is not advisable? Or are there some system parameters to check? Or is the vector simply too big for the computer?

What do you think about this?

Recommended answer

Disclaimer

Note that this answer makes a few assumptions about your machine; the exact memory usage and error potential depend on your environment. And of course it is even easier to crash when you don't compute on 2d points but, e.g., on 4d points, which are common in computer graphics, or even larger points for other numeric purposes.

That is a lot of memory to allocate:

#include <iostream>
#include <vector>
struct Point {
    std::vector<double> coords;
};
int main () {
    std::cout << sizeof(Point) << std::endl;
}

This prints 12 on the 32-bit system assumed here (on a typical 64-bit system an empty vector is 24 bytes); that is the size in bytes of an empty Point, i.e. just the bookkeeping of the empty coords vector. If each Point holds two coordinates, add another 2*sizeof(double) = 16 bytes of heap storage per element, plus per-allocation overhead, i.e. you now have roughly 28 bytes or more per Point.

With tens of millions of elements, you are requesting hundreds of millions of bytes of data; e.g. for 20 million elements at roughly 28 bytes each, you request well over 500 million bytes. While this does not exceed the maximum index into a std::vector, it is possible that the OS does not have that much contiguous memory free for you.
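A quick back-of-the-envelope check you can run yourself (a sketch; the exact figures depend on your platform and standard library):

#include <cstddef>
#include <iostream>
#include <vector>

struct Point {
    std::vector<double> coords;
};

int main() {
    const std::size_t n = 20000000;             // 20 million points, as in the example above
    const std::size_t per_point =
        sizeof(Point) + 2 * sizeof(double);     // inline part + heap block for 2 coords (allocator overhead not counted)
    std::cout << "per point: " << per_point << " bytes\n";
    std::cout << "total    : " << (n * per_point) / (1024.0 * 1024.0) << " MiB at the very least\n";
}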

Also, the vector's memory needs to be copied quite often in order for it to grow. This happens, for example, on push_back: when you already hold a 400 MiB vector, the next push_back that triggers a reallocation keeps the old storage alive while a new block of 400 MiB times the growth factor is allocated and filled, so you may easily exceed 1000 MiB temporarily, et cetera.
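To watch this growth behaviour yourself (a small sketch; the growth factor is implementation-defined):

#include <cstddef>
#include <iostream>
#include <vector>

int main() {
    std::vector<double> v;
    std::size_t last_capacity = 0;
    for (int i = 0; i < 1000; ++i) {
        v.push_back(i);
        if (v.capacity() != last_capacity) {   // a reallocation (with a copy/move of all elements) just happened
            last_capacity = v.capacity();
            std::cout << "size " << v.size() << " -> capacity " << v.capacity() << '\n';
        }
    }

    std::vector<double> w;
    w.reserve(1000);   // reserving up front avoids all of the intermediate reallocations above
}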

Do you actually need to store all the data all the time? Can you use a similar algorithm which does not require so much storage? Can you refactor your code so that less is kept in memory? Can you move some data out of core when you know it will take some time until you need it again?
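For illustration only, and not necessarily applicable to your algorithm: if the points can be consumed one at a time, e.g. to accumulate a centroid, nothing needs to be stored at all (next_point is a hypothetical producer):

#include <cstddef>
#include <iostream>

struct Point { double x, y; };

// hypothetical producer: in a real program this would come from a file or a computation
Point next_point(std::size_t i) { return Point{double(i), i * 0.5}; }

int main() {
    const std::size_t n = 10000000;
    double sum_x = 0.0, sum_y = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        Point p = next_point(i);   // consume each point immediately...
        sum_x += p.x;
        sum_y += p.y;              // ...instead of keeping all of them in a vector
    }
    std::cout << "centroid: " << sum_x / n << ", " << sum_y / n << '\n';
}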

If you know the number of elements before creating your outer vector, use the std::vector constructor to which you can pass an initial size:

std::vector<Foo> foo(12); // construct with 12 elements up front
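Since the question mentions reserve: a short sketch of the difference between constructing with a size and reserving capacity (Foo is just a placeholder type):

#include <vector>

struct Foo { double x, y; };

int main() {
    std::vector<Foo> a(12);   // size() == 12, elements are value-initialized
    std::vector<Foo> b;
    b.reserve(12);            // size() == 0, but capacity() >= 12: later push_backs won't reallocate
}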

Of course you can optimize a lot for memory; e.g. if you know you only ever have 2d points, just store two doubles as members: roughly 28 bytes -> 16 bytes. When you do not really need the precision of double, use float: 16 bytes -> 8 bytes. That cuts the footprint to well under a third of the original:

// struct Point { std::vector<double> coords; };   <-- old
struct Point { float x, y; }; // <-- new

If this is still not enough, an ad hoc solution could be std::deque, or another non-contiguous container: no temporary memory "doubling", because no resizing copy is needed; also no need for the OS to find one large contiguous block of memory for you.
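A std::deque can often be dropped in where only push_back and indexed access are used (a sketch, assuming the float-based Point from above):

#include <deque>

struct Point { float x, y; };

int main() {
    std::deque<Point> points;                    // storage grows in fixed-size chunks and is never copied wholesale
    for (int i = 0; i < 10000000; ++i) {
        points.push_back(Point{float(i), float(i)});
    }
}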

You can also use compression mechanisms, indexed data, or fixed-point numbers. But that depends on your exact circumstances.

struct Point { signed char x, y; }; // <-- or even this? examine a proper type
struct Point { short x_index, y_index; };
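A sketch of the fixed-point idea, assuming the coordinates fit into a known range; the scale factor 256 is an arbitrary example:

#include <cstdint>

// store each coordinate as a 16-bit fixed-point number: value = raw / 256.0
// (representable range is roughly [-128, 128) with a resolution of 1/256)
struct Point {
    std::int16_t x, y;
};

inline std::int16_t to_fixed(float v)        { return static_cast<std::int16_t>(v * 256.0f); }
inline float        to_float(std::int16_t v) { return v / 256.0f; }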

