Concatenate two big NumPy 2D arrays

Question

I have two big NumPy 2D arrays. X1 has shape (1877055, 1299) and X2 has shape (1877055, 1445). I then use

X = np.hstack((X1, X2))

to concatenate the two arrays into one bigger array. However, the program doesn't finish and exits with code -9. It doesn't show any error message.

What is the problem? How can I concatenate two such big NumPy 2D arrays?

Recommended answer

Unless there's something wrong with your NumPy build or your OS (both of which are unlikely), this is almost certainly a memory error.

For example, let's say all these values are float64. Then you've already allocated at least 18GB and 20GB for the two arrays, and now you're trying to allocate another 38GB for the concatenated array. But say you only have 64GB of RAM plus 2GB of swap; then there isn't enough room for another 38GB. On some platforms, this allocation will simply fail, and NumPy will hopefully catch that and raise a MemoryError. On other platforms, the allocation may succeed, but as soon as you try to actually touch all of that memory the process is killed or segfaults (see overcommit handling in Linux for an example; there the out-of-memory killer sends SIGKILL, which shows up as exit code -9). On still other platforms, the system will try to auto-expand swap, but if you're out of disk space it'll segfault.
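The estimate above can be checked with a few lines of arithmetic. This is only a sketch that assumes float64, as the answer does; the helper name array_nbytes is made up for illustration and nothing here allocates the arrays themselves.

```python
import math
import numpy as np

def array_nbytes(shape, dtype=np.float64):
    """Bytes needed for a dense array of the given shape and dtype."""
    return math.prod(shape) * np.dtype(dtype).itemsize

rows = 1877055
x1_bytes = array_nbytes((rows, 1299))
x2_bytes = array_nbytes((rows, 1445))
x_bytes = array_nbytes((rows, 1299 + 1445))

GIB = 2**30
for name, n in [("X1", x1_bytes), ("X2", x2_bytes), ("X", x_bytes)]:
    print(f"{name}: {n / GIB:.1f} GiB")

# np.hstack needs the inputs and the output alive at the same time.
print(f"peak during hstack: {(x1_bytes + x2_bytes + x_bytes) / GIB:.1f} GiB")
```

If the actual dtype is float32, every figure halves, which may be enough on its own to make the concatenation fit.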

Whatever the reason, if you can't fit X1, X2, and X into memory at the same time, what can you do instead?

  • Just build X in the first place, and make X1 and X2 sliced views of X that you fill in place.
  • Write X1 and X2 out to disk, concatenate them on disk, and read the result back in.
  • Send X1 and X2 to a subprocess that reads them iteratively, builds X, and then continues the work.
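The first two options could look roughly like the sketch below. The function names build_with_views and concat_on_disk, the chunk_rows parameter, and the choice of .npy files are assumptions made for illustration; the answer itself doesn't prescribe a file format or an API.

```python
import numpy as np

def build_with_views(n_rows, n_cols1, n_cols2, dtype=np.float64):
    """Allocate X once and return it with two column views, X1 and X2."""
    X = np.empty((n_rows, n_cols1 + n_cols2), dtype=dtype)
    X1 = X[:, :n_cols1]   # view: writes to X1 land in X
    X2 = X[:, n_cols1:]   # view: writes to X2 land in X
    return X, X1, X2

def concat_on_disk(path1, path2, out_path, chunk_rows=100_000):
    """Column-wise concatenate two .npy files into out_path, chunk by chunk."""
    a = np.load(path1, mmap_mode="r")   # memory-mapped, not read into RAM
    b = np.load(path2, mmap_mode="r")
    if a.shape[0] != b.shape[0]:
        raise ValueError("inputs must have the same number of rows")
    n_rows, n1 = a.shape
    out = np.lib.format.open_memmap(
        out_path, mode="w+",
        dtype=np.promote_types(a.dtype, b.dtype),
        shape=(n_rows, n1 + b.shape[1]),
    )
    # Only chunk_rows rows of each input are copied at a time.
    for start in range(0, n_rows, chunk_rows):
        stop = min(start + chunk_rows, n_rows)
        out[start:stop, :n1] = a[start:stop]
        out[start:stop, n1:] = b[start:stop]
    out.flush()
    del out
    return out_path

# Tiny demonstration of the view-based option.
X, X1, X2 = build_with_views(4, 2, 3)
X1[:] = 1.0   # fill the halves in place instead of calling np.hstack
X2[:] = 2.0
```

With the view-based option there is never a second copy of the data, so peak memory is just the size of X. The on-disk result can be opened later with np.load(out_path, mmap_mode="r") if it still doesn't fit in RAM.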
