Dot product of huge arrays in numpy

Question

I have a huge array and I want to calculate its dot product with a small array, but I am getting an 'array is too big' error. Is there a workaround?

import numpy as np

eMatrix = np.random.random_integers(low=0,high=100,size=(20000000,50))
pMatrix = np.random.random_integers(low=0,high=10,size=(50,50))

a = np.dot(eMatrix,pMatrix)

Error:
/Library/Python/2.7/site-packages/numpy/random/mtrand.so in mtrand.RandomState.random_integers (numpy/random/mtrand/mtrand.c:9385)()

/Library/Python/2.7/site-packages/numpy/random/mtrand.so in mtrand.RandomState.randint (numpy/random/mtrand/mtrand.c:7051)()

ValueError: array is too big.

Recommended answer

That error is raised while computing the total size of the array, if the result overflows the native int type.

For this to happen, regardless of whether your machine is 64-bit, you are almost certainly running 32-bit versions of Python (and NumPy). You can check whether that is the case by doing:

>>> import sys
>>> sys.maxsize
2147483647 # <--- 2**31 - 1, on a 64 bit version you would get 2**63 - 1
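The same check can be cross-checked against the pointer size reported by the standard `struct` module. This is only a sketch; the helper name `interpreter_bits` is made up for illustration:

```python
import struct
import sys

def interpreter_bits():
    """Return the pointer width of the running interpreter, in bits."""
    # "P" is the struct format code for a C pointer (void *).
    return struct.calcsize("P") * 8

bits = interpreter_bits()
# On CPython, sys.maxsize is 2**31 - 1 on 32-bit builds
# and 2**63 - 1 on 64-bit builds.
print(bits, sys.maxsize == 2 ** (bits - 1) - 1)
```

If this prints 32, the interpreter is a 32-bit build even when the operating system itself is 64-bit.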

Then again, your array is "only" 20000000 * 50 = 1000000000 items, which is just under 2**30. If I try to reproduce your results on a 32-bit numpy, I get a MemoryError:

>>> np.random.random_integers(low=0,high=100,size=(20000000,50))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "mtrand.pyx", line 1420, in mtrand.RandomState.random_integers (numpy\random\mtrand\mtrand.c:12943)
  File "mtrand.pyx", line 938, in mtrand.RandomState.randint (numpy\random\mtrand\mtrand.c:10338)
MemoryError

unless I increase the size beyond the magic 2**31 - 1 threshold:

>>> np.random.random_integers(low=0,high=100,size=(2**30, 2))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "mtrand.pyx", line 1420, in mtrand.RandomState.random_integers (numpy\random\mtrand\mtrand.c:12943)
  File "mtrand.pyx", line 938, in mtrand.RandomState.randint (numpy\random\mtrand\mtrand.c:10338)
ValueError: array is too big.
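The sizes involved in the two cases above can be checked with plain arithmetic. The item sizes of 4 and 8 bytes below are assumptions for illustration; the actual integer width depends on the platform and the NumPy build:

```python
rows, cols = 20000000, 50
n_items = rows * cols                  # 1000000000 elements

# The element count of the question's array stays below 2**30 ...
print(n_items, n_items < 2**30)

# ... but the byte count does not fit in a signed 32-bit integer
# for either assumed item size.
for itemsize in (4, 8):                # assumed bytes per integer element
    n_bytes = n_items * itemsize
    print(itemsize, n_bytes, n_bytes > 2**31 - 1)

# The second example has 2**30 * 2 = 2**31 elements, one more than 2**31 - 1.
print(2**30 * 2 > 2**31 - 1)
```

Under these assumptions, the question's array needs several gigabytes, which would be consistent with a 32-bit process failing to allocate it even though the element count alone does not overflow.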

Given the difference between the line numbers in your traceback and mine, I suspect you are using an older version. What does this output on your system:

>>> np.__version__
'1.10.0.dev-9c50f98'
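If moving to a 64-bit Python is not possible, one common way to keep memory use bounded is to multiply the large matrix one row block at a time. This is a minimal sketch and not part of the original answer; `blockwise_dot`, `make_block`, and `random_block` are hypothetical names, and the block size is an arbitrary choice:

```python
import numpy as np

def blockwise_dot(make_block, n_rows, small, block_rows=100000):
    """Yield the product of a tall matrix with `small`, one row block at a time.

    make_block(start, stop) is a hypothetical callback that returns
    rows start..stop-1 of the tall matrix as a 2-D array.
    """
    for start in range(0, n_rows, block_rows):
        stop = min(start + block_rows, n_rows)
        yield np.dot(make_block(start, stop), small)

rng = np.random.RandomState(0)
# randint excludes the upper bound, so 11 gives values 0..10.
pMatrix = rng.randint(0, 11, size=(50, 50))

def random_block(start, stop):
    # Stand-in for reading a slice of the real data from disk.
    return rng.randint(0, 101, size=(stop - start, 50))

total_rows = 0
for part in blockwise_dot(random_block, 250000, pMatrix):
    total_rows += part.shape[0]   # e.g. write each block out or reduce it here
print(total_rows)
```

Only one block of the input and one block of the result are held in memory at a time, so the full 20000000 x 50 product would still have to be written out or reduced block by block rather than collected into a single array.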
