Numpy, problem with long arrays


Problem description

I have two arrays (a and b), each with 2^n integer elements in the range (0,N), where the largest integer value is N = 3^n.

I want to calculate the sum of every combination of elements in a and b (sum_ij_ = a_i_ + b_j_ for all i,j). Then take modulus N (sum_ij_ = sum_ij_ % N), and finally calculate the frequency of the different sums.

In order to do this fast with numpy, without any loops, I tried to use the meshgrid and the bincount function.

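import numpy

# a, b: the input integer arrays; N: the modulus
A, B = numpy.meshgrid(a, b)     # two len(b) x len(a) matrices
A = A + B                       # all pairwise sums a_i + b_j
A = A % N                       # reduce modulo N
A = numpy.reshape(A, A.size)    # flatten to 1-D for bincount
result = numpy.bincount(A)      # frequency of each sum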

Now, the problem is that my input arrays are long: meshgrid gives me a MemoryError when I use inputs with 2^13 elements. I would like to calculate this for arrays with 2^15 to 2^20 elements, that is, n in the range 15 to 20.

Are there any clever tricks to do this with numpy?

Any help will be greatly appreciated.

- Jon

Recommended answer

Try chunking it. Your meshgrid is an NxN matrix (N here being the length of the input arrays, not the modulus). Split it into a 10x10 grid of blocks, each N/10 x N/10, compute the bincount for each of the 100 blocks, and add them up at the end. This uses only ~1% as much memory as doing the whole thing at once.
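A minimal sketch of that chunking idea, not a tuned implementation. The function name `pairwise_sum_mod_counts` and the `chunk` parameter are made up for illustration, and it assumes the inputs are non-negative integers and that a counts array of length N (the modulus) fits in memory, which may not hold for the largest n in the question.

```python
import numpy as np

def pairwise_sum_mod_counts(a, b, N, chunk=1024):
    """Count how often each value of (a_i + b_j) % N occurs, block by block.

    Hypothetical helper: `chunk` is the block edge length, so at most
    chunk * chunk sums are held in memory at any one time.
    """
    a = np.asarray(a, dtype=np.int64)
    b = np.asarray(b, dtype=np.int64)
    counts = np.zeros(N, dtype=np.int64)
    for i in range(0, a.size, chunk):
        a_blk = a[i:i + chunk]
        for j in range(0, b.size, chunk):
            b_blk = b[j:j + chunk]
            # Broadcasting builds only this one block of pairwise sums.
            sums = (a_blk[:, None] + b_blk[None, :]) % N
            # minlength=N keeps every per-block histogram the same length.
            counts += np.bincount(sums.ravel(), minlength=N)
    return counts

# Tiny usage example: the six sums of [0, 1, 2] and [1, 2], taken mod 3.
print(pairwise_sum_mod_counts([0, 1, 2], [1, 2], 3))  # [2 2 2]
```

The block size trades memory for loop overhead: a larger `chunk` means fewer Python-level iterations but a bigger temporary matrix. When N is large, allocating a length-N histogram for every block gets expensive, so a bigger `chunk` (fewer blocks) is probably the better choice there.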
