Python MemoryError in Scipy Radial Basis Function (scipy.interpolate.rbf)

Problem description

I'm trying to interpolate a not-so-large point cloud (~10,000 samples) representing a 2D surface, using the Scipy radial basis function class (Rbf). I got some good results, but with my latest datasets I consistently get a MemoryError. The error appears almost instantly during execution, so the RAM is clearly not being gradually eaten up.

To investigate, I made a copy of Scipy's rbf.py and filled it with print statements, which have been very useful. I decomposed the _euclidean_norm method line by line, like this:

def _euclidean_norm(self, x1, x2):
    d = x1 - x2
    s = d**2
    su = s.sum(axis=0)
    sq = sqrt(su)
    return sq

The error is raised on the first line:

File "C:\MyRBF.py", line 68, in _euclidean_norm
    d = x1 - x2
MemoryError

That norm is called on an array X1 of the form [[x1, y1], [x2, y2], [x3, y3], ..., [xn, yn]] and on X2, which is X1 transposed by the following method of the Rbf class (already modified by me for debugging purposes):

def _call_norm(self, x1, x2):
    print x1.shape
    print x2.shape
    print

    if len(x1.shape) == 1:
        x1 = x1[newaxis, :]
    if len(x2.shape) == 1:
        x2 = x2[newaxis, :]
    x1 = x1[..., :, newaxis]
    x2 = x2[..., newaxis, :]

    print x1.shape
    print x2.shape
    print

    return self._euclidean_norm(x1, x2)

Note that I print the shapes of the inputs. With my current dataset, this is what I get (I added the comments manually):

(2, 10744)         ## Input array of 10744 x,y pairs
(2, 10744)         ## The same array, which is to be "reshaped/transposed"

(2, 10744, 1)      ## The first "reshaped/transposed" form of the array
(2, 1, 10744)      ## The second "reshaped/transposed" form of the array

According to the documentation, the rationale is to get "a matrix of the distances from each point in x1 to each point in x2". Since the two arrays are the same, this means a matrix of the distances between every pair of points in the input array (which contains the X and Y dimensions).

I tested the operation manually with much smaller arrays (shapes (2, 5, 1) and (2, 1, 5), for example), and the subtraction works.
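A small-scale version of that manual test might look like the sketch below. The sample points (`pts`) are made up for illustration; the reshaping and the norm follow the `_call_norm` and `_euclidean_norm` steps quoted above.

```python
import numpy as np

# Hypothetical sample data: 5 points, stored as (2, 5) = (dimensions, points)
pts = np.arange(10, dtype=float).reshape(2, 5)

# Same reshaping as in _call_norm
x1 = pts[..., :, np.newaxis]   # shape (2, 5, 1)
x2 = pts[..., np.newaxis, :]   # shape (2, 1, 5)

# Same steps as in _euclidean_norm
d = x1 - x2                    # broadcasts to shape (2, 5, 5)
dist = np.sqrt((d ** 2).sum(axis=0))  # shape (5, 5): pairwise distances

print(d.shape, dist.shape)
```

With only 5 points the intermediate array `d` holds 2 * 5 * 5 values, so the subtraction succeeds; the size of `d` grows with the square of the number of points.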

How can I find out why it is not working with my dataset? Is there some other obvious error? Should I check my dataset for some form of ill-conditioning, or pre-process it in some way? I think it is well-conditioned, since I can plot it in 3D and the point cloud looks very well formed.

Any help would be appreciated.

Thanks for reading.

Recommended answer

Your dataset should be fine: the error appears because you don't have enough RAM to store the result of the subtraction.

According to the broadcasting rules, the result will have shape

 (2, 10744,     1)
-(2,     1, 10744)
------------------
 (2, 10744, 10744)
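One way to confirm that shape without allocating the full result is to ask numpy for the broadcast shape only. This is a minimal sketch: the two placeholder arrays are uninitialised stand-ins with the shapes printed in the question, not the actual data.

```python
import numpy as np

n = 10744

# Placeholder arrays with the shapes from the question (about 84 KiB each)
x1 = np.empty((2, n, 1))
x2 = np.empty((2, 1, n))

# np.broadcast builds a broadcast iterator; it does not allocate the result
result_shape = np.broadcast(x1, x2).shape
print(result_shape)
```

The inputs themselves are small; only the broadcast result of the subtraction is large.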

Assuming these are arrays of dtype float64, you need 2*10744**2*8 bytes = 1.72 GiB of free memory for the result alone. If there isn't enough free memory, numpy can't allocate the output array and fails immediately with the error you see.
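The arithmetic can be reproduced with a short sketch, assuming float64 elements as stated above:

```python
import numpy as np

shape = (2, 10744, 10744)                  # broadcast result shape
itemsize = np.dtype(np.float64).itemsize   # 8 bytes per element

# Use a 64-bit integer product so the element count cannot overflow
n_bytes = int(np.prod(shape, dtype=np.int64)) * itemsize
gib = n_bytes / 2**30

print(f"{n_bytes} bytes = {gib:.2f} GiB")
```

Because the element count scales with the square of the number of points, doubling the dataset size roughly quadruples the memory needed for this intermediate array.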
