Repeated numpy subarrays

Question

This is a simplification of my question. I have a numpy array:

x = np.array([0,1,2,3])

I have a function:

def f(y): return y**2

I can compute f(x).

Now suppose I really want to compute f(x) for a repeated x:

x = np.array([0,1,2,3,0,1,2,3,0,1,2,3])

Is there a way to do this without creating a repeated version of x, and in a way that is transparent to f?

In my particular case, f is an involved function and x is one of its arguments. I would like to be able to calculate f when x is repeated without actually repeating it, as the repeated array won't fit into memory.

Rewriting f to handle a repeated x would take real work, so I was hoping for a clever way to do this, possibly by subclassing a numpy array.

Any tips are appreciated.

Recommended answer

You can (almost) do this by using a few tricks with strides.

However, there are some major caveats...

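import numpy as np
x = np.arange(4)
numrepeats = 3

# Prepend an axis of length numrepeats with stride 0, so every row aliases x.
y = np.lib.stride_tricks.as_strided(x, (numrepeats,)+x.shape, (0,)+x.strides)

print(y)
x[0] = 9   # changing x ...
print(y)   # ... changes every row of y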

So, y is now a view into x where each row is x. No new memory is used, and we can make y as large as we like.
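One way to check the no-copy claim is to look at the strides and ask NumPy whether the two arrays share memory. The sketch below does that for the small example. It also shows np.broadcast_to, which builds the same kind of zero-stride view but returns it read-only; using it here is a suggestion of this write-up, not something the original answer relies on.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

x = np.arange(4)
numrepeats = 3

# Same construction as above: stride 0 along the new first axis.
y = as_strided(x, (numrepeats,) + x.shape, (0,) + x.strides)

print(y.strides[0])            # 0: stepping to the next row moves zero bytes
print(np.shares_memory(x, y))  # True: y reuses x's buffer

# Assumed alternative: broadcast_to gives an equivalent, read-only view.
z = np.broadcast_to(x, (numrepeats,) + x.shape)
print(z.strides[0], z.flags.writeable)
```

A read-only view is often the safer choice when the repeated array is only ever read, since it rules out accidental writes through aliased rows.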

For example, I can do this:

import numpy as np
x = np.arange(4)
numrepeats = 10**15   # must be an integer; the float 1e15 is not a valid dimension

y = np.lib.stride_tricks.as_strided(x, (numrepeats,)+x.shape, (0,)+x.strides)

...and not use any more memory than the 32 bytes required for x. (Otherwise, y would need roughly 32 petabytes of RAM: 1e15 rows of 32 bytes each.)

However, if we reshape y so that it has only one dimension, we'll get a copy, which will use the full amount of memory. There's no way to describe a "horizontally" tiled view of x using strides and shape alone, so any shape with fewer than two dimensions will return a copy.
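The copy on reshape is easy to observe with a small repeat count. This is a minimal sketch using the same as_strided construction as above; the variable name flat is introduced here purely for illustration.

```python
import numpy as np

x = np.arange(4)
y = np.lib.stride_tricks.as_strided(x, (3,) + x.shape, (0,) + x.strides)

# No single 1-D stride can express the tiling, so NumPy has to copy.
flat = y.reshape(-1)

print(flat)                       # the tiled values, now materialized
print(np.shares_memory(x, flat))  # False: flat owns new memory
print(flat.nbytes, x.nbytes)      # flat is three times the size of x
```

With a large repeat count, the same reshape would try to allocate the full tiled array, which is exactly what the view was meant to avoid.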

Additionally, if we operate on y in a way that returns a new array (e.g. the y**2 in your example), the result will be a full-size copy.
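If f acts purely element by element (an assumption about your real function, which may not hold), one possible workaround is to evaluate f on x once and then repeat the result as a view. The sketch below compares that with calling f on the repeated view directly. The names full and cheap are illustrative, and np.broadcast_to is used here as a read-only way to build the repeated result.

```python
import numpy as np

def f(v):
    # Stand-in for the real function; assumed to act element by element.
    return v ** 2

x = np.arange(4)
numrepeats = 3
y = np.lib.stride_tricks.as_strided(x, (numrepeats,) + x.shape, (0,) + x.strides)

full = f(y)                             # allocates numrepeats * x.size elements
cheap = np.broadcast_to(f(x), y.shape)  # computes f once, repeats it as a view

print(full.nbytes, f(x).nbytes)         # full is numrepeats times larger
print(np.array_equal(full, cheap))      # same values either way
```

This only helps when every repeated block of the output is identical; a function that mixes values across rows would not qualify.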

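For that reason, it makes more sense to operate on things in place, e.g. x **= 2; y will show the result because it is just a view into x. Writing through y itself (e.g. y **= 2) is best avoided: every row aliases the same memory, and NumPy's documentation warns that vectorized writes to such arrays are unpredictable.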

Even for a generic function, you can pass in x and place the result back in x.

For example:

def f(x):
    return x**3

x[...] = f(x)
print(y)

y will be updated as well, since it's just a view into x.
