Is there a way to check if NumPy arrays share the same data?


Question

My impression is that in NumPy, two arrays can share the same memory. Take the following example:

import numpy as np
a = np.arange(27)
b = a.reshape((3, 3, 3))
a[0] = 5000
print(b[0, 0, 0])  # 5000

# Some tests:
a.data is b.data  # False
a.data == b.data  # True

c = np.arange(27)
c[0] = 5000
a.data == c.data  # True (same data, not same memory storage): a false positive

So clearly b didn't make a copy of a; it just created some new metadata and attached it to the same memory buffer that a is using. Is there a way to check if two arrays reference the same memory buffer?

My first impression was to use a.data is b.data, but that returns False. I can do a.data == b.data, which returns True, but I don't think that checks whether a and b share the same memory buffer, only that the block of memory referenced by a and the one referenced by b contain the same bytes.
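The distinction drawn here, equal contents versus the same buffer, can be illustrated with a minimal sketch. It is not part of the original question. It assumes that a owns its data (true for a fresh np.arange result), and the helper data_address is a hypothetical name introduced for illustration. Comparing start addresses or checking ndarray.base only detects simple cases; it does not decide whether two arbitrary views overlap.

```python
import numpy as np

a = np.arange(27)
b = a.reshape((3, 3, 3))   # a view: new metadata, same buffer
c = np.arange(27)          # a separate buffer with equal contents

def data_address(arr):
    """Hypothetical helper: address of the first element of `arr`."""
    return arr.__array_interface__["data"][0]

same_start_ab = data_address(a) == data_address(b)   # same buffer start
same_start_ac = data_address(a) == data_address(c)   # different buffers
b_is_view_of_a = b.base is a                         # b was derived from a
equal_contents_ac = np.array_equal(a, c)             # equal bytes, not shared

print(same_start_ab, same_start_ac)
print(b_is_view_of_a, c.base is None)
print(equal_contents_ac)
```

Here a and c compare equal element by element even though they live in different buffers, which is the false positive described above.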

Recommended answer

I think jterrace's answer is probably the best way to go, but here is another possibility.

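import numpy as np

try:
    # NumPy 2.0 and later expose byte_bounds here.
    from numpy.lib.array_utils import byte_bounds
except ImportError:
    # Older NumPy versions provide it in the top-level namespace.
    byte_bounds = np.byte_bounds

def byte_offset(a):
    """Returns a 1-d array of the byte offset of every element in `a`.
    Note that these will not in general be in order.
    Offsets are relative to the first element, so adding them to the low
    byte bound (as share_memory does) assumes non-negative strides."""
    # Open mesh of indices, one index array per axis.
    stride_offset = np.ix_(*map(range, a.shape))
    # Offset of the first byte of each element.
    element_offset = sum(i*s for i, s in zip(stride_offset, a.strides))
    # np.asarray covers 0-d arrays, where the sum above is the plain int 0.
    element_offset = np.asarray(element_offset).ravel()
    # Add the remaining bytes of every element.
    return np.concatenate([element_offset + x for x in range(a.itemsize)])

def share_memory(a, b):
    """Returns the number of shared bytes between arrays `a` and `b`."""
    a_low, a_high = byte_bounds(a)
    b_low, b_high = byte_bounds(b)

    beg, end = max(a_low, b_low), min(a_high, b_high)

    if end - beg > 0:
        # The memory bounds overlap, so compare the exact byte addresses.
        amem = a_low + byte_offset(a)
        bmem = b_low + byte_offset(b)

        return np.intersect1d(amem, bmem).size
    else:
        return 0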
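To make the stride arithmetic in byte_offset concrete, here is a small worked sketch for a 2-D view. It is an illustration under stated assumptions, not part of the original answer: the dtype is fixed to int64 so the strides are predictable, and the names x, v and first_byte are introduced only for this example.

```python
import numpy as np

x = np.arange(6, dtype=np.int64).reshape(2, 3)   # strides (24, 8)
v = x[:, ::2]                                    # every other column

# Offset of the first byte of each element, relative to v's first element:
# row index * row stride + column index * column stride.
rows, cols = np.ix_(range(v.shape[0]), range(v.shape[1]))
first_byte = (rows * v.strides[0] + cols * v.strides[1]).ravel()

print(v.strides)             # (24, 16)
print(first_byte.tolist())   # [0, 16, 24, 40]
```

The four offsets correspond to the elements 0, 2, 3 and 5 of x, which are exactly the elements visible through v.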

Example:

>>> a = np.arange(10)
>>> b = a.reshape((5,2))
>>> c = a[::2]
>>> d = a[1::2]
>>> e = a[0:1]
>>> f = a[0:1]
>>> f = f.reshape(())
>>> share_memory(a,b)
80
>>> share_memory(a,c)
40
>>> share_memory(a,d)
40
>>> share_memory(c,d)
0
>>> share_memory(a,e)
8
>>> share_memory(a,f)
8
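As a cross-check on results like these, NumPy also ships two built-in functions, in versions that provide them: np.may_share_memory, which only compares memory bounds and can therefore report overlap where none exists, and np.shares_memory, which performs an exact check. The sketch below is an addition to the original answer and reuses the interleaved slices from the example.

```python
import numpy as np

a = np.arange(10)
c = a[::2]    # elements 0, 2, 4, 6, 8
d = a[1::2]   # elements 1, 3, 5, 7, 9

# The address ranges of c and d interleave, so the bounds check says "maybe".
bounds_overlap = np.may_share_memory(c, d)
# The exact check finds that no byte is actually shared.
exact_overlap = np.shares_memory(c, d)
# A slice and its parent do share bytes.
parent_overlap = np.shares_memory(a, c)

print(bounds_overlap, exact_overlap, parent_overlap)   # True False True
```

This agrees with share_memory(c, d) returning 0 above. Unlike share_memory, the built-ins return only a boolean, not a byte count.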

Here is a plot showing the time for each share_memory(a, a[::2]) call as a function of the number of elements in a on my computer.

[Figure: time per share_memory(a, a[::2]) call versus number of elements in a; image not preserved]
