Is there a way to check if NumPy arrays share the same data?
Question
My impression is that in NumPy, two arrays can share the same memory. Take the following example:
import numpy as np

a = np.arange(27)
b = a.reshape((3, 3, 3))
a[0] = 5000
print(b[0, 0, 0])  # 5000

# Some tests:
a.data is b.data  # False
a.data == b.data  # True

c = np.arange(27)
c[0] = 5000
a.data == c.data  # True (same data, not same memory storage): a false positive
So clearly b didn't make a copy of a; it just created some new metadata and attached it to the same memory buffer that a is using. Is there a way to check if two arrays reference the same memory buffer?
My first impression was to use a.data is b.data, but that returns False. I can do a.data == b.data, which returns True, but I don't think that checks that a and b share the same memory buffer, only that the block of memory referenced by a and the one referenced by b contain the same bytes.
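To make the distinction between "same bytes" and "same buffer" concrete, here is a minimal sketch. The helper buffer_address is a hypothetical name introduced only for this illustration; it reads the data pointer that NumPy exposes through __array_interface__.

```python
import numpy as np

def buffer_address(arr):
    """Hypothetical helper: address of the first element of `arr`."""
    return arr.__array_interface__["data"][0]

a = np.arange(27)
b = a.reshape((3, 3, 3))  # a view onto a's buffer
c = np.arange(27)         # a separate allocation with equal contents

# Equal contents do not imply shared memory.
same_contents_ac = np.array_equal(a, c)

# Comparing start addresses separates the two cases here.
same_start_ab = buffer_address(a) == buffer_address(b)
same_start_ac = buffer_address(a) == buffer_address(c)

# For a simple view, .base points back at the owning array.
b_is_view_of_a = b.base is a

print(same_contents_ac, same_start_ab, same_start_ac, b_is_view_of_a)
```

Comparing start addresses is only a partial test: a view such as a[1:] shares memory with a but begins at a different address, so a full check has to look at the whole range of bytes each array touches.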
Recommended answer
I think jterrace's answer is probably the best way to go, but here is another possibility.
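import numpy as np

try:
    from numpy.lib.array_utils import byte_bounds  # NumPy 2.0 and later
except ImportError:
    byte_bounds = np.byte_bounds  # older NumPy releases

def byte_offset(a):
    """Returns a 1-d array of the byte offset of every element in `a`.
    Note that these will not in general be in order."""
    # One open-mesh index array per axis, so the sum below broadcasts
    # to the offset of the first byte of every element.
    stride_offset = np.ix_(*map(range, a.shape))
    element_offset = sum(i * s for i, s in zip(stride_offset, a.strides))
    element_offset = np.asarray(element_offset).ravel()
    # Add the offsets of the remaining bytes of each element.
    return np.concatenate([element_offset + x for x in range(a.itemsize)])

def share_memory(a, b):
    """Returns the number of shared bytes between arrays `a` and `b`."""
    a_low, a_high = byte_bounds(a)
    b_low, b_high = byte_bounds(b)
    beg, end = max(a_low, b_low), min(a_high, b_high)
    if end - beg > 0:
        # The address ranges overlap, so compare the exact byte addresses.
        amem = a_low + byte_offset(a)
        bmem = b_low + byte_offset(b)
        return np.intersect1d(amem, bmem).size
    else:
        return 0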
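The stride arithmetic inside byte_offset can be checked by hand on a small array. The sketch below is only an illustration with made-up example arrays: it computes the offset of each element of a strided view as the sum of index times stride, then reads the bytes at those offsets from a copy of the parent buffer to confirm they hold the view's values.

```python
import numpy as np

a = np.arange(6, dtype=np.int16).reshape(2, 3)  # strides are (6, 2) bytes
v = a[:, ::2]                                   # view with strides (6, 4)

# Byte offset of each element of v, relative to the start of v's data:
# sum over axes of index * stride.
offsets = [sum(i * s for i, s in zip(idx, v.strides))
           for idx in np.ndindex(*v.shape)]

# v starts at the same address as a, so the offsets can be checked
# against a's raw bytes.
raw = a.tobytes()
values = [int(np.frombuffer(raw, dtype=np.int16, count=1, offset=off)[0])
          for off in offsets]

print(offsets)  # [0, 4, 6, 10]
print(values)   # [0, 2, 3, 5]
```

The values read back match v.ravel(), which is the property share_memory relies on when it intersects the two sets of byte addresses.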
Example:
>>> a = np.arange(10)
>>> b = a.reshape((5,2))
>>> c = a[::2]
>>> d = a[1::2]
>>> e = a[0:1]
>>> f = a[0:1]
>>> f = f.reshape(())
>>> share_memory(a,b)
80
>>> share_memory(a,c)
40
>>> share_memory(a,d)
40
>>> share_memory(c,d)
0
>>> share_memory(a,e)
8
>>> share_memory(a,f)
8
Here is a plot showing the time for each share_memory(a, a[::2]) call as a function of the number of elements in a on my computer.
[Figure: time per share_memory(a, a[::2]) call versus number of elements in a]
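As a cross-check that is not part of the answer above, NumPy itself provides np.may_share_memory, a cheap bounds-only test that can report false positives, and np.shares_memory, an exact test available in more recent releases. A minimal sketch on the same kind of arrays as in the example:

```python
import numpy as np

a = np.arange(10)
b = a.reshape((5, 2))  # view of all of a
c = a[::2]             # even-indexed elements
d = a[1::2]            # odd-indexed elements
e = np.arange(10)      # independent array with equal contents

# (bounds-only check, exact check) for each pair
results = {
    "a,b": (np.may_share_memory(a, b), np.shares_memory(a, b)),
    "a,c": (np.may_share_memory(a, c), np.shares_memory(a, c)),
    "c,d": (np.may_share_memory(c, d), np.shares_memory(c, d)),
    "a,e": (np.may_share_memory(a, e), np.shares_memory(a, e)),
}

for pair, (may, exact) in results.items():
    print(pair, may, exact)
```

The pair c, d is the interesting case: their address ranges interleave, so the bounds check says they may share memory, while the exact check agrees with share_memory(c, d) returning 0.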