How to find unique vectors of a 2d array over a particular axis in a vectorized manner?
Question
I have an array of shape (n, t) which I'd like to treat as a time series of n-vectors.
I'd like to know the unique n-vector values that exist along the t-dimension, as well as the associated t-indices for each unique vector. I'm happy to use any reasonable definition of equality (e.g. numpy.unique accepts floats, and the comparison it uses would be fine).
This is easy with a Python loop over t, but I'm hoping for a vectorized approach.
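For reference, the loop-based baseline mentioned above might look like the following minimal sketch. The array `a` and the name `groups` are made up for illustration; they are not from the original question.

```python
import numpy as np
from collections import defaultdict

# Hypothetical demo data of shape (n, t) = (2, 4): each column is one n-vector.
a = np.array([[1, 2, 1, 3],
              [2, 0, 2, 2]])

# Map each distinct column (as a hashable tuple) to the t-indices where it occurs.
groups = defaultdict(list)
for t_index in range(a.shape[1]):
    groups[tuple(a[:, t_index].tolist())].append(t_index)

for vector, t_indices in groups.items():
    print(vector, t_indices)
```

This does one Python-level iteration per time step, which is the cost a vectorized approach tries to avoid.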
In some special cases it can be done by collapsing the n-vectors into scalars and using numpy.unique on the 1-d result. For example, if you had booleans you could use a vectorized dot with the (2**k) vector to convert each boolean vector to an integer. However, I'm looking for a fairly general solution.
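The boolean special case described above can be sketched as follows. The demo array `b` and the variable names are assumptions for illustration. The trick only works while the encoded integers fit in the integer dtype, so it is limited to fairly small n.

```python
import numpy as np

# Hypothetical demo data: n = 3 boolean components, t = 6 time steps.
b = np.array([[True,  False, True,  True,  False, True],
              [False, False, False, True,  False, False],
              [True,  True,  True,  False, True,  True]])

# Weights 1, 2, 4, ...: every distinct boolean column maps to a distinct integer.
weights = 2 ** np.arange(b.shape[0])
codes = weights.dot(b)                      # shape (t,)

# Unique codes, plus for each time step the position of its code in `values`.
values, inverse = np.unique(codes, return_inverse=True)
print(codes)
print(values, inverse)
```

Time steps that share the same entry in `inverse` hold the same boolean vector.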
Recommended answer
If the shape of your array were (t, n), so that the data for each n-vector is contiguous in memory, you could create a view of the 2-d array as a 1-d structured array, and then use numpy.unique on this view.
If you can change the storage convention of your array, or if you don't mind making a copy of the transposed array, this could work for you.
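As a small illustration of why the copy is needed: transposing an (n, t) array only changes the strides, so the n-vectors are still not contiguous in memory. The sketch below uses a made-up array `a` and `np.ascontiguousarray` to force a row-contiguous copy.

```python
import numpy as np

# Hypothetical (n, t) = (2, 3) array.
a = np.arange(6).reshape(2, 3)

# The transpose is a strided view of the same memory, not a C-contiguous array.
transposed_is_contiguous = bool(a.T.flags['C_CONTIGUOUS'])

# ascontiguousarray copies the data so that each n-vector occupies one contiguous row.
rows = np.ascontiguousarray(a.T)            # shape (t, n)
rows_is_contiguous = bool(rows.flags['C_CONTIGUOUS'])
copy_was_made = not np.shares_memory(a, rows)

print(transposed_is_contiguous, rows_is_contiguous, copy_was_made)
```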
Here is an example:
import numpy as np
# Demo data.
x = np.array([[1,2,3],
              [2,0,0],
              [1,2,3],
              [3,2,2],
              [2,0,0],
              [2,1,2],
              [3,2,1],
              [2,0,0]])
# View each row as a structure, with field names 'a', 'b' and 'c'.
dt = np.dtype([('a', x.dtype), ('b', x.dtype), ('c', x.dtype)])
y = x.view(dtype=dt).squeeze()
# Now np.unique can be used. See the `unique` docstring for
# a description of the options. You might not need `idx` or `inv`.
u, idx, inv = np.unique(y, return_index=True, return_inverse=True)
print("Unique vectors")
print(u)
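The example above hard-codes three fields and stops at the unique values. The question also asks for the t-indices of each unique vector, so here is a minimal sketch that builds the structured dtype for any n and groups the indices using the inverse array. The helper name `unique_vectors_with_indices` and the field names `f0`, `f1`, ... are hypothetical, and the argsort/bincount/split grouping step is one possible way to do it rather than part of the original answer.

```python
import numpy as np

def unique_vectors_with_indices(a):
    """Hypothetical helper: `a` has shape (n, t); its columns are the n-vectors."""
    rows = np.ascontiguousarray(a.T)                 # shape (t, n), one vector per row
    # One structure field per component, so each row becomes a single record.
    dt = np.dtype([('f%d' % i, rows.dtype) for i in range(rows.shape[1])])
    records = rows.view(dt).reshape(-1)              # shape (t,)
    _, idx, inv = np.unique(records, return_index=True, return_inverse=True)
    # Sort the t-indices by group label, then cut the sorted list at group boundaries.
    order = np.argsort(inv, kind='stable')
    boundaries = np.cumsum(np.bincount(inv))[:-1]
    return rows[idx].T, np.split(order, boundaries)

# Same demo data as above, stored here as (n, t) by transposing.
x = np.array([[1, 2, 3], [2, 0, 0], [1, 2, 3], [3, 2, 2],
              [2, 0, 0], [2, 1, 2], [3, 2, 1], [2, 0, 0]])
vectors, t_indices = unique_vectors_with_indices(x.T)

for k, indices in enumerate(t_indices):
    print(vectors[:, k], indices)
```

`vectors` has shape (n, k) with one unique vector per column, and `t_indices[k]` lists the time steps at which column k occurs.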