Remove numpy rows contained in a list?
Problem description
I have a numpy array and a list. I want to remove the rows contained in the list.
import numpy as np

a = np.zeros((3, 2))
a[0, :] = [1, 2]
l = [(1, 2), (3, 4)]
Currently I try to do this by making a set of a's rows, then excluding the set created from l, something like:
sa = set(map(tuple, a))
sl = set(l)
np.array(list(sa - sl))
or, more simply:
sl = set(l)
np.array([row for row in map(tuple, a) if row not in sl])
These work pretty well when each row is short.
Is there a faster way? I need to optimize for speed.
Recommended answer
Approach #1: Here's one with views (viewing each row as a single element with an extended dtype) -
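# https://stackoverflow.com/a/45313353/ @Divakar
def view1D(a, b):  # a, b are 2D arrays with the same number of columns
    a = np.ascontiguousarray(a)
    # Cast b to a's dtype so that equal values have identical bytes in both views
    b = np.ascontiguousarray(b, dtype=a.dtype)
    void_dt = np.dtype((np.void, a.dtype.itemsize * a.shape[1]))
    return a.view(void_dt).ravel(), b.view(void_dt).ravel()

a1D, l1D = view1D(a, l)
# np.in1d is deprecated as of NumPy 2.0; np.isin is its replacement
out = a[np.in1d(a1D, l1D, invert=True)]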
If you need only unique rows in the output, as with set, use np.unique on the output obtained -
np.unique(out, axis=0)
Sample run -
In [72]: a
Out[72]:
array([[1, 2],
       [0, 0],
       [0, 0]])
In [73]: l
Out[73]: [(1, 2), (3, 4)]
In [74]: out
Out[74]:
array([[0, 0],
       [0, 0]])
In [75]: np.unique(out,axis=0)
Out[75]: array([[0, 0]])
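For reference, the view-based steps of Approach #1 can be wrapped in one self-contained function. This is only a sketch and not part of the original answer: the name remove_rows is hypothetical, the cast of the list to the array's dtype is an assumption made so that both sides share one byte layout, and np.isin is used in place of np.in1d.

```python
import numpy as np

def remove_rows(a, rows):
    """Return the rows of 2D array `a` that do not appear in `rows`."""
    a = np.ascontiguousarray(a)
    # Assumption: cast `rows` to a's dtype so equal values share the same bytes.
    b = np.ascontiguousarray(rows, dtype=a.dtype).reshape(-1, a.shape[1])
    # One opaque (void) element per row, spanning the bytes of the whole row.
    void_dt = np.dtype((np.void, a.dtype.itemsize * a.shape[1]))
    a1d = a.view(void_dt).ravel()
    b1d = b.view(void_dt).ravel()
    # Keep the rows of a whose byte pattern is absent from b.
    return a[~np.isin(a1d, b1d)]

a = np.zeros((3, 2))
a[0, :] = [1, 2]
l = [(1, 2), (3, 4)]
out = remove_rows(a, l)
```

Because the comparison is byte-wise, float rows containing 0.0 and -0.0 count as different rows, so this fits integer data or floats that were produced the same way on both sides.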
Approach #2: With the same philosophy of reducing dimensionality, here's one with matrix multiplication, specific to int dtype data -
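# Assumes non-negative integer values in both a and l
l = np.asarray(l)
shp = np.maximum(a.max(0)+1, l.max(0)+1)     # per-column radix
s = np.r_[shp[::-1].cumprod()[::-1][1:], 1]  # place value of each column
l1D = l.dot(s)
a1D = a.dot(s)
l1Ds = np.sort(l1D)
idx = np.searchsorted(l1Ds, a1D)
idx[idx == len(l1Ds)] = 0  # codes larger than every entry of l1Ds cannot match
out = a[l1Ds[idx] != a1D]  # look up in the sorted codes, not in l1D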
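The same idea can be sketched as a standalone function. The name remove_rows_int is hypothetical, and the sketch assumes non-negative integers whose combined radix product fits in the integer dtype. It encodes every row as one mixed-radix number, then binary-searches the encoded rows of a in the sorted codes of the list.

```python
import numpy as np

def remove_rows_int(a, rows):
    """Drop rows of non-negative integer array `a` that appear in `rows`."""
    a = np.asarray(a)
    b = np.asarray(rows).reshape(-1, a.shape[1])
    if b.shape[0] == 0:
        return a
    # Per-column radix: one more than the largest value seen in either input.
    shp = np.maximum(a.max(0) + 1, b.max(0) + 1)
    # Mixed-radix place values, e.g. radices [4, 5] give weights [5, 1].
    s = np.r_[shp[::-1].cumprod()[::-1][1:], 1]
    a1d = a.dot(s)
    b1d = np.sort(b.dot(s))
    # Binary-search each encoded row of a in the sorted codes of b.
    idx = np.searchsorted(b1d, a1d)
    # A code past the end of b1d cannot match; redirect it to a valid index.
    idx[idx == len(b1d)] = 0
    return a[b1d[idx] != a1d]

a_int = np.array([[1, 2], [0, 0], [9, 9], [3, 4]])
kept = remove_rows_int(a_int, [(1, 2), (3, 4)])
```

The row [9, 9] encodes to a value larger than any code from the list, which is the case the index redirection guards against.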