Generating numpy array of indices for a deduplicated set of points

Views: 350  Posted: 2016/5/31 20:47:43  Tags: python arrays numpy deduplication

Problem description

I have an array of at least tens of thousands of points (up to 3 billion), some of which are duplicated. I'd like to deduplicate the points and generate an index array that maps each original point, in its original order, to its position in the deduplicated set.

For example:

x = [(0, 0),  # (x1, y1)
     (1, 0),  # (x2, y2)
     (1, 1),  # (x3, y3)
     (0, 0)]  # (x4, y4)

Deduplicating x, we have y:

y = list(set(x))
# One possible result (set ordering is arbitrary):
# y = [(1, 0),  # (x2, y2)
#      (0, 0),  # (x1, y1) and (x4, y4)
#      (1, 1)]  # (x3, y3)

And then we would have a resulting index array, z:

z = [1,  # (x1, y1) 
     0,  # (x2, y2)
     2,  # (x3, y3)
     1]  # (x4, y4)

Is there a numpy-like way of obtaining z? Here's a brute-force implementation:

z = []
for each_point in x:
    index = y.index(each_point)
    z.append(index)
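
For comparison, the repeated linear scan in y.index can be avoided with a dictionary lookup. This is only a sketch in plain Python: the name first_seen is illustrative and not from the original post, and y here keeps first-seen order, so the indices differ from the set-based example above.

```python
x = [(0, 0), (1, 0), (1, 1), (0, 0)]

first_seen = {}  # hypothetical helper: maps each point to its index in y
y = []           # deduplicated points, in first-seen order
z = []           # for each point in x, its index into y

for point in x:
    if point not in first_seen:
        first_seen[point] = len(y)
        y.append(point)
    z.append(first_seen[point])

print(y)  # [(0, 0), (1, 0), (1, 1)]
print(z)  # [0, 1, 2, 0]
```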

Solution

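import numpy as np

x = np.array([(0, 0), (1, 0), (1, 1), (0, 0)])
# View each row as a single opaque (void) item so np.unique compares whole rows
x2 = np.ascontiguousarray(x).view(np.dtype((np.void, x.dtype.itemsize * x.shape[1]))).ravel()
y_temp, z = np.unique(x2, return_inverse=True)
# Convert the unique void items back to rows of the original dtype
y = y_temp.view(x.dtype).reshape(-1, x.shape[1])
print(y)
print(z)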

yields

[[0 0]
 [1 0]
 [1 1]]

and

[0 1 2 0]

Credit: Find unique rows in numpy.array
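
As an alternative sketch, NumPy versions whose np.unique accepts an axis argument can deduplicate rows directly, without the void view. This is not part of the original answer, and it assumes x is a 2-D array of points. The ravel() call is a precaution, since the shape of the inverse array has varied between NumPy releases.

```python
import numpy as np

x = np.array([(0, 0), (1, 0), (1, 1), (0, 0)])

# axis=0 treats each row as one item; return_inverse gives, for every
# row of x, the index of the matching row in y
y, z = np.unique(x, axis=0, return_inverse=True)
z = z.ravel()  # make sure the index array is 1-D

print(y)
print(z)

# The original points can be rebuilt from the unique rows and the indices
assert (y[z] == x).all()
```

Indexing y with z reproduces x, which is a quick way to check the index array.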
