Accessing a large numpy array while preserving its order


Problem description

I would like to access a numpy array data via an index array idx, while still preserving the order of the elements in data. Below is an example where the array is accessed in an order different from the one in the original array.

In [125]: data = np.array([2, 2.2, 2.5])

In [126]: idx=np.array([1,0])

In [127]: data[idx]
Out[127]: array([2.2, 2. ])

I would like to get [2, 2.2] instead. Is there an efficient way to do so? In my problem setting, data holds more than a million floating-point numbers, and idx holds about 100,000 integers.

Important info: The array data can be preprocessed if needed. The data come from image processing work. For example, if data needs to be sorted beforehand, the time spent on that sort is not counted when measuring performance. On the other hand, I would rather not process idx much at runtime, because time spent on it does count. For example, sorting idx with an O(n log n) algorithm may be too expensive.

Recommended answer

Create a boolean mask:

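 mask = np.zeros(data.shape, bool)  # one False flag per element of data
 mask[idx] = True                   # mark the requested positions
 res = data[mask]                   # selected elements, in data's own order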
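
As a minimal sketch, the mask approach can be wrapped in a small helper and run on the example from the question. The function name take_in_original_order is hypothetical and not part of NumPy; the sketch assumes data is one-dimensional and idx contains valid integer positions.

```python
import numpy as np

def take_in_original_order(data, idx):
    """Return the elements of data picked by idx, ordered as they appear in data.

    Hypothetical helper for illustration; assumes a 1-D data array and
    valid integer positions in idx.
    """
    mask = np.zeros(data.shape, dtype=bool)  # one flag per element of data
    mask[idx] = True                         # mark requested positions; idx is never sorted
    return data[mask]                        # boolean indexing scans data front to back

data = np.array([2, 2.2, 2.5])
idx = np.array([1, 0])
res = take_in_original_order(data, idx)      # holds 2.0 and 2.2, in that order

# A repeated index marks the same flag twice, so it yields a single element.
res_dup = take_in_original_order(data, np.array([2, 0, 2]))  # holds 2.0 and 2.5
```

The work is one pass over idx plus one pass over data, so no O(n log n) sort of idx is involved. The trade-off is a temporary boolean array the size of data, and duplicates in idx collapse to a single occurrence in the result.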
