How do I get the index of a specific percentile in numpy / scipy?

Question

I have looked at this answer, which explains how to compute the value of a specific percentile, and this answer, which explains how to compute the percentiles that correspond to each element.

  • Using the first solution, I can compute the value and then scan the original array to find its index.

  • Using the second solution, I can scan the entire output array for the percentile I'm looking for.

However, both require an additional scan if I want to know the index (in the original array) that corresponds to a particular percentile (or the index of the element closest to it).

Is there a more direct or built-in way to get the index that corresponds to a percentile?

Note: My array is not sorted, and I want the index in the original, unsorted array.
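For reference, the two-pass approach based on the first solution might look like the sketch below. The example array and the name `scan_index` are illustrative assumptions, not part of the original question.

```python
import numpy as np

# Illustrative unsorted array (an assumption for this sketch).
a = np.array([5, 6, 4, 9, 2, 1, 3, 0, 7, 8])

# Pass 1: compute the value of the 25th percentile.
# With the default settings this value may be interpolated,
# so it is not guaranteed to be an element of `a`.
value = np.percentile(a, 25)

# Pass 2: scan the original array for the element closest to that value.
scan_index = int(np.abs(a - value).argmin())
```

The second pass is the extra scan the question would like to avoid.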

Recommended answer

It is a little convoluted, but you can get what you are after with np.argpartition. Let's take an easy array and shuffle it:

>>> import numpy as np
>>> a = np.arange(10)
>>> np.random.shuffle(a)
>>> a
array([5, 6, 4, 9, 2, 1, 3, 0, 7, 8])

If you want to find, for example, the index of quantile 0.25, this corresponds to the item at position idx of the sorted array:

>>> idx = 0.25 * (len(a) - 1)
>>> idx
2.25

You need to decide how to round that to an int. Say you go with the nearest integer:

>>> idx = int(idx + 0.5)
>>> idx
2
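The rounding rule is a choice, and different rules can select different elements. A small sketch of three common options follows; the variable names are illustrative, and `n = 10` is assumed to match the example array.

```python
import math

n = 10                # length of the example array (assumed)
pos = 0.25 * (n - 1)  # fractional position in the sorted array

lower = math.floor(pos)   # always round down
upper = math.ceil(pos)    # always round up
nearest = int(pos + 0.5)  # round half up, the rule used in this answer
```

Here `lower` and `nearest` agree, while `upper` would pick the next element in sorted order.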

If you now call np.argpartition, this is what you get:

>>> np.argpartition(a, idx)
array([7, 5, 4, 3, 2, 1, 6, 0, 8, 9], dtype=int64)
>>> np.argpartition(a, idx)[idx]
4
>>> a[np.argpartition(a, idx)[idx]]
2

It is easy to check that these last two expressions are, respectively, the index and the value of the 0.25 quantile.
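The steps above can be collected into a small helper. This is a minimal sketch rather than a built-in: the function name `quantile_index` and its input checks are assumptions, and it uses the same nearest-integer rounding as the answer.

```python
import numpy as np

def quantile_index(a, q):
    """Return the index in the unsorted 1-D array `a` of the element
    that sits at quantile `q` (0 <= q <= 1) of the sorted data."""
    a = np.asarray(a)
    if a.ndim != 1 or a.size == 0:
        raise ValueError("a must be a non-empty 1-D array")
    if not 0 <= q <= 1:
        raise ValueError("q must be between 0 and 1")
    # Position in the sorted array, rounded to the nearest integer.
    k = int(q * (a.size - 1) + 0.5)
    # argpartition places the index of the k-th smallest element at position k.
    return int(np.argpartition(a, k)[k])

a = np.array([5, 6, 4, 9, 2, 1, 3, 0, 7, 8])
i = quantile_index(a, 0.25)

# Cross-check against a full sort: position 2 of the sorted array.
assert a[i] == np.sort(a)[2]
```

Because np.argpartition only partially orders the data, this avoids a full sort. When the array contains repeated values, which of the tied indices is returned is not specified.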
