What is the most efficient way to find the position of the first np.nan value?

Views: 141 | Published: 2020/5/18 18:52:51 | Tags: python, numpy


Question

Consider the array a

a = np.array([3, 3, np.nan, 3, 3, np.nan])

I can do

np.isnan(a).argmax()

But this requires finding every np.nan just to locate the first one.
Is there a more efficient way?
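To make the baseline concrete, here is a minimal runnable sketch of the approach from the question. The variable names mask and first_nan are illustrative only. It shows that np.isnan builds a full-length boolean array before argmax ever runs, which is the cost being asked about.

```python
import numpy as np

a = np.array([3, 3, np.nan, 3, 3, np.nan])

# np.isnan tests every element and allocates a boolean array of the same length.
mask = np.isnan(a)

# argmax on a boolean array returns the index of the first True.
first_nan = mask.argmax()

print(mask.shape)   # same shape as a: the whole array was scanned
print(mask.sum())   # how many NaNs were found along the way
print(first_nan)    # position of the first NaN
```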

I've been trying to figure out whether I can pass a parameter to np.argpartition so that np.nan gets sorted first instead of last.

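EDIT regarding [dup].
There are several reasons this question is different.

That question and its answers address equality of values. This one is about isnan.
Those answers all suffer from the same problem my own approach has. Note that I provided a perfectly valid answer but highlighted its inefficiency; I'm looking to fix that inefficiency.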

EDIT regarding second [dup].

It still addresses equality, and the question and answers are old and very possibly outdated.

Recommended answer

I'll nominate

a.argmax()

Using @fuglede's test array:

In [1]: a = np.array([np.nan if i % 10000 == 9999 else 3 for i in range(100000)])
In [2]: np.isnan(a).argmax()
Out[2]: 9999
In [3]: np.argmax(a)
Out[3]: 9999
In [4]: a.argmax()
Out[4]: 9999

In [5]: timeit a.argmax()
The slowest run took 29.94 ....
10000 loops, best of 3: 20.3 µs per loop

In [6]: timeit np.isnan(a).argmax()
The slowest run took 7.82 ...
1000 loops, best of 3: 462 µs per loop
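For readers who want to rerun the comparison outside IPython, the following is a rough sketch using the standard-library timeit module. The repeat count of 1000 and the variable names are arbitrary choices, not taken from the session above, and absolute timings will vary with hardware and NumPy version.

```python
import timeit
import numpy as np

# Same test array as in the session above: a NaN at every 10000th position.
a = np.array([np.nan if i % 10000 == 9999 else 3 for i in range(100000)])

# Both approaches should agree on the position of the first NaN.
idx_plain = a.argmax()
idx_mask = np.isnan(a).argmax()

# Total seconds for 1000 calls of each approach.
t_plain = timeit.timeit(lambda: a.argmax(), number=1000)
t_mask = timeit.timeit(lambda: np.isnan(a).argmax(), number=1000)

print("a.argmax():           index", idx_plain, "time", t_plain)
print("np.isnan(a).argmax(): index", idx_mask, "time", t_mask)
```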

I don't have numba installed, so I can't compare against that. But my speedup relative to short is greater than @fuglede's 6x.

I'm testing in Py3, which accepts <np.nan, while Py2 raises a runtime warning. But a search of the code suggests the result doesn't depend on that comparison.

In /numpy/core/src/multiarray/calculation.c, PyArray_ArgMax plays with the axes (moving the one of interest to the end) and delegates the work to arg_func = PyArray_DESCR(ap)->f->argmax, a function that depends on the dtype.

In numpy/core/src/multiarray/arraytypes.c.src it looks like BOOL_argmax short-circuits, returning as soon as it encounters a True.

for (; i < n; i++) {
    if (ip[i]) {
        *max_ind = i;
        return 0;
    }
}

And @fname@_argmax also short-circuits on a maximal nan. np.nan is treated as 'maximal' in argmin as well.

#if @isfloat@
    if (@isnan@(mp)) {
        /* nan encountered; it's maximal */
        return 0;
    }
#endif
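The claim that a NaN wins in both directions is easy to check from Python. This is a small sketch on the array from the question; idx_max and idx_min are illustrative names.

```python
import numpy as np

a = np.array([3, 3, np.nan, 3, 3, np.nan])

# For float arrays, both argmax and argmin report the first NaN they meet.
idx_max = a.argmax()
idx_min = a.argmin()

print(idx_max, idx_min)
```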

Comments from experienced C coders are welcome, but it appears to me that, at least for np.nan, a plain argmax will be as fast as we can get.
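One caveat when using a plain argmax this way: if the array contains no NaN at all, argmax simply returns the index of the largest value, so the result needs a check. The helper below is a hypothetical sketch, not part of NumPy; the name first_nan_index and the -1 sentinel are assumptions, and it assumes a 1-D float array.

```python
import numpy as np

def first_nan_index(arr):
    """Hypothetical helper: index of the first NaN in a 1-D array, or -1 if none."""
    arr = np.asarray(arr, dtype=float)
    if arr.size == 0:
        # argmax raises ValueError on an empty array, so handle it up front.
        return -1
    idx = int(arr.argmax())
    # Without a NaN, argmax points at the ordinary maximum; verify before trusting it.
    return idx if np.isnan(arr[idx]) else -1

print(first_nan_index([3, 3, np.nan, 3, 3, np.nan]))
print(first_nan_index([1.0, 5.0, 2.0]))
```

The extra np.isnan call touches a single element, so it should add very little to the cost of the argmax itself.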

Playing with the 9999 used to generate a shows that the a.argmax time depends on that value, which is consistent with short-circuiting.
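A rough way to repeat that experiment is sketched below. The positions, the array length, and the repeat count are arbitrary choices for illustration. If argmax short-circuits, the time should grow with the position of the first NaN; on other NumPy versions or builds the pattern may look different, so treat the timings as something to observe rather than a guarantee.

```python
import timeit
import numpy as np

n = 100_000
results = {}

for pos in (10, 1_000, 50_000, 99_999):
    # Array of 3.0 with a single NaN at index pos.
    a = np.full(n, 3.0)
    a[pos] = np.nan

    idx = int(a.argmax())
    seconds = timeit.timeit(a.argmax, number=1000)  # total for 1000 calls
    results[pos] = (idx, seconds)
    print("NaN at", pos, "-> argmax", idx, "time", seconds)
```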
