pandas - find first occurrence

Question

Suppose I have a structured dataframe as follows:

import pandas as pd

df = pd.DataFrame({"A": ['a', 'a', 'a', 'b', 'b'],
                   "B": [1] * 5})

The A column has previously been sorted. I wish to find the index of the first row where df.A != 'a'. The end goal is to use this index to break the data frame into groups based on A.

Now I realise that there is a groupby functionality. However, the dataframe is quite large and this is a simplified toy example. Since A has already been sorted, it would be faster to just find the first index where df.A != 'a'. It is therefore important that whatever method is used stops scanning once the first such element is found.

Recommended answer

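idxmax and argmax return the location of the maximal value, or the first such location if the maximum occurs more than once. Applied to a boolean mask, that is the first True: idxmax gives its index label and argmax its integer position.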

df.A.ne('a')

df.A.ne('a').idxmax()

3

numpy equivalent

(df.A.values != 'a').argmax()

3
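
The two calls above can be compared side by side in a minimal sketch. The names mask, all_a, no_match and safe_pos are illustrative and not part of the original answer. The sketch also shows a caveat: when the mask contains no True at all, both calls still return the first location, so a guard such as mask.any() is worth adding.

```python
import pandas as pd

df = pd.DataFrame({"A": ['a', 'a', 'a', 'b', 'b'],
                   "B": [1] * 5})

# Boolean mask, computed over the entire column (no early exit here).
mask = df.A.ne('a')

# idxmax gives the index label of the first True, argmax its integer position.
first_label = mask.idxmax()
first_pos = int(mask.to_numpy().argmax())

# Caveat: if no row differs from 'a', the mask is all False and argmax
# still reports location 0, so check any() before trusting the result.
all_a = pd.Series(['a', 'a', 'a'])   # hypothetical column with only 'a'
no_match = all_a.ne('a')
safe_pos = int(no_match.to_numpy().argmax()) if no_match.any() else None
```

With the default RangeIndex used here, the label and the position coincide; on a frame with a different index they would not.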


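However, if A has already been sorted, then we can use searchsorted, which performs a binary search instead of scanning the column. With side='right' it returns the position just past the last 'a':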

df.A.searchsorted('a', side='right')

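array([3])

Older pandas versions return a one-element array as shown. From pandas 0.24 onward, a scalar argument yields the plain integer 3.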

numpy equivalent

df.A.values.searchsorted('a', side='right')

3
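
As a rough sketch of the stated end goal, the position returned by searchsorted can be used to split the frame with iloc. This assumes A is sorted in ascending order, as in the question. The names values, pos, group_a and rest are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({"A": ['a', 'a', 'a', 'b', 'b'],
                   "B": [1] * 5})

# Work on the underlying numpy array; A must already be sorted ascending.
values = df.A.to_numpy()

# Binary search for the position just past the last 'a'.
pos = int(values.searchsorted('a', side='right'))

# Positional slices on either side of the boundary.
group_a = df.iloc[:pos]
rest = df.iloc[pos:]
```

Because the boundary is a position rather than a label, iloc is used for slicing, which keeps the split correct even when the index is not a default RangeIndex.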
