Python pandas: select 2nd smallest value in groupby
Question
I have an example DataFrame like the following:
import pandas as pd
import numpy as np
# np.array is needed here; a bare `array` is undefined with these imports
df = pd.DataFrame({
    'ID': [1, 2, 2, 2, 3, 3],
    'date': np.array(['2000-01-01', '2002-01-01', '2010-01-01', '2003-01-01', '2004-01-01', '2008-01-01'], dtype='datetime64[D]'),
})
I am trying to get the 2nd earliest day in each ID group, so I wrote the following function:
def f(x):
    if len(x)==1:
        return x[0]
    else:
        x.sort()
        return x[1]
Then I wrote:
df.groupby('ID').date.apply(lambda x:f(x))
But this results in an error.
Could you find a way to make this work?
Recommended answer
This requires pandas 0.14.1 or later. It is quite efficient, especially if you have large groups, because it doesn't require fully sorting them.
In [32]: df.groupby('ID')['date'].nsmallest(2)
Out[32]:
ID
1   0   2000-01-01
2   1   2002-01-01
    3   2003-01-01
3   4   2004-01-01
    5   2008-01-01
dtype: datetime64[ns]
In [33]: df.groupby('ID')['date'].nsmallest(2).groupby(level='ID').last()
Out[33]:
ID
1 2000-01-01
2 2003-01-01
3 2008-01-01
dtype: datetime64[ns]
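For comparison, here is a minimal sketch of how the original apply-based approach could be made to work, shown next to the nsmallest approach above. One likely reason the original function fails is that x[0] and x[1] look up index labels rather than positions, and most groups have no row labelled 0 or 1. The helper name second_earliest is hypothetical, and the sketch assumes a pandas version that provides Series.sort_values.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'ID': [1, 2, 2, 2, 3, 3],
    'date': np.array(['2000-01-01', '2002-01-01', '2010-01-01',
                      '2003-01-01', '2004-01-01', '2008-01-01'],
                     dtype='datetime64[D]'),
})

def second_earliest(s):
    # Hypothetical helper: sort a copy of the group, then index by
    # position with .iloc instead of by label.
    ordered = s.sort_values()
    # Fall back to the only value when the group has a single row.
    return ordered.iloc[min(1, len(ordered) - 1)]

# Apply-based version, close to the original attempt.
via_apply = df.groupby('ID')['date'].apply(second_earliest)

# Version from the answer: keep the 2 smallest per group, take the last.
via_nsmallest = (
    df.groupby('ID')['date']
      .nsmallest(2)
      .groupby(level='ID')
      .last()
)

print(via_apply)
print(via_nsmallest)
```

Both versions should give 2000-01-01, 2003-01-01 and 2008-01-01 for IDs 1, 2 and 3, matching Out[33] above. The apply version sorts each group in full, so the nsmallest version is probably the better choice for large groups.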