min() operation on nested groupby in pandas
I am just getting to know pandas and I can't get over a conceptual problem. My dataframe is as follows:
import pandas as pd

df = pd.DataFrame({'ANIMAL': [1, 1, 1, 1, 1, 2, 2, 2],
                   'AGE_D': [3, 6, 47, 377, 698, 1, 9, 241],
                   'AGE_Y': [1, 1, 1, 2, 2, 1, 1, 1]})
I would like to group by ANIMAL and, within each animal, by AGE_Y, and then select the row with the minimum AGE_D in each subgroup. The desired output would then be:
ANIMAL AGE_Y AGE_D
1 1 3
1 2 377
2 1 1
I can do this without nesting within animal, e.g. if df2 is the subset for ANIMAL=1, then:
df2.loc[df2.groupby('AGE_Y')['AGE_D'].idxmin()]
But everything I tried for nesting the animal in the groupby was unsuccessful. I am guessing that my order of operations is wrong... How should I go about this?
I think you need to pass a list of columns to groupby, so that it groups by both ANIMAL and AGE_Y:
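# Group the full frame (not the ANIMAL=1 subset) by both keys;
# idxmin() returns the index label of the smallest AGE_D in each group
df = df.loc[df.groupby(['ANIMAL', 'AGE_Y'])['AGE_D'].idxmin()]
# Reorder the columns to match the desired output
df = df[['ANIMAL', 'AGE_Y', 'AGE_D']]
print(df)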
ANIMAL AGE_Y AGE_D
0 1 1 3
3 1 2 377
5 2 1 1
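If only the minimum value per group is needed, rather than the whole original row, a shorter variant may be enough. The sketch below is an alternative to the idxmin() approach above, not part of the original answer; it rebuilds the question's frame and uses the illustrative name `result`. It takes the minimum directly and passes as_index=False so the grouping keys stay as ordinary columns.

```python
import pandas as pd

df = pd.DataFrame({'ANIMAL': [1, 1, 1, 1, 1, 2, 2, 2],
                   'AGE_D': [3, 6, 47, 377, 698, 1, 9, 241],
                   'AGE_Y': [1, 1, 1, 2, 2, 1, 1, 1]})

# Minimum AGE_D for each (ANIMAL, AGE_Y) pair.
# as_index=False keeps ANIMAL and AGE_Y as columns instead of a MultiIndex.
result = df.groupby(['ANIMAL', 'AGE_Y'], as_index=False)['AGE_D'].min()
print(result)
```

The difference matters once the frame has more columns: idxmin() with .loc keeps every column of the row holding the minimum, while min() on a single column returns only the keys and that column, with a fresh 0..n-1 index.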