Iterate over a subset of a Pandas groupby object

Problem description

I have a Pandas groupby object, and I would like to iterate over the first n groups. I've tried:

import pandas as pd
df = pd.DataFrame({'A':['a','a','a','b','b','c','c','c','c','d','d'],
                   'B':[1,2,3,4,5,6,7,8,9,10,11]})

df_grouped = df.groupby('A')
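n = 2  # for instance

# Attempt 1: count the groups manually and stop after the first n
i = 0
for name, group in df_grouped:
    if i == n:  # check before processing so exactly n groups are handled
        break
    # DO SOMETHING
    i += 1

# Attempt 2: take the first n group keys and fetch each group by name
group_list = list(df_grouped.groups.keys())[:n]
for name in group_list:
    group = df_grouped.get_group(name)
    # DO SOMETHING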

but I wondered if there was a more elegant/pythonic way to do it?

My actual groupby has 1000s of groups within it, and I'd like to only perform an operation on a subset, just to get an impression of the data as a whole.

Recommended answer

You can filter the original df down to the rows that belong to the first n groups, then do whatever else you need on the result. With n = 2, that means keeping group numbers 0 and 1:

yourdf=df[df.groupby('A').ngroup()<=1]
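
To make the one-liner above easier to adapt, here is a minimal sketch that spells out the intermediate step and generalizes `<= 1` to `< n`. The names `group_number` and `subset` are illustrative and not part of the original answer; the DataFrame is the one from the question.

```python
import pandas as pd

df = pd.DataFrame({'A': ['a', 'a', 'a', 'b', 'b', 'c', 'c', 'c', 'c', 'd', 'd'],
                   'B': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]})

n = 2  # number of groups to keep (illustrative value)

# ngroup() labels every row with the number of its group, starting at 0,
# in the same order the groups would be visited when iterating.
group_number = df.groupby('A').ngroup()

# Keep only the rows of the first n groups; `< n` generalizes `<= 1`.
subset = df[group_number < n]

# The filtered frame can be grouped and processed like any other DataFrame.
for name, group in subset.groupby('A'):
    print(name, group['B'].sum())
# prints:
# a 6
# b 9
```

Filtering first means every later step (aggregation, plotting, iteration) only sees the sampled groups, which suits a quick look at a large dataset.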


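Alternatively, pd.factorize numbers the groups by order of first appearance rather than by sorted key (the two agree here because column A is already sorted):

yourdf=df[pd.factorize(df.A)[0]<=1]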
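
If the goal is only to loop over the first n groups without building a filtered frame, one possible alternative that is not part of the answer above is `itertools.islice` from the standard library. This is a sketch under the assumption that iterating the groupby object directly is acceptable; it reuses the question's DataFrame.

```python
from itertools import islice

import pandas as pd

df = pd.DataFrame({'A': ['a', 'a', 'a', 'b', 'b', 'c', 'c', 'c', 'c', 'd', 'd'],
                   'B': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]})

n = 2  # number of groups to visit (illustrative value)
df_grouped = df.groupby('A')

# A groupby object yields (name, group) pairs when iterated;
# islice stops after the first n pairs, so no manual counter is needed.
for name, group in islice(df_grouped, n):
    print(name, len(group))
# prints:
# a 3
# b 2
```

This keeps the loop body identical to the question's first attempt while removing the counter and the `break`.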
