How to loop over a grouped pandas DataFrame?


Question

DataFrame:

  c_os_family_ss c_os_major_is l_customer_id_i
0      Windows 7                         90418
1      Windows 7                         90418
2      Windows 7                         90418

Code:

print df
for name, group in df.groupby('l_customer_id_i').agg(lambda x: ','.join(x)):
    print name
    print group

I'm trying to loop over the aggregated data, but I get this error:

ValueError: too many values to unpack

@EdChum, here's the expected output:

                                                    c_os_family_ss  \
l_customer_id_i
131572           Windows 7,Windows 7,Windows 7,Windows 7,Window...
135467           Windows 7,Windows 7,Windows 7,Windows 7,Window...

                                                     c_os_major_is
l_customer_id_i
131572           ,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,...
135467           ,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,...

The output is not the problem; I want to loop over every group.

Recommended answer

df.groupby('l_customer_id_i').agg(lambda x: ','.join(x)) already returns a DataFrame, so you can no longer loop over the groups.
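A minimal sketch of why the loop in the question fails, using hypothetical sample data modelled on the question's columns: iterating over a DataFrame yields its column labels, and a label such as 'c_os_family_ss' is a string that cannot be unpacked into name and group.

```python
import pandas as pd

# Hypothetical sample data modelled on the question's columns.
df = pd.DataFrame({
    "c_os_family_ss": ["Windows 7", "Windows 7", "Windows 7"],
    "c_os_major_is": ["", "", ""],
    "l_customer_id_i": [90418, 90418, 90418],
})

aggregated = df.groupby("l_customer_id_i").agg(lambda x: ",".join(x))
print(type(aggregated))  # a DataFrame, not a GroupBy object

# Iterating over a DataFrame yields its column labels, not (name, group) pairs.
labels = list(aggregated)
print(labels)

error_message = None
try:
    for name, group in aggregated:  # each item is one column label (a string)
        pass
except ValueError as exc:
    error_message = str(exc)
    print(error_message)
```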

In general:

  • df.groupby(...) returns a GroupBy object (a DataFrameGroupBy or SeriesGroupBy), and you can iterate through the groups with it (as explained in the pandas groupby documentation). You can do something like:

grouped = df.groupby('A')

for name, group in grouped:
    ...

  • When you apply a function to the groupby, as in your example df.groupby(...).agg(...) (it could also be transform, apply, mean, ...), the results of applying the function to the different groups are combined into one DataFrame (the "apply" and "combine" steps of the "split-apply-combine" paradigm of groupby). So the result is again a DataFrame (or a Series, depending on the applied function), not something you can iterate over group by group.
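The two cases above can be sketched as follows. The sample data is hypothetical (customer IDs borrowed from the expected output in the question), and using iterrows() to walk the aggregated result is one possible approach, not the only one.

```python
import pandas as pd

# Hypothetical sample data; the values are made up for illustration.
df = pd.DataFrame({
    "c_os_family_ss": ["Windows 7", "Windows 7", "Windows 7"],
    "c_os_major_is": ["", "", ""],
    "l_customer_id_i": [131572, 131572, 135467],
})

grouped = df.groupby("l_customer_id_i")

# Split only: each iteration yields the group key and that group's sub-DataFrame.
joined = {}
for name, group in grouped:
    joined[int(name)] = ",".join(group["c_os_family_ss"])
    print(name, len(group), joined[int(name)])

# Split-apply-combine: the result is a single DataFrame indexed by the key,
# so its rows can be walked with iterrows() if the aggregated values are needed.
aggregated = grouped.agg(lambda x: ",".join(x))
rows = {}
for customer_id, row in aggregated.iterrows():
    rows[int(customer_id)] = row["c_os_family_ss"]
    print(customer_id, row["c_os_family_ss"])
```

Looping over the GroupBy object gives access to every original row of each group, while looping over the aggregated DataFrame only gives one already-combined row per key.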
