Pandas groupby on a column of lists

Problem description

I have a pandas dataframe with a column that contains lists:

import pandas as pd

df = pd.DataFrame({'List': [['once', 'upon'], ['once', 'upon'], ['a', 'time'], ['there', 'was'], ['a', 'time']], 'Count': [2, 3, 4, 1, 2]})

Count   List
2    [once, upon]
3    [once, upon]
4    [a, time]
1    [there, was]
2    [a, time]

How can I group by the List column and sum the Count column? The expected result is:

Count   List
5     [once, upon]
6     [a, time]
1     [there, was]

I tried:

df.groupby('List')['Count'].sum()

which raises:

TypeError: unhashable type: 'list'

Recommended answer

One way is to convert the lists to tuples first. This works because pandas groupby requires its keys to be hashable. Tuples are immutable and hashable (as long as their elements are), but lists are not.

res = df.groupby(df['List'].map(tuple))['Count'].sum()

Result:

List
(a, time)       6
(once, upon)    5
(there, was)    1
Name: Count, dtype: int64
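
The hashability difference described above can be checked directly in plain Python, without pandas. This is only a small illustrative sketch; the variable names are made up for the example:

```python
# A tuple of strings can be hashed, so it can serve as a grouping key.
key_tuple = ('once', 'upon')
tuple_hash_ok = isinstance(hash(key_tuple), int)

# A list cannot be hashed; this is the same error groupby surfaced.
key_list = ['once', 'upon']
try:
    hash(key_list)
    list_hash_ok = True
    error_message = ''
except TypeError as exc:
    list_hash_ok = False
    error_message = str(exc)

print(tuple_hash_ok, list_hash_ok, error_message)
```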

If you need the result as lists in a dataframe, you can convert back:

# reset_index() turns the tuple index back into a regular 'List' column
res = df.groupby(df['List'].map(tuple))['Count'].sum().reset_index()
res['List'] = res['List'].map(list)

#            List  Count
# 0     [a, time]      6
# 1  [once, upon]      5
# 2  [there, was]      1
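
Note that the output above is sorted by key, whereas the expected result in the question lists groups in order of first appearance. A possible variation, assuming that ordering matters, is to pass sort=False to groupby. The sketch below rebuilds the sample dataframe so it runs on its own:

```python
import pandas as pd

df = pd.DataFrame({
    'List': [['once', 'upon'], ['once', 'upon'], ['a', 'time'],
             ['there', 'was'], ['a', 'time']],
    'Count': [2, 3, 4, 1, 2],
})

# sort=False keeps groups in order of first appearance
# instead of sorting them by key.
res = (
    df.groupby(df['List'].map(tuple), sort=False)['Count']
      .sum()
      .reset_index()
)

# Convert the tuple keys back to lists.
res['List'] = res['List'].map(list)
print(res)
```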
