Pandas Groupby: Count and mean combined

Question

I'm working with pandas and trying to summarise a dataframe as a count of certain categories, together with the mean sentiment score for each of those categories.

There is a table full of strings that have different sentiment scores, and I want to group the rows by text source, showing how many posts each source has as well as the average sentiment of those posts.

My (simplified) dataframe looks like this:

source    text              sent
--------------------------------
bar       some string       0.13
foo       alt string        -0.8
bar       another str       0.7
foo       some text         -0.2
foo       more text         -0.5

The output should look like this:

source    count     mean_sent
-----------------------------
foo       3         -0.5
bar       2         0.415

The answer is somewhere along the lines of:

df['sent'].groupby(df['source']).mean()

Yet this only gives each source and its mean, with no column headers.

Thanks in advance!

Recommended answer

You can use groupby with aggregate:

df = df.groupby('source') \
       .agg({'text':'size', 'sent':'mean'}) \
       .rename(columns={'text':'count','sent':'mean_sent'}) \
       .reset_index()
print(df)
  source  count  mean_sent
0    bar      2      0.415
1    foo      3     -0.500
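
As a possible alternative, the same summary can be sketched with named aggregation, which sets the output column names directly and so avoids the separate rename step. This assumes pandas 0.25 or later; the sample data is copied from the question, and the variable name summary is only illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "source": ["bar", "foo", "bar", "foo", "foo"],
    "text": ["some string", "alt string", "another str", "some text", "more text"],
    "sent": [0.13, -0.8, 0.7, -0.2, -0.5],
})

# Named aggregation: each keyword is an output column name, and each value
# is an (input column, aggregation function) pair.
summary = (
    df.groupby("source")
      .agg(count=("text", "size"), mean_sent=("sent", "mean"))
      .reset_index()  # turn the 'source' index back into a regular column
)
print(summary)
```

Here 'size' counts every row in the group, whereas 'count' would skip rows where the chosen column is missing. To list the source with the most posts first, as in the desired output of the question, summary.sort_values('count', ascending=False) could be appended.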
