How to group by and divide the count of one column by the unique count of a second column in a pandas DataFrame?

Question

I have a pandas DataFrame with four columns, say 'col1', 'col2', 'col3' and 'col4'. I want to group by col1 and col2 and compute the following aggregate:

Count(col3)/(Count(unique col4)) As result_col

How do I do this? I am using MySQL with pandas.

I have tried many approaches from the internet but have not found an exact solution, which is why I am posting here. If you downvote, please give a reason so that I can improve my question.

Recommended answer

It seems you need to aggregate with size and nunique, and then divide the resulting columns with div:

import pandas as pd

# Sample data: two rows fall in group (1, 4), one row in group (1, 6)
df = pd.DataFrame({'col1':[1,1,1],
                   'col2':[4,4,6],
                   'col3':[7,7,9],
                   'col4':[3,3,5]})

print(df)
   col1  col2  col3  col4
0     1     4     7     3
1     1     4     7     3
2     1     6     9     5

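# Per group: number of rows (reported under col3) and number of distinct col4 values
df1 = df.groupby(['col1','col2']).agg({'col3':'size','col4':'nunique'})
# Divide the row count by the distinct count to get the requested ratio
df1['result_col'] = df1['col3'].div(df1['col4'])
print(df1)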
           col4  col3  result_col
col1 col2                        
1    4        1     2         2.0
     6        1     1         1.0
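
One detail worth checking against the SQL expression in the question: 'size' counts every row in a group, while SQL's COUNT(col3) skips NULLs, which corresponds to pandas' 'count'. The two agree on the sample data above because it has no missing values. The sketch below is an illustration only, not part of the original answer. It assumes a modified copy of the sample frame with one NaN placed in col3, and it uses named aggregation (available in pandas 0.25 and later) with the hypothetical output names n_rows, n_col3 and n_unique_col4.

```python
import numpy as np
import pandas as pd

# Assumption: same sample frame as above, but with one missing col3 value
# added purely to show where 'size' and 'count' diverge.
df = pd.DataFrame({'col1': [1, 1, 1],
                   'col2': [4, 4, 6],
                   'col3': [7, np.nan, 9],
                   'col4': [3, 3, 5]})

out = (
    df.groupby(['col1', 'col2'])
      .agg(n_rows=('col3', 'size'),            # every row in the group, NaN included
           n_col3=('col3', 'count'),           # non-null col3 values only, like COUNT(col3)
           n_unique_col4=('col4', 'nunique'))  # distinct non-null col4 values
)

# Ratio built from the NULL-skipping count
out['result_col'] = out['n_col3'].div(out['n_unique_col4'])
print(out)
```

For the (1, 4) group, n_rows is 2 while n_col3 is 1, so the choice between 'size' and 'count' changes result_col whenever col3 can be missing.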
