How to group by and divide the count of one column by the count of unique values of a second column in a pandas DataFrame?
Problem description
I have a pandas DataFrame with four columns, say 'col1', 'col2', 'col3' and 'col4'. I want to group by col1 and col2 and compute the following aggregate:
Count(col3)/(Count(unique col4)) As result_col
How do I do this? I am using MySQL with pandas.
I have tried many approaches from the internet but have not found an exact solution, which is why I am posting here. Please give a reason for any downvote so I can improve my question.
Recommended answer
It seems you need to aggregate by size and nunique, and then div the output columns:
import pandas as pd

# Sample data: two rows in group (1, 4) and one row in group (1, 6)
df = pd.DataFrame({'col1': [1, 1, 1],
                   'col2': [4, 4, 6],
                   'col3': [7, 7, 9],
                   'col4': [3, 3, 5]})
print(df)
   col1  col2  col3  col4
0     1     4     7     3
1     1     4     7     3
2     1     6     9     5
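# Per (col1, col2) group: row count for col3, number of distinct values for col4
df1 = df.groupby(['col1', 'col2']).agg({'col3': 'size', 'col4': 'nunique'})
# Divide the row count by the distinct count
df1['result_col'] = df1['col3'].div(df1['col4'])
print(df1)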
           col4  col3  result_col
col1 col2
1    4        1     2         2.0
     6        1     1         1.0
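One caveat that may matter when porting the SQL expression: COUNT(col3) in SQL skips NULLs, whereas 'size' counts every row in the group, so 'count' may be closer to the intended semantics if col3 can contain missing values. The following is a minimal sketch, assuming pandas 0.25 or later for named aggregation; the extra NaN row and the names n_col3 and n_unique_col4 are illustrative and not part of the original question.

```python
import numpy as np
import pandas as pd

# Illustrative data (an assumption): group (1, 4) has one missing col3 value
df = pd.DataFrame({
    "col1": [1, 1, 1, 1],
    "col2": [4, 4, 4, 6],
    "col3": [7.0, 7.0, np.nan, 9.0],
    "col4": [3, 3, 8, 5],
})

grouped = df.groupby(["col1", "col2"])

# 'count' ignores NaN (like SQL COUNT(col3)); 'nunique' counts distinct non-null values
result = grouped.agg(
    n_col3=("col3", "count"),
    n_unique_col4=("col4", "nunique"),
)
result["result_col"] = result["n_col3"].div(result["n_unique_col4"])
result = result.reset_index()  # turn the group keys back into ordinary columns

# For comparison: 'size' counts all rows in each group, including the NaN one
size_based = grouped["col3"].size()

print(result)
print(size_based)
```

With this data, group (1, 4) has three rows but only two non-null col3 values, so the 'count' version divides 2 by 2 while a 'size' version would divide 3 by 2. Which one is right depends on whether rows with a missing col3 should contribute to the numerator.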