Python pandas equivalent to R groupby mutate
Question
In R, when I have a data frame with, say, 4 columns (call it df) and I want to compute each row's ratio to the sum product of its group, I can do it like this:
# generate data
df = data.frame(a=c(1,1,0,1,0),b=c(1,0,0,1,0),c=c(10,5,1,5,10),d=c(3,1,2,1,2));
| a b c d |
| 1 1 10 3 |
| 1 0 5 1 |
| 0 0 1 2 |
| 1 1 5 1 |
| 0 0 10 2 |
# compute sum product ratio within each (a, b) group
library(dplyr)
df = df %>% group_by(a, b) %>%
  mutate(
    ratio = c / sum(c * d)
  );
| a b c d ratio |
| 1 1 10 3 0.286 |
| 1 1 5 1 0.143 |
| 1 0 5 1 1 |
| 0 0 1 2 0.045 |
| 0 0 10 2 0.455 |
But in Python I have had to resort to loops. I know there should be a more elegant way than raw loops in Python. Does anyone have any ideas?
Recommended answer
It can be done with similar syntax using groupby() and apply():
df['ratio'] = df.groupby(['a','b'], group_keys=False).apply(lambda g: g.c/(g.c * g.d).sum())
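For anyone who wants to try this end to end, here is a minimal self-contained sketch. It rebuilds the data frame from the question and computes the same ratio with transform("sum") instead of apply(). This is an alternative I am suggesting, not part of the answer above, and the variable name group_total is only illustrative.

```python
import pandas as pd

# Same data as the R example in the question.
df = pd.DataFrame({
    "a": [1, 1, 0, 1, 0],
    "b": [1, 0, 0, 1, 0],
    "c": [10, 5, 1, 5, 10],
    "d": [3, 1, 2, 1, 2],
})

# Row-wise product c*d, summed within each (a, b) group.
# transform("sum") broadcasts each group's total back onto its rows,
# so the result has the same index as df.
# (group_total is an illustrative name, not from the original answer.)
group_total = (df["c"] * df["d"]).groupby([df["a"], df["b"]]).transform("sum")

# Equivalent of mutate(ratio = c / sum(c * d)) after group_by(a, b).
df["ratio"] = df["c"] / group_total

print(df.round(3))
```

Because transform() returns a result aligned with the original index, the row order of df is unchanged and the new column can be assigned directly.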