Group by and then add a column for ratio based on condition


Problem description

Say my data frame in R looks like the one below. Sex is male or female. FamilySize is the number of family members with the same surname. Surname is the family name.

Sex         FamilySize  Surname  
male        1           Abbing  
female      3           Abbott  
male        3           Abbott  
male        3           Abbott  
male        1           Abelseth  
female      1           Abelseth  
male        2           Abelson  
female      2           Abelson  
male        1           Abrahamsson  
female      1           Abrahim 

I want to add a new column, FemaleToFamilySizeRatio (shown as Ratio below), that gives the ratio of the number of females to the family size in each family. The result would look like this:

Sex         FamilySize  Surname     Ratio  
male        1           Abbing      0  
female      3           Abbott      0.33  
male        3           Abbott      0.33  
male        3           Abbott      0.33  
male        1           Abelseth    0.5  
female      1           Abelseth    0.5  
male        2           Abelson     0.5  
female      2           Abelson     0.5  
male        1           Abrahamsson 0  
female      1           Abrahim     0  

I have played around with table and aggregate, and the most promising option so far is ddply. I have reached a point where some direction would be helpful, because if I keep going my code will only get long and ugly.

Recommended answer

You can do that using data.table, grouping by Surname and adding the column by reference with `:=`:

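library(data.table)
# table_input is the data frame shown in the question
table_family <- as.data.table(table_input)
# Per surname: number of females divided by that family's recorded FamilySize
table_family[, Ratio := sum(Sex == "female") / FamilySize[1], by = Surname]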
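To try the answer on the sample rows from the question, here is a minimal self-contained sketch. The data frame is typed in by hand from the table above. The `RatioByRows` column is an assumption added for comparison and is not part of the answer: it divides by the number of rows in each surname group instead of by the recorded FamilySize.

```r
library(data.table)

# Sample rows copied by hand from the question
table_input <- data.frame(
  Sex = c("male", "female", "male", "male", "male",
          "female", "male", "female", "male", "female"),
  FamilySize = c(1, 3, 3, 3, 1, 1, 2, 2, 1, 1),
  Surname = c("Abbing", "Abbott", "Abbott", "Abbott", "Abelseth",
              "Abelseth", "Abelson", "Abelson", "Abrahamsson", "Abrahim"),
  stringsAsFactors = FALSE
)

table_family <- as.data.table(table_input)

# The answer's formula: females per surname divided by the recorded FamilySize
table_family[, Ratio := sum(Sex == "female") / FamilySize[1], by = Surname]

# Hypothetical variant (not from the answer): share of females among the
# rows that belong to each surname group
table_family[, RatioByRows := mean(Sex == "female"), by = Surname]

print(table_family)
```

On these rows the two columns agree for Abbing, Abbott, Abelson, Abrahamsson and Abrahim, but differ for Abelseth: the sample lists two Abelseth rows with FamilySize 1, so the answer's formula gives 1 while the row-based variant gives 0.5, which is the value in the question's expected output. Both formulas give 1 for Abrahim, where the expected table shows 0. Which denominator is right depends on how FamilySize was built in the real data.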
