Proportion of factors and dummies
Problem description
I have a data set full of factors and dummy variables, and I want to see the proportion of each value within each group after dplyr::group_by(cyl).
data(mtcars); rownames(mtcars) <- NULL  # drop the car names used as row names
df <- mtcars[,c(2,8,9)]  # keep columns cyl, vs, am
head(df)
cyl vs am
1 6 0 1
2 6 0 1
3 4 1 1
4 6 1 0
5 8 0 0
6 6 1 0
Expected answer
In the rows shown above, cyl is 6 four times (6 6 6 6). In the vs column, two of those rows are 1 and two are 0, so I expect:
cyl    1    0
  6  50%  50%
  4 100%   0%
  8   0% 100%
The same applies to the am column.
Recommended answer
Here is a first attempt:
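library(dplyr)
library(tidyr)
(df
%>% pivot_longer(-cyl) ## stack the variables (vs, am) into name/value columns
%>% group_by(cyl,name)
%>% mutate(n=n()) ## obs per cyl/var combo
%>% group_by(cyl,name,value)
%>% summarise(prop=n()/first(n), .groups="drop") ## proportion of 0/1 per cyl/var
## first(n) keeps one row per group; dividing by the whole n column returned
## one row per observation, which is why unique() was needed before
%>% pivot_wider(id_cols=c(cyl,name),names_from=value,values_from=prop)
)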
Result:
cyl name `0` `1`
<dbl> <chr> <dbl> <dbl>
1 4 am 0.273 0.727
2 4 vs 0.0909 0.909
3 6 am 0.571 0.429
...
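Because vs and am are 0/1 dummies, the share of each value within a cyl group can also be computed directly as a mean of a logical comparison, without reshaping. The following is only a sketch of that alternative, not the answerer's method; the suffixes p1 and p0 and the object name props are made up for illustration, and it assumes dplyr 1.0.0 or later for across().

```r
library(dplyr)

# Same three columns as in the question, selected by name
df <- mtcars[, c("cyl", "vs", "am")]
rownames(df) <- NULL

props <- df %>%
  group_by(cyl) %>%
  summarise(
    # For each dummy column: share of 1s and share of 0s within the cyl group.
    # p1 / p0 are hypothetical suffixes; columns come out as vs_p1, vs_p0, ...
    across(c(vs, am), list(p1 = ~ mean(.x == 1), p0 = ~ mean(.x == 0))),
    .groups = "drop"
  )

print(props)
```

This gives one row per cyl value with columns vs_p1, vs_p0, am_p1 and am_p0. It only suits columns that really are coded 0/1; for factors with more levels, the long-format approach above generalizes more naturally.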