Proportion of factors and dummies


Problem description

I have a data set full of factors and dummy variables, and I want to see the proportion of each value within each group after dplyr::group_by(cyl).

    data(mtcars); rownames(mtcars) <- NULL
    df <- mtcars[, c(2, 8, 9)]   ## keep cyl, vs, am
    head(df)
     cyl vs am
    1   6  0  1
    2   6  0  1
    3   4  1  1
    4   6  1  0
    5   8  0  0
    6   6  1  0

Expected answer, based on the six rows shown: the four cyl 6 rows have vs equal to 1 twice and 0 twice, so for the vs column I would get:

          1     0
    6   50%   50%
    4  100%    0%
    8    0%  100%

The same applies to the am column.

Recommended answer

Here's a first attempt:

library(dplyr)
library(tidyr)

(df 
    %>% pivot_longer(-cyl)       ## spread out variables (vs, am)
    %>% group_by(cyl,name)   
    %>% mutate(n=n())            ## obs per cyl/var combo
    %>% group_by(cyl,name,value) 
    %>% summarise(prop=n()/n)    ## proportion of 0/1 per cyl/var  
    %>% unique()                 ## not sure why I need this?
    %>% pivot_wider(id_cols=c(cyl,name),names_from=value,values_from=prop)
)

Result:

    cyl name     `0`   `1`
  <dbl> <chr>  <dbl> <dbl>
1     4 am    0.273  0.727
2     4 vs    0.0909 0.909
3     6 am    0.571  0.429
...
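
On the `unique()` question in the code above: `n` is a per-row column there, so `summarise(prop=n()/n)` returns one value per observation rather than one per group, and `unique()` collapses the resulting duplicate rows. A possible variant that avoids this is sketched below. It is not part of the original answer. It assumes the full `mtcars` data, and the names `props` and `props_pct` are made up for illustration. It counts each cyl/variable/value combination once, divides by the group total, and optionally formats the result as percentages, as in the expected output.

```r
library(dplyr)
library(tidyr)

## Same three columns as in the question, taken from the full mtcars data
df <- mtcars[, c("cyl", "vs", "am")]

props <- df %>%
  pivot_longer(-cyl) %>%               ## long format: one row per cyl/variable/value
  count(cyl, name, value) %>%          ## n = observations per cyl/variable/value
  group_by(cyl, name) %>%
  mutate(prop = n / sum(n)) %>%        ## share of each value within cyl/variable
  ungroup() %>%
  select(-n) %>%
  pivot_wider(names_from = value, values_from = prop,
              values_fill = 0)         ## a value that never occurs gets 0, not NA

## Optional: whole-number percentage strings (hypothetical helper column format)
props_pct <- props %>%
  mutate(across(c(`0`, `1`), ~ sprintf("%.0f%%", 100 * .x)))

print(props)
print(props_pct)
```

Because each combination is counted exactly once, no `unique()` step is needed, and `values_fill = 0` covers groups such as cyl 8, where `vs` is never 1.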
