dplyr: put count occurrences into new variable

Views: 78 | Published: 2017/7/13 20:14:26 | Tags: r, dplyr

Problem description

I would like a hand with some dplyr code that I cannot figure out. I have seen a similar issue described here for many variables ("summarizing counts of a factor with dplyr" and "Putting rowwise counts of value occurences into new variables, how to do that in R with dplyr?"), but my task is somewhat smaller.
Given a data frame, how do I count the frequency of a variable and place that count in a new variable?

set.seed(9)
df <- data.frame(
    group=c(rep(1,5), rep(2,5)),
    var1=round(runif(10,1,3),0))

Then we have:

>df
   group var1
1      1    1
2      1    1
3      1    1
4      1    1
5      1    2
6      2    1
7      2    2
8      2    2
9      2    2
10     2    3

I would like a third column indicating, per group (group), how many times each value of var1 occurs. In this example that would be count = (4,4,4,4,1,1,3,3,3,1). I tried, without success, things like:

df %>%  group_by(group) %>% rowwise() %>% do(count = nrow(.$var1))

Explanations are much appreciated!

Solution

All you need to do is group your data by both columns, "group" and "var1":

df %>% group_by(group, var1) %>% mutate(count = n())
#Source: local data frame [10 x 3]
#Groups: group, var1
#
#   group var1 count
#1      1    1     4
#2      1    1     4
#3      1    1     4
#4      1    1     4
#5      1    2     1
#6      2    1     1
#7      2    2     3
#8      2    2     3
#9      2    2     3
#10     2    3     1
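
As a quick check of the grouped n() approach, here is a minimal sketch. It assumes the example data can be typed in from the printed df rather than regenerated with set.seed(), so it does not depend on the random number generator. It also compares the result with dplyr's add_count() shortcut, which should give the same column in reasonably recent dplyr versions. The names counted and counted2 are only for illustration.

```r
library(dplyr)

# Example data, typed in from the data frame printed in the question
df <- data.frame(
  group = c(rep(1, 5), rep(2, 5)),
  var1  = c(1, 1, 1, 1, 2, 1, 2, 2, 2, 3)
)

# Group by both columns; n() then returns the size of each (group, var1) group
counted <- df %>%
  group_by(group, var1) %>%
  mutate(count = n()) %>%
  ungroup()   # drop the grouping so later steps are not affected by it

# add_count() performs the group / count steps in a single call
counted2 <- df %>% add_count(group, var1, name = "count")

counted$count
# 4 4 4 4 1 1 3 3 3 1
```

Calling ungroup() at the end is optional here, but it avoids surprises if further mutate() or summarise() calls follow.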

Edit after comment

Here's an example of how you SHOULD NOT DO IT:

df %>% group_by(group, var1) %>% do(data.frame(., count = length(.$group)))

The dplyr implementation with n() is certainly faster, cleaner, and shorter, and should be preferred over implementations like the one above.
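
For anyone who wants to check the speed claim on their own machine, here is a rough sketch. The data frame big and its sizes are made up for illustration and are not from the question. Timings will vary by machine and dplyr version, so no numbers are shown here.

```r
library(dplyr)

# Hypothetical larger data set, only to make a timing difference visible
set.seed(1)
big <- data.frame(
  group = sample(1:100, 2e4, replace = TRUE),
  var1  = sample(1:10,  2e4, replace = TRUE)
)

# Preferred: grouped mutate with n()
t_n <- system.time(
  res_n <- big %>%
    group_by(group, var1) %>%
    mutate(count = n()) %>%
    ungroup()
)

# Discouraged: build a separate data.frame per group inside do()
t_do <- system.time(
  res_do <- big %>%
    group_by(group, var1) %>%
    do(data.frame(., count = length(.$group))) %>%
    ungroup()
)

# do() returns rows ordered by group, so sort both results before comparing
same <- all.equal(
  res_n  %>% arrange(group, var1) %>% pull(count),
  res_do %>% arrange(group, var1) %>% pull(count)
)
print(same)

# Elapsed seconds for each approach
print(c(n = t_n[["elapsed"]], do = t_do[["elapsed"]]))
```

Both versions should produce the same counts. The difference is that do() runs an R function once per group, while n() is evaluated by dplyr's grouped machinery.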
