Group data in one column based on a string value in another column using dplyr


Problem description


I have the below data in a spreadsheet where the tasks assigned for the students are listed.

df <- data.frame(
Student=c("A","A","A","A","B","B","B","C","D","D","D","D"),
Task=c("Homework","Classwork","Assignment","Poster","Poster","Homework","Assignment","Homework","Classwork","Homework","Assignment","Poster"),
Status=c("Completed","Pending","Not performed","Not performed","Completed","Not performed","Not performed","Completed","Completed","Pending","Pending","Pending"), 
stringsAsFactors = FALSE)

I would like to group the data by task and, for each task, count the rows whose 'Status' is 'Completed'.

I used the below snippet but it does not seem to work. Any help is appreciated.

df %>% group_by(Task)  %>% 
         summarize(
             Count = nrow(df[df$Status == 'Completed',])
         ) 

Edit: Updated the question to add the actual dataset instead of a screenshot.

Solution

You can filter the data down to the rows whose status column equals "Completed", then count the remaining rows per task:

df <- data.frame(
  student = c(
    rep("A", 4), rep("B", 4), rep("C", 4), rep("D", 4)
  ), 
  task = rep(
    c("Home", "Class", "Assign", "Poster"), 4
  ), 
  res = sample(
    c("Completed", "Pending", "Not performed"), 
    16, TRUE
  )
) 

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
df %>% 
  filter(res == "Completed") %>%
  count(task)
#> # A tibble: 4 x 2
#>   task       n
#>   <fct>  <int>
#> 1 Assign     1
#> 2 Class      1
#> 3 Home       1
#> 4 Poster     3

Created on 2019-09-29 by the reprex package (v0.3.0)
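
The example above uses randomly sampled data, so here is a minimal sketch of the same idea applied to the data frame from the question. It also shows one likely reason the original snippet misbehaves: inside `summarize()`, the expression `nrow(df[df$Status == 'Completed',])` refers to the whole `df` rather than the current group, so every task receives the same total. The names `completed_counts` and `all_counts` are illustrative only and do not come from the original post.

```r
library(dplyr)

# The question's data, reproduced as posted.
df <- data.frame(
  Student = c("A","A","A","A","B","B","B","C","D","D","D","D"),
  Task = c("Homework","Classwork","Assignment","Poster","Poster","Homework",
           "Assignment","Homework","Classwork","Homework","Assignment","Poster"),
  Status = c("Completed","Pending","Not performed","Not performed","Completed",
             "Not performed","Not performed","Completed","Completed","Pending",
             "Pending","Pending"),
  stringsAsFactors = FALSE
)

# Approach from the answer: keep only completed rows, then count per task.
# Tasks with no completed rows do not appear in the result.
completed_counts <- df %>%
  filter(Status == "Completed") %>%
  count(Task)

# Alternative: sum a logical vector within each group.
# This keeps tasks whose completed count is zero.
all_counts <- df %>%
  group_by(Task) %>%
  summarise(Count = sum(Status == "Completed"))
```

With this data, `completed_counts` should contain Classwork (1), Homework (2) and Poster (1), while `all_counts` additionally lists Assignment with a count of 0. Which form is preferable depends on whether tasks with no completed rows should show up in the output.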
