Group data in one column based on a string value in another column using dplyr
Problem description
I have the data below in a spreadsheet, listing the tasks assigned to each student.
    df <- data.frame(
      Student = c("A", "A", "A", "A", "B", "B", "B", "C", "D", "D", "D", "D"),
      Task = c("Homework", "Classwork", "Assignment", "Poster", "Poster", "Homework",
               "Assignment", "Homework", "Classwork", "Homework", "Assignment", "Poster"),
      Status = c("Completed", "Pending", "Not performed", "Not performed", "Completed",
                 "Not performed", "Not performed", "Completed", "Completed", "Pending",
                 "Pending", "Pending"),
      stringsAsFactors = FALSE
    )
I would like to group the data at the task level and, for each task, count the rows where 'Status' is 'Completed'. Below is my expected output
I used the below snippet but it does not seem to work. Any help is appreciated.
    df %>%
      group_by(Task) %>%
      summarize(
        Count = nrow(df[df$Status == 'Completed', ])
      )
Edit: Updated the question to add the actual dataset instead of a screenshot.
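For reference, here is a small check of what the snippet above actually returns, assuming dplyr is installed. Inside `summarize()`, the name `df` refers to the whole data frame rather than the current group, so the same overall total is repeated for every task. The result name `repeated_total` is only illustrative.

```r
library(dplyr)

df <- data.frame(
  Student = c("A", "A", "A", "A", "B", "B", "B", "C", "D", "D", "D", "D"),
  Task = c("Homework", "Classwork", "Assignment", "Poster", "Poster", "Homework",
           "Assignment", "Homework", "Classwork", "Homework", "Assignment", "Poster"),
  Status = c("Completed", "Pending", "Not performed", "Not performed", "Completed",
             "Not performed", "Not performed", "Completed", "Completed", "Pending",
             "Pending", "Pending"),
  stringsAsFactors = FALSE
)

# `df` here is the full data frame, not the rows of the current group,
# so nrow(...) is the overall number of completed rows for every task.
repeated_total <- df %>%
  group_by(Task) %>%
  summarize(Count = nrow(df[df$Status == "Completed", ]))

print(repeated_total)
```

Every row of `repeated_total` carries the same value (4, the number of completed rows in the whole data set), which is why the result does not look like a per-task count.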
Solution
You can filter the data on that column, then count the rows for each task:
    df <- data.frame(
      student = c(
        rep("A", 4), rep("B", 4), rep("C", 4), rep("D", 4)
      ),
      task = rep(
        c("Home", "Class", "Assign", "Poster"), 4
      ),
      res = sample(
        c("Completed", "Pending", "Not performed"),
        16, TRUE
      )
    )
    library(dplyr)
    #>
    #> Attaching package: 'dplyr'
    #> The following objects are masked from 'package:stats':
    #>
    #>     filter, lag
    #> The following objects are masked from 'package:base':
    #>
    #>     intersect, setdiff, setequal, union
    df %>%
      filter(res == "Completed") %>%
      count(task)
    #> # A tibble: 4 x 2
    #>   task       n
    #>   <fct>  <int>
    #> 1 Assign     1
    #> 2 Class      1
    #> 3 Home       1
    #> 4 Poster     3
Created on 2019-09-29 by the reprex package (v0.3.0)
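Applied to the data frame from the question, the same approach might look like the sketch below. The column names `Task` and `Status` come from the question; the result names `completed_counts` and `all_counts` are illustrative. Filtering first drops any task that has no completed rows, so the second variant, which sums a logical condition within each group, is shown as one possible way to keep those tasks with a count of 0.

```r
library(dplyr)

df <- data.frame(
  Student = c("A", "A", "A", "A", "B", "B", "B", "C", "D", "D", "D", "D"),
  Task = c("Homework", "Classwork", "Assignment", "Poster", "Poster", "Homework",
           "Assignment", "Homework", "Classwork", "Homework", "Assignment", "Poster"),
  Status = c("Completed", "Pending", "Not performed", "Not performed", "Completed",
             "Not performed", "Not performed", "Completed", "Completed", "Pending",
             "Pending", "Pending"),
  stringsAsFactors = FALSE
)

# Variant 1: keep only completed rows, then count rows per task.
# Tasks without any completed row do not appear in the result.
completed_counts <- df %>%
  filter(Status == "Completed") %>%
  count(Task)

# Variant 2: sum the logical condition within each task,
# so tasks with no completed rows are kept with Count = 0.
all_counts <- df %>%
  group_by(Task) %>%
  summarise(Count = sum(Status == "Completed"))

print(completed_counts)
print(all_counts)
```

With this data, `completed_counts` has three rows (Classwork 1, Homework 2, Poster 1), and `all_counts` additionally lists Assignment with a count of 0.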