Assign max value of group to all rows in that group

Views: 59 | Posted: 2021/5/13 19:46:47 | Tags: r, group-by, dplyr, max


Question

I would like to assign the max value of a group to all rows within that group. How do I do that?

I have a dataframe containing the name of each group and the max number of credits that belongs to it.

course_credits <- aggregate(bsc_academic$Credits, by = list(bsc_academic$Course_code), max)

This gives:

    Course   Credits
1   ABC1000  6.5
2   ABC1003  6.5
3   ABC1004  6.5
4   ABC1007  5.0
5   ABC1010  6.5
6   ABC1021  6.5
7   ABC1023  6.5

The main dataframe looks like this:

Appraisal.Type   Resits   Credits Course_code   Student_ID          
Final result       0       6.5    ABC1000           10                
Final result       0       6.5    ABC1003           10               
Grade supervisor   0       0      ABC1000           10               
Grade supervisor   0       0      ABC1003           10 
Final result       0       12     ABC1294           23   
Grade supervisor   0       0      ABC1294           23

As you can see, student 10 took course ABC1000, worth 6.5 credits. For each course (per student), however, two rows exist: Final result and Grade supervisor. In the end, the Final result row should be deleted, but its credits should be kept. Therefore, I want to assign the max value of 6.5 to the Grade supervisor row. Likewise, student 23 took course ABC1294, worth 12 credits.

In the end, this should be the result:

Appraisal.Type   Resits   Credits Course_code   Student_ID                      
Grade supervisor   0       6.5      ABC1000           10               
Grade supervisor   0       6.5      ABC1003           10    
Grade supervisor   0       12       ABC1294           23

How do I do that?

Recommended answer

One option is to group by 'Student_ID', use mutate to replace 'Credits' with the max of 'Credits' within each group, and then filter the rows where 'Appraisal.Type' is "Grade supervisor":

library(dplyr)
df1 %>%
   group_by(Student_ID) %>%
   dplyr::mutate(Credits = max(Credits)) %>%
   ungroup() %>%
   filter(Appraisal.Type == "Grade supervisor")
# A tibble: 2 x 5
#  Appraisal.Type   Resits Credits Course_code Student_ID
#  <chr>             <int>   <dbl> <chr>            <int>
#1 Grade supervisor      0     6.5 ABC1000             10
#2 Grade supervisor      0     6.5 ABC1003             10

If 'Course_code' also needs to be part of the grouping, so that the max is taken per student and per course:

df2 %>%
  group_by(Student_ID, Course_code) %>% 
  dplyr::mutate(Credits = max(Credits)) %>%  
  filter(Appraisal.Type == "Grade supervisor")
# A tibble: 3 x 5
# Groups:   Student_ID, Course_code [3]
#  Appraisal.Type   Resits Credits Course_code Student_ID
#  <chr>             <int>   <dbl> <chr>            <int>
#1 Grade supervisor      0     6.5 ABC1000             10
#2 Grade supervisor      0     6.5 ABC1003             10
#3 Grade supervisor      0    12   ABC1294             23
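
For readers who prefer to avoid dplyr, the same per-group maximum can be sketched in base R with ave(). This is an alternative, not the answerer's method. The data frame below is a hand-rebuilt copy of df2, and the name `result` is only illustrative. Passing a single grouping factor built with interaction(..., drop = TRUE) is a choice made here so that unused student/course combinations do not produce empty groups.

```r
# Rebuild the example data so this block runs on its own (same values as df2)
df2 <- data.frame(
  Appraisal.Type = c("Final result", "Final result", "Grade supervisor",
                     "Grade supervisor", "Final result", "Grade supervisor"),
  Resits = rep(0L, 6),
  Credits = c(6.5, 6.5, 0, 0, 12, 0),
  Course_code = c("ABC1000", "ABC1003", "ABC1000",
                  "ABC1003", "ABC1294", "ABC1294"),
  Student_ID = c(10L, 10L, 10L, 10L, 23L, 23L),
  stringsAsFactors = FALSE
)

# One grouping factor per (student, course) pair; drop = TRUE removes
# combinations that never occur in the data
grp <- interaction(df2$Student_ID, df2$Course_code, drop = TRUE)

# ave() applies FUN within each group and returns a vector as long as its input,
# so every row receives the maximum of its own group
df2$Credits <- ave(df2$Credits, grp, FUN = max)

# Keep only the "Grade supervisor" rows
result <- df2[df2$Appraisal.Type == "Grade supervisor", ]
print(result)
```

Because ave() returns one value per input row, no join or ungrouping step is needed afterwards.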

NOTE: If the plyr package is also loaded, some functions, especially summarise/mutate, can be masked, because plyr defines functions with the same names. To prevent this, either run the code in a fresh session without loading plyr, or call dplyr::mutate explicitly.

The example data used above:

df1 <- structure(list(Appraisal.Type = c("Final result", "Final result", 
"Grade supervisor", "Grade supervisor"), Resits = c(0L, 0L, 0L, 
0L), Credits = c(6.5, 6.5, 0, 0), Course_code = c("ABC1000", 
"ABC1003", "ABC1000", "ABC1003"), Student_ID = c(10L, 10L, 10L, 
10L)), class = "data.frame", row.names = c(NA, -4L)) 



df2 <- structure(list(Appraisal.Type = c("Final result", "Final result", 
"Grade supervisor", "Grade supervisor", "Final result", "Grade supervisor"
), Resits = c(0L, 0L, 0L, 0L, 0L, 0L), Credits = c(6.5, 6.5, 
0, 0, 12, 0), Course_code = c("ABC1000", "ABC1003", "ABC1000", 
"ABC1003", "ABC1294", "ABC1294"), Student_ID = c(10L, 10L, 10L, 
10L, 23L, 23L)), class = "data.frame", row.names = c(NA, -6L))
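
The question already builds a per-course lookup table with aggregate(). As a rough sketch of how that table could be used directly, the maximum can be looked up with match(). This assumes, as the question's course_credits table does, that a course carries the same credits for every student. The formula form of aggregate() and the name `supervisor` are choices made here, not part of the original post.

```r
# Compact copy of the example data so this block runs on its own
df2 <- data.frame(
  Appraisal.Type = rep(c("Final result", "Grade supervisor"), times = c(2, 2))[c(1, 2, 3, 4, 1, 3)],
  Resits = rep(0L, 6),
  Credits = c(6.5, 6.5, 0, 0, 12, 0),
  Course_code = c("ABC1000", "ABC1003", "ABC1000",
                  "ABC1003", "ABC1294", "ABC1294"),
  Student_ID = c(10L, 10L, 10L, 10L, 23L, 23L),
  stringsAsFactors = FALSE
)

# Per-course maximum; the formula interface keeps the column names
# Course_code and Credits in the result
course_credits <- aggregate(Credits ~ Course_code, data = df2, FUN = max)

# Keep the "Grade supervisor" rows and fill in the credits from the lookup table
supervisor <- df2[df2$Appraisal.Type == "Grade supervisor", ]
supervisor$Credits <- course_credits$Credits[
  match(supervisor$Course_code, course_credits$Course_code)
]
print(supervisor)
```

If credits can differ between students for the same course, grouping by both Student_ID and Course_code, as in the second dplyr snippet, is the safer choice.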

