Arrange a grouped_df by group variable not working

Views: 132 | Published: 2020/7/23 4:19:46 | Tags: r, dplyr, grouped-table

Question

I have a data.frame that contains client names, years, and several revenue numbers from each year.

df <- data.frame(client = rep(c("Client A","Client B", "Client C"),3), 
                 year = rep(c(2014,2013,2012), each=3), 
                 rev = rep(c(10,20,30),3)
                )

I want to end up with a data.frame that aggregates the revenue by client and year. I then want to sort the data.frame by year then by descending revenue.

library(dplyr)
df1 <- df %>% 
        group_by(client, year) %>%
        summarise(tot = sum(rev)) %>%
        arrange(year, desc(tot))

However, when using the code above the arrange() function doesn't change the order of the grouped data.frame at all. When I run the below code and coerce to a normal data.frame it works.

library(dplyr)
df1 <- df %>% 
        group_by(client, year) %>%
        summarise(tot = sum(rev)) %>%
        data.frame() %>%   # coerce to a plain data.frame, which drops the grouping
        arrange(year, desc(tot))

Am I missing something or will I need to do this every time when trying to arrange a grouped_df by a grouped variable?

R version: 3.1.1; dplyr package version: 0.3.0.2

EDIT 11/13/2017: As noted by lucacerone, beginning with dplyr 0.5, arrange once again ignores groups when sorting. So my original code now works in the way I initially expected it would.

arrange() once again ignores grouping, reverting back to the behaviour of dplyr 0.3 and earlier. This makes arrange() inconsistent with other dplyr verbs, but I think this behaviour is generally more useful. Regardless, it’s not going to change again, as more changes will just cause more confusion.

Recommended answer

Try switching the order of your group_by statement:

df %>% 
  group_by(year, client) %>%
  summarise(tot = sum(rev)) %>%
  arrange(year, desc(tot))

I think arrange is ordering within groups; after summarize, the last group is dropped, so this means in your first example it's arranging rows within the client group. Switching the order to group_by(year, client) seems to fix it because the client group gets dropped after summarize.
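To check which grouping is left after summarise, you can inspect the result directly. This is only a sketch, assuming a recent dplyr (1.0 or later) where summarise() takes a `.groups` argument and group_vars() is available; `.groups = "drop_last"` just spells out the "last group is dropped" behaviour described above. The names by_client_year and by_year_client are made up for this illustration.

```r
library(dplyr)

# Same example data as in the question
df <- data.frame(client = rep(c("Client A", "Client B", "Client C"), 3),
                 year = rep(c(2014, 2013, 2012), each = 3),
                 rev = rep(c(10, 20, 30), 3))

# Grouped by client then year: summarise drops `year`, leaving `client`
by_client_year <- df %>%
  group_by(client, year) %>%
  summarise(tot = sum(rev), .groups = "drop_last")
group_vars(by_client_year)   # "client"

# Grouped by year then client: summarise drops `client`, leaving `year`
by_year_client <- df %>%
  group_by(year, client) %>%
  summarise(tot = sum(rev), .groups = "drop_last")
group_vars(by_year_client)   # "year"
```

So in the original code the result is still grouped by client, and in the reordered version it is grouped by year, which is why a within-group sort gave different results for the two.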

Alternatively, there is the ungroup() function:

df %>% 
  group_by(client, year) %>%
  summarise(tot = sum(rev)) %>%
  ungroup() %>%
  arrange(year, desc(tot))

Edit, @lucacerone: since dplyr 0.5 this does not work anymore:

Breaking changes: arrange() once again ignores grouping, reverting back to the behaviour of dplyr 0.3 and earlier. This makes arrange() inconsistent with other dplyr verbs, but I think this behaviour is generally more useful. Regardless, it’s not going to change again, as more changes will just cause more confusion.
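For anyone on a newer dplyr who does want the old within-group ordering, a rough sketch of both behaviours side by side follows. It assumes a dplyr version where arrange() has the `.by_group` argument (as far as I know, 0.7.0 and later) and where summarise() accepts `.groups` (1.0 and later); the names totals, sorted_overall and sorted_within are only for illustration.

```r
library(dplyr)

# Same example data as in the question
df <- data.frame(client = rep(c("Client A", "Client B", "Client C"), 3),
                 year = rep(c(2014, 2013, 2012), each = 3),
                 rev = rep(c(10, 20, 30), 3))

# After summarise the result is still grouped by `client`
totals <- df %>%
  group_by(client, year) %>%
  summarise(tot = sum(rev), .groups = "drop_last")

# Default: grouping is ignored, rows are sorted by year, then descending tot
sorted_overall <- totals %>% arrange(year, desc(tot))

# Opt in to sorting by the grouping variable (client) first
sorted_within <- totals %>% arrange(year, desc(tot), .by_group = TRUE)
```

With the default, the first row is the 2012 row for Client C (the largest total in the earliest year); with `.by_group = TRUE`, all Client A rows come first, ordered by year.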
