Arrange a grouped_df by group variable not working

Views: 14 · Published: 2021/12/23 12:34:47 · Tags: r, dplyr, grouped-table


Problem description

I have a data.frame that contains client names, years, and several revenue figures for each year.

df <- data.frame(client = rep(c("Client A","Client B", "Client C"),3), 
                 year = rep(c(2014,2013,2012), each=3), 
                 rev = rep(c(10,20,30),3)
                )

I want to end up with a data.frame that aggregates revenue by client and year. I then want to sort that data.frame by year, then by descending revenue.

library(dplyr)
df1 <- df %>% 
        group_by(client, year) %>%
        summarise(tot = sum(rev)) %>%
        arrange(year, desc(tot))

However, with the code above, arrange() does not change the order of the grouped data.frame at all. When I run the code below, which coerces the result to a plain data.frame first, it works.

library(dplyr)
df1 <- df %>%
        group_by(client, year) %>%
        summarise(tot = sum(rev)) %>%
        data.frame() %>%
        arrange(year, desc(tot))

Am I missing something, or will I need to do this every time I try to arrange a grouped_df by a grouping variable?

R version: 3.1.1; dplyr package version: 0.3.0.2

EDIT 11/13/2017: As noted by lucacerone, beginning with dplyr 0.5, arrange() once again ignores groups when sorting, so my original code now works the way I initially expected it to.

arrange() once again ignores grouping, reverting back to the behaviour of dplyr 0.3 and earlier. This makes arrange() inconsistent with other dplyr verbs, but I think this behaviour is generally more useful. Regardless, it’s not going to change again, as more changes will just cause more confusion.

Recommended answer

Try switching the order of the variables in your group_by() call:

df %>% 
  group_by(year, client) %>%
  summarise(tot = sum(rev)) %>%
  arrange(year, desc(tot))

I think arrange() is ordering within groups. After summarise(), the last grouping variable is dropped, so in your first example the result is still grouped by client and arrange() sorts the rows within each client group. Switching the order to group_by(year, client) seems to fix it because client is then the grouping variable that gets dropped after summarise(), leaving only year, which is also the first sort key.
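The point about summarise() dropping the last grouping variable can be checked directly. The following is a minimal sketch, assuming a dplyr version that provides group_vars(); the names by_client_year and by_year_client are made up for illustration, and recent dplyr versions may also print a message about the resulting grouping.

```r
library(dplyr)

# Same example data as in the question
df <- data.frame(client = rep(c("Client A", "Client B", "Client C"), 3),
                 year   = rep(c(2014, 2013, 2012), each = 3),
                 rev    = rep(c(10, 20, 30), 3))

# Grouping by (client, year): summarise() drops 'year', 'client' remains
by_client_year <- df %>%
  group_by(client, year) %>%
  summarise(tot = sum(rev))
group_vars(by_client_year)   # "client"

# Grouping by (year, client): summarise() drops 'client', 'year' remains
by_year_client <- df %>%
  group_by(year, client) %>%
  summarise(tot = sum(rev))
group_vars(by_year_client)   # "year"
```

Inspecting the remaining grouping this way makes it easier to see which variable a group-aware verb would still respect after summarise().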

Alternatively, there is the ungroup() function, which removes the remaining grouping before arrange() is called:

df %>% 
  group_by(client, year) %>%
  summarise(tot = sum(rev)) %>%
  ungroup() %>%
  arrange(year, desc(tot))


Edit, @lucacerone: since dplyr 0.5 this does not work anymore:

Breaking changes: arrange() once again ignores grouping, reverting back to the behaviour of dplyr 0.3 and earlier. This makes arrange() inconsistent with other dplyr verbs, but I think this behaviour is generally more useful. Regardless, it’s not going to change again, as more changes will just cause more confusion.
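To illustrate the behaviour described in that release note, here is a short sketch. It assumes a reasonably recent dplyr in which arrange() accepts the .by_group argument (this argument is not mentioned in the original thread); the names totals, sorted, and by_group are hypothetical.

```r
library(dplyr)

# Same example data as in the question
df <- data.frame(client = rep(c("Client A", "Client B", "Client C"), 3),
                 year   = rep(c(2014, 2013, 2012), each = 3),
                 rev    = rep(c(10, 20, 30), 3))

totals <- df %>%
  group_by(client, year) %>%
  summarise(tot = sum(rev))   # result is still grouped by client

# Default: grouping is ignored; rows are sorted by year, then descending tot
sorted <- totals %>% arrange(year, desc(tot))

# Opt-in: sort by the grouping variable (client) first, then year and tot
by_group <- totals %>% arrange(year, desc(tot), .by_group = TRUE)

head(sorted, 3)
head(by_group, 3)
```

With the default, the first rows all belong to 2012 regardless of client, which is the ordering the question asked for. Passing .by_group = TRUE puts all rows for one client together first.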
