Dplyr - Arrange a grouped_df by group variable not working

Problem description

I have a data.frame that contains client names, years, and several revenue numbers for each year.

df <- data.frame(client = rep(c("Client A","Client B", "Client C"),3), 
                 year = rep(c(2014,2013,2012), each=3), 
                 rev = rep(c(10,20,30),3)
                )

I want to end up with a data.frame that aggregates the revenue by client and year. I then want to sort that data.frame by year, and then by descending revenue.

library(dplyr)
df1 <- df %>% 
        group_by(client, year) %>%
        summarise(tot = sum(rev)) %>%
        arrange(year, desc(tot))

However, with the code above, the arrange() function doesn't change the order of the grouped data.frame at all. When I run the code below and coerce the result to a normal data.frame first, it works.

library(dplyr)
df1 <- df %>% 
        group_by(client, year) %>%
        summarise(tot = sum(rev)) %>%
        data.frame() %>%
        arrange(year, desc(tot))

Am I missing something, or will I need to do this every time I try to arrange a grouped_df by a grouping variable?

R version: 3.1.1
dplyr package version: 0.3.0.2

Recommended answer

Try switching the order of the variables in your group_by statement:

df %>% 
  group_by(year, client) %>%
  summarise(tot = sum(rev)) %>%
  arrange(year, desc(tot))

I think arrange() is ordering within groups. After summarise(), the last grouping variable is dropped, so in your first example the result is still grouped by client and the rows are arranged within each client group. Switching the order to group_by(year, client) seems to fix it because client is then the grouping variable that gets dropped after summarise(), leaving the result grouped by year.
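One way to check which grouping is left after summarise() is to inspect the result directly. The sketch below is only illustrative: it assumes a reasonably recent dplyr in which group_vars() is available (it is not part of the 0.3.0.2 release mentioned in the question), and the names by_client_year and by_year_client are made up for this example.

```r
library(dplyr)

# Same sample data as in the question
df <- data.frame(client = rep(c("Client A", "Client B", "Client C"), 3),
                 year = rep(c(2014, 2013, 2012), each = 3),
                 rev = rep(c(10, 20, 30), 3))

# Grouping by (client, year): summarise() peels off the last variable, year
by_client_year <- df %>%
  group_by(client, year) %>%
  summarise(tot = sum(rev))
group_vars(by_client_year)  # "client"

# Grouping by (year, client): summarise() peels off client instead
by_year_client <- df %>%
  group_by(year, client) %>%
  summarise(tot = sum(rev))
group_vars(by_year_client)  # "year"
```

Recent dplyr versions may also print a message about the `.groups` argument when summarise() returns grouped output; that message is informational and does not affect the result.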

Alternatively, there is the ungroup() function:

df %>% 
  group_by(client, year) %>%
  summarise(tot = sum(rev)) %>%
  ungroup() %>%
  arrange(year, desc(tot))
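As a quick sanity check, the ungrouped result can be inspected to confirm that it really is ordered by year and then by descending total. This is a minimal sketch using the question's sample data; the name sorted is hypothetical. Note also that, as far as I know, later dplyr releases changed arrange() to ignore grouping unless `.by_group = TRUE` is passed, so the original problem may not reproduce on a current installation.

```r
library(dplyr)

# Same sample data as in the question
df <- data.frame(client = rep(c("Client A", "Client B", "Client C"), 3),
                 year = rep(c(2014, 2013, 2012), each = 3),
                 rev = rep(c(10, 20, 30), 3))

sorted <- df %>%
  group_by(client, year) %>%
  summarise(tot = sum(rev)) %>%
  ungroup() %>%                # drop the remaining client grouping
  arrange(year, desc(tot))     # now sorts across all rows

group_vars(sorted)             # character(0): no grouping left
!is.unsorted(sorted$year)      # TRUE: years are in ascending order
sorted$tot[sorted$year == 2012]  # 30 20 10: descending within a year
```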
