Dynamically sorting columns in dplyr by passing an ordered vector of column names to select()

Question

I'm using the code below to generate a simple summary table:

# Data
data("mtcars")
# Lib
require(dplyr)
# Summary
mt_sum <- mtcars %>%
  group_by(am) %>%
  summarise_each(funs(min, mean, median, max), mpg, cyl) %>%
  mutate(am = as.character(am)) %>%
  left_join(y = as.data.frame(table(mtcars$am),
                              stringsAsFactors = FALSE),
            by = c("am" = "Var1")) 

The code produces the expected result:

> head(mt_sum)
Source: local data frame [2 x 10]

     am mpg_min cyl_min mpg_mean cyl_mean mpg_median cyl_median mpg_max cyl_max  Freq
  (chr)   (dbl)   (dbl)    (dbl)    (dbl)      (dbl)      (dbl)   (dbl)   (dbl) (int)
1     0    10.4       4 17.14737 6.947368       17.3          8    24.4       8    19
2     1    15.0       4 24.39231 5.076923       22.8          4    33.9       8    13

However, I'm not satisfied with the way the columns are ordered. In particular, I would like to:

  1. Order columns by name
  2. Do this via dplyr

Desired order

The desired order would look like this:

> names(mt_sum)[order(names(mt_sum))]
 [1] "am"         "cyl_max"    "cyl_mean"   "cyl_median" "cyl_min"    "Freq"       "mpg_max"   
 [8] "mpg_mean"   "mpg_median" "mpg_min" 

Attempts

Ideally, I would like to pass names(mt_sum)[order(names(mt_sum))] to select() as the way of sorting the columns. But the code:

mt_sum <- mtcars %>%
  group_by(am) %>%
  summarise_each(funs(min, mean, median, max), mpg, cyl) %>%
  mutate(am = as.character(am)) %>%
  left_join(y = as.data.frame(table(mtcars$am),
                              stringsAsFactors = FALSE),
            by = c("am" = "Var1")) %>%
  select(names(.)[order(names(.))])

returns the expected error:

Error: All select() inputs must resolve to integer column positions.
The following do not:
*  names(.)[order(names(.))]

In my real data I'm generating a vast number of summary columns. Hence my question: how can I dynamically pass sorted column names to select() in dplyr so that it understands them and applies them to the data.frame at hand?

My focus is on figuring out a way of passing the dynamically generated column names to select(). I know that I could sort the columns in base or by typing names, as discussed here.

Recommended answer

You're definitely on the right path.

mt_sum <- mtcars %>%
  group_by(am) %>%
  summarise_each(funs(min, mean, median, max), mpg, cyl) %>%
  mutate(am = as.character(am)) %>%
  left_join(y = as.data.frame(table(mtcars$am),
                              stringsAsFactors = FALSE),
            by = c("am" = "Var1")) %>%
  .[, names(.)[order(names(.))]]
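
If you would rather stay inside select(), the following is a minimal sketch of two ways to hand it a dynamically generated ordering. It assumes dplyr 1.0.0 or later, where all_of() is available; this is newer than the version that produced the error in the question. The small mt_sum data frame here is only a stand-in built from a few values of the output shown in the question, and sorted_cols, by_select, by_position and by_bracket are illustrative names.

```r
library(dplyr)

# Stand-in for the summary table (assumption: a subset of the columns
# and values from the question's output, typed in by hand).
mt_sum <- data.frame(
  am      = c("0", "1"),
  mpg_min = c(10.4, 15.0),
  cyl_min = c(4, 4),
  Freq    = c(19L, 13L),
  stringsAsFactors = FALSE
)

# Column names in sorted order, computed exactly as in the question.
sorted_cols <- names(mt_sum)[order(names(mt_sum))]

# Option 1: all_of() marks sorted_cols as an external character vector
# of column names rather than a column called "sorted_cols".
by_select <- mt_sum %>% select(all_of(sorted_cols))

# Option 2: pass integer column positions, which is what the error
# message in the question asks for. `.` is the piped data frame.
by_position <- mt_sum %>% select(order(names(.)))

# Base-R indexing, equivalent to the `.[, ...]` step in the answer.
by_bracket <- mt_sum[, sorted_cols]
```

All three results should carry the same column order. Note that order() on character data follows the current locale, so where "Freq" lands relative to the lowercase names can differ between systems.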
