Select multiple columns with dplyr::select() with numbers as names


Problem description

Let's say I have the following data frame:

a <- runif(10)
dd <- as.data.frame(t(a))
names(dd) <- c("ID", "a", "a2", "b", "b2", "f", "XXX", "1", "4", "8")

In dplyr, there is a nice way to select a number of columns. For example, to select the columns between column a and column f, I can use

dd %>% dplyr::select(a:f)

In my problem, the last columns of the data frame may vary, but their names are always numbers between 1 and 99. However, I cannot get the same trick to work for them:

> dd %>% select(1:99)
Error: Position must be between 0 and n
> dd %>% select("1":"99")
Error: Position must be between 0 and n

This happens because select() interprets numbers passed this way as column positions, not as column names.
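
To make the position behaviour concrete, here is a minimal sketch that rebuilds the same dd. The set.seed() call is an addition for reproducibility and is not part of the original question.

```r
library(dplyr)

set.seed(1)  # assumption: added only so the random values are reproducible
a <- runif(10)
dd <- as.data.frame(t(a))
names(dd) <- c("ID", "a", "a2", "b", "b2", "f", "XXX", "1", "4", "8")

# 1:3 means "the first three columns", not "columns named 1, 2 and 3"
by_position <- dd %>% select(1:3)
names(by_position)  # columns returned: ID, a, a2
```

Since dd has only 10 columns, asking for positions 1:99 goes past the end of the data frame, which is why the call fails.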

I would like to obtain a data frame with all columns between a and f, plus those whose labels are numbers between 1 and 99. Is it possible to do that in one go with select()?

Solution

Column names starting with a number, such as "1" and "8" in your data, are not syntactically valid names (see ?make.names). The 'Names and Identifiers' section in ?Quotes then says: "other [syntactically invalid] names can be used provided they are quoted. The preferred quote is the backtick".
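
As a quick check of which names count as syntactically valid, the following base R sketch compares each name with what make.names() would turn it into. The vector nms and the small data frame df are made up for illustration.

```r
# Hypothetical sample of column names, some valid and some not
nms <- c("a", "f", "1", "8")

# make.names() leaves a valid name unchanged and repairs an invalid one,
# so comparing input and output flags the invalid names
valid <- make.names(nms) == nms
valid  # TRUE for "a" and "f", FALSE for "1" and "8"

# check.names = FALSE keeps the invalid name "1" as it is
df <- data.frame(a = 0.5, `1` = 0.7, check.names = FALSE)

# Backticks make the parser read 1 as a name rather than a number
df$`1`
```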

Thus, wrap the invalid column names in backticks (`):

dd %>% dplyr::select(a:f, `1`:`8`)

#           a        a2         b        b2          f         1         4         8
# 1 0.2510023 0.4109819 0.6787226 0.4974859 0.01828614 0.7449878 0.1648462 0.5875638
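
Because the numbered columns may vary between data frames, hard-coding `1`:`8` may not always fit. One possible alternative, not part of the original answer, is the matches() selection helper with a regular expression. The pattern below is an assumption chosen to cover names from 1 to 99.

```r
library(dplyr)

a <- runif(10)
dd <- as.data.frame(t(a))
names(dd) <- c("ID", "a", "a2", "b", "b2", "f", "XXX", "1", "4", "8")

# a:f selects the named range; matches() adds every column whose name
# is a one- or two-digit number without a leading zero (1 to 99)
res <- dd %>% select(a:f, matches("^[1-9][0-9]?$"))
names(res)  # a, a2, b, b2, f, 1, 4, 8
```

This picks up whichever numbered columns happen to be present, in the order they appear in the data frame.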

Another option is to use the SE (standard evaluation) version of select, select_:

dd %>% dplyr::select_(.dots = c("a", "a2", ..., "1", "4", "8"))
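
Note that select_() has been deprecated since dplyr 0.7.0, so it may warn or be unavailable on a current installation. A rough equivalent, assuming a dplyr version that provides all_of() (1.0.0 or later), passes a character vector of names. The vector cols is spelled out here purely for illustration, since the original call elides its middle entries.

```r
library(dplyr)

a <- runif(10)
dd <- as.data.frame(t(a))
names(dd) <- c("ID", "a", "a2", "b", "b2", "f", "XXX", "1", "4", "8")

# Illustrative full vector of names (an assumption, see the note above)
cols <- c("a", "a2", "b", "b2", "f", "1", "4", "8")

# all_of() selects columns by name from a character vector, so names
# such as "1" need no backticks; it errors if any name is missing
res <- dd %>% select(all_of(cols))
names(res)
```

If some of the numbered columns might be absent, any_of() takes the same kind of vector and skips missing names instead of raising an error.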
