Select multiple columns with dplyr::select() with numbers as names

Question

Suppose I have the following data frame:

library(dplyr)  # provides select() and the %>% pipe used below

a <- runif(10)
dd <- as.data.frame(t(a))
names(dd) <- c("ID", "a", "a2", "b", "b2", "f", "XXX", "1", "4", "8")

In dplyr, there is a nice way to select a range of columns. For example, to select the columns from column a to column f, I can use

dd %>% dplyr::select(a:f)

In my problem, the columns in the last part of the data frame may vary, yet each one is always named with a number between 1 and 99. However, I cannot seem to use the same trick as above:

> dd %>% select(1:99)
Error: Position must be between 0 and n
> dd %>% select("1":"99")
Error: Position must be between 0 and n

This is because select(), used this way, tries to select columns by position.
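
To see the positional behavior in isolation, here is a minimal sketch. It assumes dplyr is installed; set.seed() and the names by_position and out_of_range are illustrative additions, not part of the original post. Integer arguments to select() are read as column positions, so a range that runs past the last column fails.

```r
library(dplyr)

set.seed(1)  # illustrative: makes the random values reproducible
dd <- as.data.frame(t(runif(10)))
names(dd) <- c("ID", "a", "a2", "b", "b2", "f", "XXX", "1", "4", "8")

# Integers are column positions: this returns the first three columns.
by_position <- dd %>% select(1:3)
names(by_position)

# dd has only 10 columns, so positions 11 to 99 do not exist and select() errors.
out_of_range <- tryCatch(dd %>% select(1:99), error = function(e) e)
inherits(out_of_range, "error")
```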

I would like to obtain a data frame with all columns between a and f, plus those whose names are numbers between 1 and 99. Is it possible to do that in one go with select()?

Recommended answer

Column names starting with a number, such as "1" and "8" in your data, are not syntactically valid names (see ?make.names). The 'Names and Identifiers' section of ?Quotes then says: "other [syntactically invalid] names can be used provided they are quoted. The preferred quote is the backtick".

Thus, wrap the invalid column names in backticks (`):

dd %>% dplyr::select(a:f, `1`:`8`)

#           a        a2         b        b2          f         1         4         8
# 1 0.2510023 0.4109819 0.6787226 0.4974859 0.01828614 0.7449878 0.1648462 0.5875638
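
The backtick range above relies on columns `1` and `8` both being present and on the numbered columns sitting next to each other. Since the question says the trailing columns can vary, one possible alternative is to pick them by pattern with the matches() selection helper. The following is a sketch, not part of the original answer; the regular expression and the name res are assumptions chosen for illustration.

```r
library(dplyr)

set.seed(1)  # illustrative: makes the random values reproducible
dd <- as.data.frame(t(runif(10)))
names(dd) <- c("ID", "a", "a2", "b", "b2", "f", "XXX", "1", "4", "8")

# "^[1-9][0-9]?$" matches names made up solely of a number from 1 to 99,
# so "a2" and "b2" are not picked up by the pattern.
res <- dd %>% select(a:f, matches("^[1-9][0-9]?$"))
names(res)
```

Because the pattern is anchored at both ends, it keeps working when a different set of numbered columns is present.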

Another option is to use the SE (standard evaluation) version of select, select_:

dd %>% dplyr::select_(.dots = c("a", "a2", ..., "1", "4", "8"))
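
Note that select_() has been deprecated since dplyr 0.7.0. Assuming dplyr 1.0.0 or later, a character vector of names can instead be passed to select() through the all_of() or any_of() helpers. The sketch below is an illustration rather than part of the original answer; the names wanted, strict and lenient are hypothetical.

```r
library(dplyr)

set.seed(1)  # illustrative: makes the random values reproducible
dd <- as.data.frame(t(runif(10)))
names(dd) <- c("ID", "a", "a2", "b", "b2", "f", "XXX", "1", "4", "8")

# all_of() is strict: every listed name must exist in the data frame.
wanted <- c("a", "f", "1", "4", "8")  # hypothetical subset for illustration
strict <- dd %>% select(all_of(wanted))
names(strict)

# any_of() skips names that are absent, which suits a varying set of
# columns named "1" to "99". Character values are treated as names, not positions.
lenient <- dd %>% select(a:f, any_of(as.character(1:99)))
names(lenient)
```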
