How NOT to select columns using dplyr's select() when you have a character vector of column names?


Question

I am trying to deselect columns in my dataset using dplyr, but I have not been able to do so since last night.

I am well aware of workarounds, but I am strictly trying to find an answer using only dplyr.

library(dplyr)
df <- tibble(x = c(1,2,3,4), y = c('a','b','c','d'))
df %>% select(-c('x'))

This gives me an error: Error in -c("x") : invalid argument to unary operator

Now, I know that select() takes unquoted values, but I am not able to sub-select in this fashion.

Please note that the dataset above is just an example; the real data can have many columns.

Thanks,

Prerit

Recommended answer

Edit: OP's actual question was about how to use a character vector to select or deselect columns from a data frame. Use the one_of() helper function for that:

colnames(iris)

# [1] "Sepal.Length" "Sepal.Width"  "Petal.Length" "Petal.Width"  "Species"

cols <- c("Petal.Length", "Sepal.Length")

select(iris, one_of(cols)) %>% colnames

# [1] "Petal.Length" "Sepal.Length"

select(iris, -one_of(cols)) %>% colnames

# [1] "Sepal.Width" "Petal.Width" "Species"
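In dplyr 1.0.0 and later, one_of() is superseded by all_of() and any_of(). The following is a minimal sketch of the same deselection with those helpers, assuming a recent dplyr is installed. The name "not_there" is a made-up column used only to show how the two helpers differ.

```r
library(dplyr)

cols <- c("Petal.Length", "Sepal.Length")

# all_of() is strict: it raises an error if any name in `cols`
# is missing from the data frame
kept_strict <- select(iris, -all_of(cols)) %>% colnames()
print(kept_strict)
# [1] "Sepal.Width" "Petal.Width" "Species"

# any_of() is lenient: names that do not exist are ignored
# ("not_there" is a hypothetical column name for illustration)
kept_lenient <- select(iris, -any_of(c(cols, "not_there"))) %>% colnames()
print(kept_lenient)
# [1] "Sepal.Width" "Petal.Width" "Species"
```

Using all_of() is the safer default when a misspelled column name should be caught, while any_of() suits cases where a column may or may not be present.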

You should have a look at the select helpers (type ?select_helpers) because they're incredibly useful. From the docs:

starts_with(): starts with a prefix

ends_with(): ends with a suffix

contains(): contains a literal string

matches(): matches a regular expression

num_range(): a numerical range like x01, x02, x03.

one_of(): variables in a character vector.

everything(): all variables.
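As a rough illustration of the helpers listed above, here is a short sketch on the built-in iris data. The small df_num tibble and its column names are assumptions made up for the num_range() example; they are not part of the original answer.

```r
library(dplyr)

# starts_with(): columns whose names begin with "Petal"
petal_cols <- select(iris, starts_with("Petal")) %>% colnames()
print(petal_cols)
# [1] "Petal.Length" "Petal.Width"

# contains(): columns whose names contain the literal string "Width"
width_cols <- select(iris, contains("Width")) %>% colnames()
print(width_cols)
# [1] "Sepal.Width" "Petal.Width"

# matches(): columns whose names match a regular expression
sepal_cols <- select(iris, matches("^Sepal\\.")) %>% colnames()
print(sepal_cols)
# [1] "Sepal.Length" "Sepal.Width"

# num_range(): hypothetical data with zero-padded numeric suffixes
df_num <- tibble(x01 = 1, x02 = 2, x03 = 3, y = 4)
range_cols <- select(df_num, num_range("x", 1:2, width = 2)) %>% colnames()
print(range_cols)
# [1] "x01" "x02"

# everything(): move Species to the front and keep the rest in order
reordered <- select(iris, Species, everything()) %>% colnames()
print(reordered)
# [1] "Species" "Sepal.Length" "Sepal.Width" "Petal.Length" "Petal.Width"
```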

Given a data frame with columns named a through z, use select like this (each call assumes the data frame is piped in or passed as the first argument):

select(-a, -b, -c, -d, -e)

# OR

select(-c(a, b, c, d, e))

# OR

select(-(a:e))

# OR if you want to keep b

select(-a, -(c:e))

# OR a different way to keep b, by just putting it back in

select(-(a:e), b)
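To try these variants directly, here is a small runnable sketch. The one-row data frame df with columns a through z is a stand-in built only for this illustration.

```r
library(dplyr)

# Hypothetical one-row data frame with 26 columns named a through z
df <- as.data.frame(matrix(1:26, nrow = 1, dimnames = list(NULL, letters)))

# Drop a through e
drop_ae <- df %>% select(-(a:e)) %>% colnames()

# Drop a and c through e, so b stays in its original position
keep_b <- df %>% select(-a, -(c:e)) %>% colnames()

# Drop a through e, then add b back; b ends up as the last column
b_at_end <- df %>% select(-(a:e), b) %>% colnames()

print(head(keep_b, 3))
# [1] "b" "f" "g"
print(tail(b_at_end, 3))
# [1] "y" "z" "b"
```

The last two calls keep the same set of columns but in a different order, which matters if later code relies on column position.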

So if I wanted to omit two of the columns from the iris dataset, I could write:

colnames(iris)

# [1] "Sepal.Length" "Sepal.Width"  "Petal.Length" "Petal.Width"  "Species"

select(iris, -c(Sepal.Length, Petal.Length)) %>% colnames()

# [1] "Sepal.Width" "Petal.Width" "Species" 

But of course, the most concise way to achieve that is to use one of select's helper functions:

select(iris, -ends_with(".Length")) %>% colnames()

# [1] "Sepal.Width" "Petal.Width" "Species"   

P.S. It's odd that you are passing quoted values to dplyr; one of its big niceties is that you don't have to keep typing quotes all the time. As you can see, bare values work fine with dplyr and ggplot2.
