Filter data frame by character column name (in dplyr)

Question

I have a data frame and want to filter it in one of two ways, by either column "this" or column "that". I would like to be able to refer to the column name as a variable. How (in dplyr, if that makes a difference) do I refer to a column name by a variable?

library(dplyr)
df <- data.frame(this = c(1, 2, 2), that = c(1, 1, 2))
df
#   this that
# 1    1    1
# 2    2    1
# 3    2    2
df %>% filter(this == 1)
#   this that
# 1    1    1

But say I want to use the variable column to hold either "this" or "that", and filter on whatever the value of column is. Both as.symbol() and get() work in other contexts, but not here:

column <- "this"
df %>% filter(as.symbol(column) == 1)
# [1] this that
# <0 rows> (or 0-length row.names)
df %>% filter(get(column) == 1)
# Error in get("this") : object 'this' not found

How can I turn the value of column into a column name?

Recommended answer

From the current dplyr documentation (emphasis mine):

dplyr used to offer twin versions of each verb suffixed with an underscore. These versions had standard evaluation (SE) semantics: rather than taking arguments by code, like NSE verbs, they took arguments by value. Their purpose was to make it possible to program with dplyr. However, dplyr now uses tidy evaluation semantics. NSE verbs still capture their arguments, but you can now unquote parts of these arguments. This offers full programmability with NSE verbs. Thus, the underscored versions are now superfluous.

So, essentially we need to perform two steps to be able to refer to the value "this" of the variable column inside dplyr::filter():

  1. We need to convert the variable column, which is of type character, into a symbol.

Using base R, this can be achieved with the function as.symbol(), which is an alias for as.name(). The former is preferred by the tidyverse developers because it follows a more modern terminology (R types instead of S modes).

Alternatively, the same can be achieved with rlang::sym() from the tidyverse.

  2. We need to inject the symbol from 1) into the dplyr::filter() expression.

This is done by the so-called injection operator !!, which is basically syntactic sugar that allows modifying a piece of code before R evaluates it.

(In earlier versions of dplyr (or of the underlying rlang), there were situations (including yours) where !! would collide with the single !, but this is no longer an issue since !! gained the right operator precedence.)
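The two steps can also be looked at in isolation, outside of filter(). The following is a minimal sketch, assuming a reasonably recent rlang is installed; the variable names sym_base, sym_name, sym_rlang and injected are made up for illustration. It uses rlang::expr() only to display the expression that results from the injection, without evaluating it.

```r
library(rlang)

column <- "this"

# Step 1: turn the character string into a symbol.
# as.symbol(), as.name() and rlang::sym() all yield the same symbol.
sym_base  <- as.symbol(column)
sym_name  <- as.name(column)
sym_rlang <- sym(column)

is.symbol(sym_base)             # TRUE
identical(sym_base, sym_name)   # TRUE
identical(sym_base, sym_rlang)  # TRUE

# Step 2: inject the symbol into an expression with !!.
# expr() captures the code after injection instead of running it,
# so we can see what filter() would effectively receive.
injected <- expr(!!sym_base == 1)
injected
# this == 1
```

Seeing `this == 1` here suggests why the injected form behaves like the hard-coded filter(this == 1) from the question.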

Applied to your example:

library(dplyr)
df <- data.frame(this = c(1, 2, 2),
                 that = c(1, 1, 2))
column <- "this"

df %>% filter(!!as.symbol(column) == 1)
#   this that
# 1    1    1
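Since the question asks for switching between "this" and "that", the same pattern could be wrapped in a small helper. This is only a sketch: filter_by_column() is a hypothetical name, not part of dplyr, and it assumes the data frame has no column that is itself called value. The last lines show the .data pronoun, which is a different dplyr mechanism from the !! approach described in the answer and is included only as a possible alternative.

```r
library(dplyr)

df <- data.frame(this = c(1, 2, 2),
                 that = c(1, 1, 2))

# Hypothetical helper: keep rows where the column named by the
# string `column` equals `value`.
filter_by_column <- function(data, column, value) {
  data %>% filter(!!as.symbol(column) == value)
}

filter_by_column(df, "this", 1)
#   this that
# 1    1    1

filter_by_column(df, "that", 1)
#   this that
# 1    1    1
# 2    2    1

# Alternative (not the answer's method): subset the .data pronoun
# with the string directly, without building a symbol first.
column <- "this"
via_pronoun <- df %>% filter(.data[[column]] == 1)
via_pronoun
#   this that
# 1    1    1
```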
