Access the column names in `mutate_at` to use them for subsetting a list
Question
I am trying to recode several variables, each with a different recoding scheme. The schemes are stored in a list in which each element is a named vector of the form old = new. Each element is the recoding scheme for one variable in the data frame.
I am using the `mutate_at` function together with `recode`.
I think the problem is that I cannot extract the variable name in order to pick the correct recoding scheme from the list.
I looked at a related question, but it did not help.
I also saw here that tidy evaluation can be used to extract the column name of the variable that is passed in, but again I failed to implement it. (That example also uses the deprecated `funs`.)
Lastly, I am hoping this is the right way to recode the variables (i.e. using this recode list inside the mutate). If there is a completely different way to approach this kind of multiple recoding, please let me know.
library(dplyr)
# dplyr version 0.8.5
df <-
  tibble(
    var1 = c("A", "A", "B", "C"),
    var2 = c("X", "Y", "Z", "Z")
  )
recode_list <-
  list(
    var1 = c(A = 1, B = 2, C = 3),
    var2 = c(X = 0, Y = -1, Z = 1)
  )
recode_list
#> $var1
#> A B C
#> 1 2 3
#>
#> $var2
#> X Y Z
#> 0 -1 1
I am using the `dplyr::recode` function.
# recoding works fine when doing it one variable at a time
df %>%
  mutate(
    var1 = recode(var1, !!!recode_list[["var1"]]),
    var2 = recode(var2, !!!recode_list[["var2"]])
  )
#> # A tibble: 4 x 2
#> var1 var2
#> <dbl> <dbl>
#> 1 1 0
#> 2 1 -1
#> 3 2 1
#> 4 3 1
When I try to apply a function that does this for all variables, it fails:
# this does not work.
df %>%
  mutate_at(vars(var1, var2), ~ {
    var_name <- rlang::quo_name(quo(.))
    recode(., !!!recode_list[[var_name]])
  })
#> Error in expr_interp(f): object 'var_name' not found
I also tried `rlang::as_name` and `rlang::as_label`, but I don't think I can capture the name of the variable as a string in order to subset `recode_list`.
df %>%
  mutate_at(vars(var1, var2), ~ {
    var_name <- rlang::as_name(quo(.))
    print(var_name)
    # recode(., !!!recode_list[["var2"]])
  })
#> [1] "."
#> [1] "."
#> # A tibble: 4 x 2
#> var1 var2
#> <chr> <chr>
#> 1 . .
#> 2 . .
#> 3 . .
#> 4 . .
Created on 2020-04-30 by the reprex package (v0.3.0)
Recommended answer
Does this work for you?
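library(dplyr)
library(rlang)
library(magrittr)  # provides %<>%, which dplyr does not re-export
df %>%
  mutate_at(vars(var1, var2),
            .funs = function(x) {
              # pick the scheme named after the column currently being processed
              recode_list %<>% .[[as_label(enquo(x))]]
              recode(x, !!!recode_list)
            })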
## A tibble: 4 x 2
# var1 var2
# <dbl> <dbl>
#1 1 0
#2 1 -1
#3 2 1
#4 3 1
I suspect this works, whereas subsetting `recode_list` directly inside `recode` does not, because `enquo` delays evaluation of `x` until the assignment with `%<>%`. `!!!` can then force evaluation of a value that has already been properly evaluated.
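If capturing the column name through quoting feels fragile, one alternative is to make the name explicit by looping over the names of the scheme list. The following is a minimal sketch in base R only, not the method used in this answer: `recode_by_name` is a hypothetical helper, and it replaces `recode` with plain named-vector indexing, so values missing from a scheme become NA.

```r
# Same data as in the question, built with base R only
df <- data.frame(
  var1 = c("A", "A", "B", "C"),
  var2 = c("X", "Y", "Z", "Z"),
  stringsAsFactors = FALSE
)

recode_list <- list(
  var1 = c(A = 1, B = 2, C = 3),
  var2 = c(X = 0, Y = -1, Z = 1)
)

# recode_by_name() is a hypothetical helper, not part of dplyr or rlang
recode_by_name <- function(data, schemes) {
  # only touch columns that have a scheme and exist in the data
  for (col in intersect(names(schemes), names(data))) {
    # index the old = new vector by the old values; unname() drops the old labels
    data[[col]] <- unname(schemes[[col]][data[[col]]])
  }
  data
}

recoded <- recode_by_name(df, recode_list)
recoded
```

Because the loop variable `col` is an ordinary string, no tidy evaluation is needed to look up the matching scheme.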
Edit
Your approach with `rlang` also works with some modifications:
library(rlang)
df %>%
  mutate_at(vars(var1, var2), function(x) {
    var_name <- rlang::as_label(substitute(x))
    recode(x, !!!recode_list[[var_name]])
  })
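For readers on a newer dplyr than the 0.8.5 used in the question, the column name is available directly. This is a hedged sketch that assumes dplyr >= 1.0.0, where `across()` supersedes `mutate_at()` and `cur_column()` returns the name of the column being processed. It uses named-vector indexing in place of `recode()` with `!!!`, which is a substitution made here for simplicity and not part of the answer above.

```r
library(dplyr)  # assumption: dplyr >= 1.0.0, for across() and cur_column()

df <- tibble(
  var1 = c("A", "A", "B", "C"),
  var2 = c("X", "Y", "Z", "Z")
)

recode_list <- list(
  var1 = c(A = 1, B = 2, C = 3),
  var2 = c(X = 0, Y = -1, Z = 1)
)

recoded <- df %>%
  mutate(across(
    c(var1, var2),
    # cur_column() gives the current column name as a string,
    # which selects the matching old = new vector from recode_list
    ~ unname(recode_list[[cur_column()]][.x])
  ))

recoded
```

As with any named-vector lookup, a value that has no entry in its scheme comes back as NA, so it may be worth checking for unexpected NAs afterwards.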