Access the column names in `mutate_at` to use them for subsetting a list

Problem description

I am trying to recode several variables, each with a different recoding scheme. The schemes are stored in a list in which each element is a named vector of the form old = new. Each element is the recoding scheme for one variable in the data frame.

I am using the mutate_at function together with recode.

I think the problem is that I cannot extract the variable name in order to pick the correct recoding scheme from the list.

I looked at a related post, but it did not help.

I also saw a post showing that the column name of the variable being passed can be extracted with tidy evaluation, but again I failed to implement it. (It also uses the deprecated `funs`.)

Lastly, I am hoping that this is the correct approach to recoding the variables (i.e. using this recode list inside mutate). If there is a totally different way to approach this kind of multiple recoding, please let me know.

library(dplyr)
# dplyr version 0.8.5

df <- 
  tibble(
    var1 = c("A", "A", "B", "C"),
    var2 = c("X", "Y", "Z", "Z")
  )

recode_list <- 
  list(

    var1 = c(A = 1, B = 2, C = 3),
    var2 = c(X = 0, Y = -1, Z = 1)
  )

recode_list
#> $var1
#> A B C 
#> 1 2 3 
#> 
#> $var2
#>  X  Y  Z 
#>  0 -1  1

I am using the dplyr::recode function.


# recoding works fine when doing it one variable at a time
df %>% 
  mutate(
    var1 = recode(var1, !!!recode_list[["var1"]]),
    var2 = recode(var2, !!!recode_list[["var2"]])
  )
#> # A tibble: 4 x 2
#>    var1  var2
#>   <dbl> <dbl>
#> 1     1     0
#> 2     1    -1
#> 3     2     1
#> 4     3     1

When I try to apply a function that does this for all variables, it fails:

# this does not work.
df %>%
  mutate_at(vars(var1, var2), ~{

    var_name <- rlang::quo_name(quo(.))

    recode(., !!!recode_list[[var_name]])
  }
  )
#> Error in expr_interp(f): object 'var_name' not found

I also tried rlang::as_name and rlang::as_label, but I do not think I can capture the name of the variable as a string and use it to subset recode_list.


df %>%
  mutate_at(vars(var1, var2), ~ {
    var_name <- rlang::as_name(quo(.))
    print(var_name)
    #recode(., !!!recode_list[["var2"]])
  }
  )
#> [1] "."
#> [1] "."
#> # A tibble: 4 x 2
#>   var1  var2 
#>   <chr> <chr>
#> 1 .     .    
#> 2 .     .    
#> 3 .     .    
#> 4 .     .


Created on 2020-04-30 by the reprex package (v0.3.0)

Recommended answer

Does this work for you?

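library(dplyr)
library(rlang)
library(magrittr)  # provides %<>%, which dplyr does not re-export

df %>% 
  mutate_at(vars(var1, var2),
            .funs = function(x) {
              # as_label(enquo(x)) yields the name of the column passed in, e.g. "var1"
              recode_list %<>% .[[as_label(enquo(x))]]
              recode(x, !!!recode_list)
            })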
## A tibble: 4 x 2
#   var1  var2
#  <dbl> <dbl>
#1     1     0
#2     1    -1
#3     2     1
#4     3     1

I suspect this works, whereas placing the subsetted recode_list directly inside recode does not, because enquo delays evaluation of x until the assignment with %<>%. The !!! operator can then force evaluation after the list has already been subset correctly.
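As a minimal sketch of why the function(x) form can see the column name while quo(.) cannot: enquo captures the expression the caller supplied for an argument, whereas quo captures exactly what is typed inside it. Here arg_label is a hypothetical helper written for illustration, not an rlang function.

```r
library(rlang)

# Hypothetical helper: returns the label of the expression supplied as `x`
arg_label <- function(x) as_label(enquo(x))

var1 <- c("A", "B")

label_from_enquo <- arg_label(var1)   # label of the caller's expression
label_from_quo   <- as_label(quo(.))  # quo() captures the literal symbol `.`

label_from_enquo
#> [1] "var1"
label_from_quo
#> [1] "."
```

This matches the "." printed in the question: inside the formula, `.` is only a placeholder symbol, so quoting it never reveals which column is being processed.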

Edit

Your approach with rlang also works with some modifications:

library(rlang)
df %>%
  mutate_at(vars(var1, var2), function(x) {
    var_name <- rlang::as_label(substitute(x))
    recode(x, !!!recode_list[[var_name]])
  })
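On the question of whether there is a totally different way to do this, one option (a sketch, not part of the original answer) is to avoid capturing the column name at all and iterate over names(recode_list), so that each name is already an ordinary string. The second variant assumes dplyr 1.0.0 or later, where cur_column() reports the current column inside across(); the question itself uses dplyr 0.8.5, where that function is not available. The names recoded and recoded_across are illustrative only.

```r
library(dplyr)

df <- tibble(
  var1 = c("A", "A", "B", "C"),
  var2 = c("X", "Y", "Z", "Z")
)

recode_list <- list(
  var1 = c(A = 1, B = 2, C = 3),
  var2 = c(X = 0, Y = -1, Z = 1)
)

# Variant 1: loop over the scheme names; `nm` is a plain string,
# so it can be used directly to subset recode_list
recoded <- df
for (nm in names(recode_list)) {
  recoded[[nm]] <- recode(recoded[[nm]], !!!recode_list[[nm]])
}

# Variant 2 (assumes dplyr >= 1.0.0): look each value up by name in the
# scheme for the current column, instead of splicing with !!!
recoded_across <- df %>%
  mutate(across(all_of(names(recode_list)),
                ~ unname(recode_list[[cur_column()]][.x])))
```

The lookup in the second variant uses plain named-vector indexing rather than recode, which sidesteps the question of when !!! gets evaluated. A value with no entry in the scheme would come back as NA.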
