Dynamic select expression in function


Problem description

I am trying to write a function that will convert this data frame:

library(dplyr)
library(rlang)
library(purrr)

df <- data.frame(obj=c(1,1,2,2,3,3,3,4,4,4),
                 S1=rep(c("a","b"),length.out=10),PR1=rep(c(3,7),length.out=10),
                 S2=rep(c("c","d"),length.out=10),PR2=rep(c(7,3),length.out=10))

   obj S1 PR1 S2 PR2
1    1  a   3  c   7
2    1  b   7  d   3
3    2  a   3  c   7
4    2  b   7  d   3
5    3  a   3  c   7
6    3  b   7  d   3
7    3  a   3  c   7
8    4  b   7  d   3
9    4  a   3  c   7
10   4  b   7  d   3

into this data frame:

df %>% {bind_rows(select(., obj, S = S1, PR = PR1),
              select(., obj, S = S2, PR = PR2))}
   obj S PR
1    1 a  3
2    1 b  7
3    2 a  3
4    2 b  7
5    3 a  3
6    3 b  7
7    3 a  3
8    4 b  7
9    4 a  3
10   4 b  7
11   1 c  7
12   1 d  3
13   2 c  7
14   2 d  3
15   3 c  7
16   3 d  3
17   3 c  7
18   4 d  3
19   4 c  7
20   4 d  3

But I would like the function to work with any number of columns, so that it also handles S1, S2, S3, S4, or an additional category such as DS1, DS2. Ideally the function would take as arguments the patterns that determine which columns are stacked on top of each other, the number of sets of each column, the names of the output columns, and the names of any variables that should also be kept.

Here is my attempt at this function:

stack_col <- function(df, patterns, nums, cnames, keep){
  keep <- enquo(keep)
  build_exp <- function(x){
   paste0("!!sym(cnames[[", x, "]]) := paste0(patterns[[", x, "]],num)") %>% 
      parse_expr()
  }
  exps <- map(1:length(patterns), ~expr(!!build_exp(.)))

  sel_fun <- function(num){
    df %>% select(!!keep, 
                  !!!exps)
  }
  map(nums, sel_fun) %>% bind_rows()
}

I can get the sel_fun part to work for a fixed number of patterns, like this:

patterns <- c("S", "PR")
cnames <- c("Species", "PR")
keep <- quo(obj)
sel_fun <- function(num){
  df %>% select(!!keep,
                !!sym(cnames[[1]]) := paste0(patterns[[1]], num),
                !!sym(cnames[[2]]) := paste0(patterns[[2]], num))
}
sel_fun(1)

But the dynamic version that I have tried does not work and gives this error:

Error: `:=` can only be used within a quasiquoted argument

Recommended answer

Here is a function that produces the expected output. It loops over the 'patterns' and the corresponding new column names ('cnames') with map2. For each pair it selects the kept columns plus the columns matching the pattern, reshapes them into 'long' format with gather, drops the key column, and renames the 'val' column to the name passed into the function. The pieces are then bound column-wise (bind_cols), and the columns of interest are selected.

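stack_col <- function(dat, pat, cname, keep) {
    # for each pattern / output-name pair, build one stacked column
    purrr::map2(pat, cname, ~ 
                    dat %>%
                       dplyr::select(keep, matches(.x)) %>%      # kept columns plus the matching set
                       tidyr::gather(key, val, matches(.x)) %>%  # stack the matching columns
                       dplyr::select(-key) %>%                   # the key (S1, S2, ...) is not needed
                       dplyr::rename(!! .y := val)) %>%          # name the stacked column
       dplyr::bind_cols(.) %>%
       # note: recent dplyr versions may repair the duplicated kept-column
       # names produced by bind_cols() (e.g. obj...1)
       dplyr::select(keep, cname) 
}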

stack_col(df, patterns, cnames, 1)
#    obj Species PR
#1    1       a  3
#2    1       b  7
#3    2       a  3
#4    2       b  7
#5    3       a  3
#6    3       b  7
#7    3       a  3
#8    4       b  7
#9    4       a  3
#10   4       b  7
#11   1       c  7
#12   1       d  3
#13   2       c  7
#14   2       d  3
#15   3       c  7
#16   3       d  3
#17   3       c  7
#18   4       d  3
#19   4       c  7
#20   4       d  3
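
For readers who would rather keep the shape of the original attempt (one select per set number, then bind_rows), the following is a minimal sketch that avoids building `:=` expressions altogether. It is not the answerer's method. It assumes a tidyselect version in which all_of() accepts a named character vector and renames while selecting; the function name stack_col_sets is hypothetical, and the arguments mirror those in the question.

```r
library(dplyr)
library(purrr)
library(rlang)

df <- data.frame(obj = c(1, 1, 2, 2, 3, 3, 3, 4, 4, 4),
                 S1 = rep(c("a", "b"), length.out = 10), PR1 = rep(c(3, 7), length.out = 10),
                 S2 = rep(c("c", "d"), length.out = 10), PR2 = rep(c(7, 3), length.out = 10))

# hypothetical name, chosen to avoid clashing with stack_col above
stack_col_sets <- function(df, patterns, nums, cnames, keep) {
  keep <- enquo(keep)  # capture the unquoted 'keep' column(s)
  sel_fun <- function(num) {
    # names are the output columns, values are the existing columns,
    # e.g. c(Species = "S1", PR = "PR1") when num is 1
    cols <- setNames(paste0(patterns, num), cnames)
    select(df, !!keep, all_of(cols))
  }
  # one selection per set number, stacked on top of each other
  map(nums, sel_fun) %>% bind_rows()
}

out <- stack_col_sets(df, c("S", "PR"), 1:2, c("Species", "PR"), obj)
```

Because the old-to-new mapping is ordinary data (a named character vector), it extends to any number of patterns without parsing or splicing expressions.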


Alternatively, data.table::melt can be used:

library(data.table)
melt(setDT(df), measure = patterns("^S\\d+", "^PR\\d+"), 
          value.name = c("Species", "PR"))[, variable := NULL][]
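
A comparable reshape can be sketched with tidyr, assuming tidyr 1.0.0 or later, where pivot_longer() and its ".value" sentinel are available. The regular expression assumes column names made of letters followed by a set number, as in the example data; the intermediate column name "set" is an illustrative choice.

```r
library(dplyr)
library(tidyr)

df <- data.frame(obj = c(1, 1, 2, 2, 3, 3, 3, 4, 4, 4),
                 S1 = rep(c("a", "b"), length.out = 10), PR1 = rep(c(3, 7), length.out = 10),
                 S2 = rep(c("c", "d"), length.out = 10), PR2 = rep(c(7, 3), length.out = 10))

out_long <- df %>%
  # split each name into a prefix (becomes an output column via ".value")
  # and a set number (stored in the helper column "set")
  pivot_longer(cols = -obj,
               names_to = c(".value", "set"),
               names_pattern = "^([A-Za-z]+)(\\d+)$") %>%
  arrange(set) %>%                 # group the rows by set number
  select(obj, Species = S, PR)     # drop "set" and rename S
```

An extra category such as DS1, DS2 would simply become another output column, since the prefix is taken from the column names rather than listed by hand.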
