How to feed a list of unquoted column names into `lapply` (so that I can use it with a `dplyr` function)


Question


I am trying to write a function in tidyverse/dplyr that I want to eventually use with lapply (or map). (I had been working on it to answer this question, but came upon an interesting result/dead-end. Please don't mark this as a duplicate - this question is an extension/departure from the answers that you see there.)


Is there
1) a way to get a list of quoted variables to work inside a dplyr function
(and not use the deprecated SE_ functions) or is there
2) some way to feed a list of unquoted strings through an lapply or map


I have used the "Programming with dplyr" vignette to construct what I believe is a function most in line with the current standard for working with NSE.

sample_data <- 
    read.table(text = "REVENUEID AMOUNT  YEAR REPORT_CODE PAYMENT_METHOD INBOUND_CHANNEL  AMOUNT_CAT
               1 rev-24985629     30  FY18           S          Check            Mail     25,50
               2 rev-22812413      1  FY16           Q          Other      Canvassing   0.01,10
               3 rev-23508794    100  FY17           Q    Credit_card             Web   100,250
               4 rev-23506121    300  FY17           S    Credit_card            Mail   250,500
               5 rev-23550444    100  FY17           S    Credit_card             Web   100,250
               6 rev-21508672     25  FY14           J          Check            Mail     25,50
               7 rev-24981769    500  FY18           S    Credit_card             Web 500,1e+03
               8 rev-23503684     50  FY17           R          Check            Mail     50,75
               9 rev-24982087     25  FY18           R          Check            Mail     25,50
               10 rev-24979834     50  FY18           R    Credit_card             Web    50,75
                      ", header = TRUE, stringsAsFactors = FALSE)


The report-generating function:


library(dplyr)

report <- function(report_cat){
    # capture the unquoted column name as a quosure
    report_cat <- enquo(report_cat)
    sample_data %>%
        group_by(!!report_cat, YEAR) %>%
        summarize(num = n(), total = sum(AMOUNT)) %>%
        rename(REPORT_VALUE = !!report_cat) %>%
        mutate(REPORT_CATEGORY := as.character(quote(!!report_cat))[2])
}

This works fine for generating a single report:


> report(REPORT_CODE)
# A tibble: 7 x 5
# Groups:   REPORT_VALUE [4]
  REPORT_VALUE  YEAR   num total REPORT_CATEGORY
         <chr> <chr> <int> <int>           <chr>
1            J  FY14     1    25     REPORT_CODE
2            Q  FY16     1     1     REPORT_CODE
3            Q  FY17     1   100     REPORT_CODE
4            R  FY17     1    50     REPORT_CODE
5            R  FY18     2    75     REPORT_CODE
6            S  FY17     2   400     REPORT_CODE
7            S  FY18     2   530     REPORT_CODE


It is when I try to set up a list of all 4 reports to generate that everything breaks down. (Though admittedly, the code required in that last line of the function, which returns a string with which to fill the column, should be clue enough that I have wandered off in the wrong direction.)

#the other reports
cat.list <- c("REPORT_CODE","PAYMENT_METHOD","INBOUND_CHANNEL","AMOUNT_CAT")

# Applying and Mapping attempts 
lapply(cat.list, report)
map_df(cat.list, report)

This results in:


> lapply(cat.list, report)  
 Error in (function (x, strict = TRUE)  : 
  the argument has already been evaluated  

> map_df(cat.list, report)
 Error in (function (x, strict = TRUE)  : 
  the argument has already been evaluated



I have also tried to convert the list of strings to names before handing it over to apply and map:

library(rlang)
cat.names <- lapply(cat.list, sym)
lapply(cat.names, report)
map_df(cat.names, report)



> lapply(cat.names, report)
 Error in (function (x, strict = TRUE)  : 
  the argument has already been evaluated 
> map_df(cat.names, report)
 Error in (function (x, strict = TRUE)  : 
  the argument has already been evaluated


In any case, the reason I am asking this question is that I think I have written the function to the currently documented standards, but I can see no way to use a member of the apply family, or even of the purrr::map family, with such a function. Short of rewriting the function to use names, as useR has done here https://stackoverflow.com/a/47316151/5088194, is there a way to get this function to work with apply or map?


I am hoping to see this as a result:


# A tibble: 27 x 5
# Groups:   REPORT_VALUE [16]
   REPORT_VALUE  YEAR   num total REPORT_CATEGORY
          <chr> <chr> <int> <int>           <chr>
 1            J  FY14     1    25     REPORT_CODE
 2            Q  FY16     1     1     REPORT_CODE
 3            Q  FY17     1   100     REPORT_CODE
 4            R  FY17     1    50     REPORT_CODE
 5            R  FY18     2    75     REPORT_CODE
 6            S  FY17     2   400     REPORT_CODE
 7            S  FY18     2   530     REPORT_CODE
 8        Check  FY14     1    25  PAYMENT_METHOD
 9        Check  FY17     1    50  PAYMENT_METHOD
10        Check  FY18     2    55  PAYMENT_METHOD
# ... with 17 more rows



Recommended answer


as.name will convert a string to a name and that can be passed to report:

lapply(cat.list, function(x) do.call("report", list(as.name(x))))
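Why building the call helps can be seen without dplyr. The following is a minimal base R sketch, assuming a hypothetical helper show_arg that captures its argument unevaluated with substitute. It is only a loose stand-in for what enquo does, not the rlang implementation. Called directly by lapply, the function sees lapply's internal placeholder expression rather than a column name, whereas do.call with as.name constructs the call show_arg(REPORT_CODE) before it runs.

```r
# Hypothetical stand-in for an NSE function: returns the expression
# supplied as its argument, as text, without evaluating it.
show_arg <- function(x) deparse(substitute(x))

cols <- c("REPORT_CODE", "PAYMENT_METHOD")

# Called directly by lapply, the function captures lapply's own
# placeholder expression, not the column name.
direct <- unlist(lapply(cols, show_arg))

# do.call() with as.name() constructs the call show_arg(REPORT_CODE),
# so the captured expression is the column name itself.
via_call <- unlist(lapply(cols, function(x) do.call("show_arg", list(as.name(x)))))

print(direct)
print(via_call)
```

Only the second form hands the function an expression that names a column, which is what a function built on enquo needs.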

character argument

An alternative is to rewrite report so that it accepts a character string argument:

report_ch <- function(colname) {  
    report_cat <- rlang::sym(colname)   # as.name(colname) would also work here
    sample_data %>%
                group_by(!!report_cat, YEAR) %>%
                summarize(num = n(), total = sum(AMOUNT)) %>% 
                rename(REPORT_VALUE = !!report_cat) %>% 
                mutate(REPORT_CATEGORY = colname)
}

lapply(cat.list, report_ch)
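With more recent dplyr versions (1.0 or later is assumed here), the same character-argument idea can probably also be written with the .data pronoun instead of sym and !!. The sketch below is not part of the original answer: toy is a small made-up data frame, and report_data is a hypothetical name chosen for illustration.

```r
suppressPackageStartupMessages(library(dplyr))

# Small made-up data set, for illustration only
toy <- data.frame(
  REPORT_CODE    = c("S", "Q", "Q", "S"),
  PAYMENT_METHOD = c("Check", "Other", "Check", "Check"),
  YEAR           = c("FY18", "FY16", "FY16", "FY18"),
  AMOUNT         = c(30, 1, 100, 300),
  stringsAsFactors = FALSE
)

# Hypothetical variant of report_ch: the column is looked up by its
# string name through the .data pronoun and renamed while grouping.
report_data <- function(colname, data = toy) {
  data %>%
    group_by(REPORT_VALUE = .data[[colname]], YEAR) %>%
    summarize(num = n(), total = sum(AMOUNT), .groups = "drop") %>%
    mutate(REPORT_CATEGORY = colname)
}

# One report per column name, stacked into a single tibble
all_reports <- bind_rows(lapply(c("REPORT_CODE", "PAYMENT_METHOD"), report_data))
print(all_reports)
```

Naming the grouping column inside group_by avoids the separate rename step, and because colname is an ordinary string it can be reused directly for REPORT_CATEGORY.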

wrapr

An alternate approach is to rewrite report using the wrapr package, which is an alternative to rlang/tidyeval:

library(dplyr)
library(wrapr)

report_wrapr <- function(colname) 
  let(c(COLNAME = colname),
      sample_data %>%
                  group_by(COLNAME, YEAR) %>%
                  summarize(num = n(), total = sum(AMOUNT)) %>%
                  rename(REPORT_VALUE = COLNAME) %>%
                  mutate(REPORT_CATEGORY = colname)
   )

lapply(cat.list, report_wrapr)


Of course, this whole problem would go away if you used a different framework, e.g.

plyr

library(plyr)

report_plyr <- function(colname)
  ddply(sample_data, c(REPORT_VALUE = colname, "YEAR"), function(x)
     data.frame(num = nrow(x), total = sum(x$AMOUNT), REPORT_CATEGORY = colname))

lapply(cat.list, report_plyr)

sqldf

library(sqldf)

report_sql <- function(colname, envir = parent.frame(), ...)
  fn$sqldf("select [$colname] REPORT_VALUE,
                   YEAR,
                   count(*) num,
                   sum(AMOUNT) total,
                   '$colname' REPORT_CATEGORY
            from sample_data
            group by [$colname], YEAR", envir = envir, ...)

lapply(cat.list, report_sql)              

base R, using by

report_base_by <- function(colname)
      do.call("rbind", 
        by(sample_data, sample_data[c(colname, "YEAR")], function(x)
            data.frame(REPORT_VALUE = x[1, colname], 
                       YEAR = x$YEAR[1], 
                       num = nrow(x), 
                       total = sum(x$AMOUNT), 
                       REPORT_CATEGORY = colname)
         )
      )

lapply(cat.list, report_base_by)
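Each lapply call returns a list with one data frame per column name, while the question asks for a single stacked table. A minimal base R sketch of that last step follows, assuming a small made-up data frame toy and a hypothetical helper report_toy that stands in for any of the character-argument functions:

```r
# Small made-up data set, for illustration only
toy <- data.frame(
  REPORT_CODE    = c("S", "Q", "Q", "S"),
  PAYMENT_METHOD = c("Check", "Other", "Check", "Check"),
  YEAR           = c("FY18", "FY16", "FY16", "FY18"),
  AMOUNT         = c(30, 1, 100, 300),
  stringsAsFactors = FALSE
)

# Hypothetical character-argument report: one summary row per
# combination of the chosen column and YEAR that occurs in the data.
report_toy <- function(colname, data = toy) {
  groups <- split(data, list(data[[colname]], data$YEAR), drop = TRUE)
  rows <- lapply(groups, function(x)
    data.frame(REPORT_VALUE = x[[colname]][1],
               YEAR = x$YEAR[1],
               num = nrow(x),
               total = sum(x$AMOUNT),
               REPORT_CATEGORY = colname,
               stringsAsFactors = FALSE))
  do.call("rbind", rows)
}

# Stack the per-column reports into one data frame
combined <- do.call("rbind", lapply(c("REPORT_CODE", "PAYMENT_METHOD"), report_toy))
rownames(combined) <- NULL
print(combined)
```

The same do.call("rbind", ...) step should apply to the list produced by any of the base, plyr, or sqldf variants; with dplyr loaded, bind_rows serves the same purpose.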

data.table

The data.table package provides another alternative, but that has already been covered by another answer.


Update: Added additional alternatives.
