Forwarding arguments in a function with purrr::map_df


Question

I am trying to write a function that reads all sheets of an Excel workbook with readxl::read_excel, binds them into a single data frame, and lets me pass additional arguments through to read_excel. The first part works, but the second does not.

library(magrittr)

# example excel workbook with multiple sheets
path <- readxl::readxl_example("datasets.xlsx")

# function with simple forwarding
read_all <- function(path, ...) {

  path %>%
    readxl::excel_sheets() %>%
    rlang::set_names() %>%
    purrr::map_df(~ readxl::read_excel(path = path, sheet = .x, ...))

}

# errors with and without additional arguments
read_all(path)
read_all(path, skip = 5)

I expected to get back a single data frame; instead I get an error:

Error: Can't guess format of this cell reference: iris
In addition: Warning message: Cell reference follows neither the A1 nor R1C1 format. Example: iris NAs generated.

Without argument forwarding, the function works fine:

# Function works without passing extra params
read_all_0 <- function(path) {

  path %>%
    readxl::excel_sheets() %>%
    rlang::set_names() %>%
    purrr::map_df(~ readxl::read_excel(path = path, sheet = .x))

}

read_all_0(path)

Argument forwarding also works fine in a simple function that does not use purrr::map_df:

read_test <- function(path, ...) {

  path %>% readxl::read_excel(...)
}
read_test(path, skip = 10)

Recommended answer

One possible solution is to define a named function that takes a single argument and pass it to map, so that its only argument is the element of the vector or list you are looping over.

Applied to your problem, the solution looks like this:

# function with forwarding
read_all <- function(path, ...) {

  # inner helper: path and ... come from the enclosing call, so only sheet varies
  read_xl <- function(sheet) {
    readxl::read_excel(path = path, sheet = sheet, ...)
  }

  path %>%
    readxl::excel_sheets() %>%
    rlang::set_names() %>%
    purrr::map_df(read_xl)

}

# this allows you to pass along arguments in the ellipsis correctly
read_all(path)
read_all(path, col_names = FALSE)

This problem seems to stem from improper environment handling in purrr::as_mapper. To work around it, I suggested in the comments using an anonymous function instead; the approach below works as well.

read_all <- function(path, ...) {

  path %>%
    readxl::excel_sheets() %>%
    rlang::set_names() %>%
    purrr::map_df(function(x) {
                      readxl::read_excel(path = path, sheet = x, ...)
                   })

}
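
A further variation, offered as a sketch rather than as the recommended idiom: skip the formula altogether and hand readxl::read_excel itself to map_df. Arguments placed after the function in map_df are passed on to every call, so naming path leaves each sheet name to fill the next free argument, sheet. The name read_all_dots is made up for this example, and n_max is used only to keep the result small.

```r
library(magrittr)

# Hypothetical variant: forward the extra arguments through map_df's own `...`
read_all_dots <- function(path, ...) {
  path %>%
    readxl::excel_sheets() %>%
    rlang::set_names() %>%
    # each call becomes read_excel(<sheet name>, path = path, ...)
    purrr::map_df(readxl::read_excel, path = path, ...)
}

path <- readxl::readxl_example("datasets.xlsx")
all_sheets <- read_all_dots(path, n_max = 3)
```

As far as I know, recent purrr documentation tends to prefer an explicit anonymous function over this style, and this version breaks if the caller also passes sheet in the ellipsis.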

To verify that as_mapper really is what causes the problem, we can rewrite the named inner function from above using as_mapper. This again produces errors, both with and without additional arguments in the ellipsis.

# function with forwarding
read_all <- function(path, ...) {

  # named mapper function
  read_xl <- purrr::as_mapper(~ readxl::read_excel(path = path, sheet = .x, ...))

  path %>%
    readxl::excel_sheets() %>%
    rlang::set_names() %>%
    purrr::map_df(read_xl)

}

Update: Knowing that as_mapper causes the issue lets us dig deeper. We can now use the RStudio debugger to inspect what happens under the hood when running a simple mapper version of read_excel:

read_xl <- purrr::as_mapper(~ readxl::read_excel(path = .x, sheet = .y, ...))
debugonce(read_xl) 
read_xl(path, 1)
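
As a complement to stepping through the debugger, the arguments of the generated function can be listed directly with base R's formals(). This is only a small sketch, and mapper is just an illustrative name.

```r
# Build a mapper from a formula and list the arguments of the resulting function
mapper <- purrr::as_mapper(~ paste0(.x, ...))
names(formals(mapper))
```

If the ellipsis is listed alongside .x, then everything passed to the mapper, including the first argument, is also available as ... inside the formula.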

It seems that when the ellipsis appears inside the mapper function, as_mapper maps the first argument not only to .x but also to the ellipsis (...). We can verify this with a simple mapper function, paster, whose body uses both .x and the ellipsis:

paster <- purrr::as_mapper(~ paste0(.x, ...))
paster(1)
#> [1] "11"
paster(2)
#> [1] "22"

The question now is: is there another way we are supposed to use the ellipsis in mapper functions, or is this a bug?
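
One way to reason about this behaviour without purrr is to write the converted function by hand. The sketch below assumes the formula is turned into a function whose first formal is ... and whose .x defaults to ..1, which matches the paster output above. lambda_like, reader_like and fake_read are hypothetical stand-ins, and fake_read only mimics the first three arguments of read_excel.

```r
# Hand-written stand-in for the function built from ~ paste0(.x, ...)
lambda_like <- function(..., .x = ..1) {
  paste0(.x, ...)  # the first argument arrives twice: as .x and inside ...
}
lambda_like(1)
#> [1] "11"

# Stand-in with the same leading arguments as read_excel: path, sheet, range
fake_read <- function(path, sheet = NULL, range = NULL) {
  list(path = path, sheet = sheet, range = range)
}

# Stand-in for ~ fake_read(path = "book.xlsx", sheet = .x, ...)
reader_like <- function(..., .x = ..1) {
  fake_read(path = "book.xlsx", sheet = .x, ...)
}
reader_like("iris")$range
#> [1] "iris"
```

Under that assumption the sheet name is also passed positionally and ends up in range, which would be consistent with the error about the cell reference iris.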
