Generalize a for-loop for use in a custom function

Views: 95  Published: 2020/11/10 0:43:27  Tags: r for-loop purrr


Problem description

Using the for-loop below, I can create a list of all managers above a given employee (essentially a list of the employee's manager, that manager's manager, and so on).

library(dplyr)
library(tidyr)
library(purrr)

# Create test data 
ds <-
  tibble(
    emp_id = c("001", "002", "003", "004", "005"),
    mgr_id  = c("002", "004", "004", "005", NA)
  )

# Hardcoded for-loop example
mgr_ids_above <- vector("list", length = 5)
id <- "001"

for (i in seq_along(mgr_ids_above)) {
  # look up the manager of the current id
  mgr_ids_above[[i]] <- ds$mgr_id[ds$emp_id == id]

  # move one step up the chain
  id <- mgr_ids_above[[i]]
}

# drop NAs
mgr_ids_above <- unlist(mgr_ids_above)
mgr_ids_above <- mgr_ids_above[!is.na(mgr_ids_above)]

# return to list format
as.list(mgr_ids_above)
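
The loop above always runs five iterations and relies on dropping NAs afterwards. As a rough sketch of the same walk up the chain that stops as soon as it reaches an employee with no manager, one possible variant is shown here. It assumes a base data.frame standing in for the tibble so that it runs without extra packages, and `max_steps` is a hypothetical safety cap that is not part of the code above.

```r
# Same test data as above, as a base data.frame (assumption: stands in for the tibble)
ds <- data.frame(
  emp_id = c("001", "002", "003", "004", "005"),
  mgr_id = c("002", "004", "004", "005", NA),
  stringsAsFactors = FALSE
)

id <- "001"
max_steps <- 5                 # hypothetical cap, guards against cycles in the data
mgr_ids_above <- character(0)  # grows by one id per step up the chain

while (length(mgr_ids_above) < max_steps) {
  # match() returns NA when id is not found, so the lookup yields NA in that case
  mgr <- ds$mgr_id[match(id, ds$emp_id)]

  # stop at the top of the hierarchy (or at an unknown id)
  if (is.na(mgr)) break

  mgr_ids_above <- c(mgr_ids_above, mgr)
  id <- mgr
}

# return to list format, as in the original loop
as.list(mgr_ids_above)
```

Whether the early exit matters depends on the data; for a hierarchy that is only a few levels deep, the fixed-length loop and this variant should collect the same ids.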

My hope is to apply this for-loop to the entire data frame and save the results in a list-column. I can do this successfully by using pmap() to apply a function containing the hard-coded for-loop to my data frame, but when I try to write a generalized function, everything falls apart.

# Define custom function with hardcoded data and variable names
get_mgrs_above <- function(id, max_steps = 5){

  mgr_ids_above <- vector("list", length = max_steps)

  for (i in seq_along(mgr_ids_above)) {
    mgr_ids_above[[i]] <- ds$mgr_id[ds$emp_id == id]

    id <- mgr_ids_above[[i]]
  }

  # drop NAs
  mgr_ids_above <- unlist(mgr_ids_above)
  mgr_ids_above <- mgr_ids_above[!is.na(mgr_ids_above)]

  # return to list format
  as.list(mgr_ids_above)
}

# Apply custom function
ds_mgrs_above <-
  ds %>%
  mutate(
    ranks_above = pmap(
      list(id = emp_id),
      get_mgrs_above
    )
  )

The output of the above code is:

# A tibble: 5 x 3
  emp_id mgr_id ranks_above
  <chr>  <chr>  <list>
1 001    002    <list [3]>
2 002    004    <list [2]>
3 003    004    <list [2]>
4 004    005    <list [1]>
5 005    NA     <list [0]>

The contents of the ranks_above list-column look like this:

ds_mgrs_above$ranks_above[[1]]

[[1]]
[1] "002"

[[2]]
[1] "004"

[[3]]
[1] "005"

My failing function, with all data and variables supplied as arguments, fails with the message "Error in mutate_impl(.data, dots) : Evaluation error: Element 1 has length 2, not 1 or 5.":

get_mgrs_above <- function(
  data,
  id = emp_id,
  mgr_id = mgr_id,
  emp_id = emp_id,
  max_steps = 5){

  mgr_ids_above <- vector("list", length = max_steps)

  for (i in seq_along(mgr_ids_above)) {
    mgr_ids_above[[i]] <- data$mgr_id[data$emp_id == id]

    id <- mgr_ids_above[[i]]
  }

  # drop NAs
  mgr_ids_above <- unlist(mgr_ids_above)
  mgr_ids_above <- mgr_ids_above[!is.na(mgr_ids_above)]

  # return to list format
  as.list(mgr_ids_above)
}

ds %>%
  mutate(
    ranks_above = pmap(
      list(
        data = ds,
        id = emp_id,
        mgr_id = mgr_id,
        emp_id = emp_id,
        max_steps = 5
      ),
      get_mgrs_above
    )
  )
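
One way to read the error message, offered as an observation rather than a confirmed diagnosis: pmap() walks the elements of its list argument in parallel, and a data frame is itself a list of columns. The sketch below only inspects the lengths of the elements in a list shaped like the one passed to pmap() above. It uses a base data.frame in place of the tibble, and the name `pmap_args` is made up for illustration.

```r
# Base data.frame standing in for the tibble `ds` (assumption, for self-containment)
ds <- data.frame(
  emp_id = c("001", "002", "003", "004", "005"),
  mgr_id = c("002", "004", "004", "005", NA),
  stringsAsFactors = FALSE
)

# Same shape as the list handed to pmap() in the failing call
pmap_args <- list(
  data = ds,
  id = ds$emp_id,
  mgr_id = ds$mgr_id,
  emp_id = ds$emp_id,
  max_steps = 5
)

# Length of each element: a data frame counts its columns, not its rows
arg_lengths <- lengths(pmap_args)
print(arg_lengths)
```

Here `data` has length 2 (two columns), the id vectors have length 5, and `max_steps` has length 1, which lines up with "Element 1 has length 2, not 1 or 5".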

To avoid confusion, this is a post about how to write a generalizable function that will create a list-column from two columns. This is one component of a larger data munging attempt on a data frame with ~15k employees.
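
For illustration only, here is a minimal sketch of what such a generalizable function could look like. It is not taken from an accepted answer. It assumes the column names are passed as strings (`emp_col` and `mgr_col` are hypothetical parameter names), that the whole data frame is handed to the function as a single object rather than iterated over, and that map() over the employee ids is an acceptable substitute for pmap().

```r
library(dplyr)
library(purrr)

ds <- tibble(
  emp_id = c("001", "002", "003", "004", "005"),
  mgr_id = c("002", "004", "004", "005", NA)
)

# Hypothetical generalized version: data and column names are arguments
get_mgrs_above <- function(id, data, emp_col = "emp_id", mgr_col = "mgr_id",
                           max_steps = 5) {
  mgr_ids_above <- character(0)

  for (i in seq_len(max_steps)) {
    # manager of the current id; NA if the id is missing or has no manager
    mgr <- data[[mgr_col]][match(id, data[[emp_col]])]
    if (is.na(mgr)) break

    mgr_ids_above <- c(mgr_ids_above, mgr)
    id <- mgr
  }

  # return to list format
  as.list(mgr_ids_above)
}

# Iterate over the ids only; the full data frame is passed inside the closure
ds_mgrs_above <-
  ds %>%
  mutate(
    ranks_above = map(emp_id, function(id) get_mgrs_above(id, data = ds))
  )
```

Using `[[` with string column names avoids the clash between function arguments named `mgr_id`/`emp_id` and the `data$mgr_id`/`data$emp_id` lookups, which always refer to the literally named columns. On a data frame of around 15k rows the repeated match() calls may or may not be fast enough; that has not been measured here.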
