dplyr通过评估查找单元格值来突变特定的列 [英] dplyr mutate specific columns by evaluating lookup cell value

查看：53 发布时间：2020/10/26 3:50:40 r dplyr eval

本文介绍了dplyr通过评估查找单元格值来突变特定的列的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

我已经使用等价物，符号和评估法探索了各种选项，但似乎无法获得正确的语法。这是一个示例数据框。

I have explored various options using quosures, symbols, and evaluation, but I can't seem to get the right syntax. Here is an example dataframe.

data.frame("A" = letters[1:4], "B" = letters[26:23], "C" = letters[c(1,3,5,7)], "D" = letters[c(2,4,6,8)], "pastecols" = c("B, C","B, D", "B, C, D", NA))
  A B C D pastecols
1 a z a b      B, C
2 b y c d      B, D
3 c x e f   B, C, D
4 d w g h      <NA>

现在假设我想基于pastecols中的查找字符串粘贴来自不同列的值，而我总是想要包括A列。这是我想要的结果：

Now suppose I want to paste values from different columns based on the lookup string in pastecols, and I always want to include column A. This is my desired result:

  A B C D pastecols  result
1 a z a b      B, C   a z a
2 b y c d      B, D   b y d
3 c x e f   B, C, D c x e f
4 d w g h      <NA>       d

理想情况下，这可以在dplyr中完成。这是我得到的最接近的值：

Ideally this could be done in dplyr. This is the closest I have gotten:

x %>% mutate(result = lapply(lapply(str_split(pastecols, ", "), c, "A"), na.omit))
  A B C D pastecols     result
1 a z a b      B, C    B, C, A
2 b y c d      B, D    B, D, A
3 c x e f   B, C, D B, C, D, A
4 d w g h      <NA>          A

推荐答案

这是使用的一种方法pmap 做类似的事情。 pmap 可用于通过捕获每一行作为命名矢量来有效地逐行处理数据帧；然后，您可以使用 [ pastecols] 选择要获取的列名称，以使其作为 cols 进行索引。

Here's one way using pmap to do a similar thing. pmap can be used to effectively work on dataframes by row by capturing each row as a named vector; you can then get the desired column names for indexing as cols by selecting them with ["pastecols"].

大多数匿名函数语法不是 tidyverse 东西，而是基本的R东西。要遍历它：

Most of the anonymous function syntax is not tidyverse stuff, but just basic R stuff. To walk through it:

将数据框作为列表传递到 .l pmap_chr 的参数。请记住，数据框是列的列表！

使用 c（。捕获所有 ... 参数。。）。基本上，我们将数据框的每一行称为函数的参数；现在 row 是包含行的命名向量。请注意，如果您有列表列，则该列会中断，（但是这里还有很多其他事情，所以我认为这里没有...）

我们可以获取值我们想要从 row [ pastecols] 中获得的 row ，但是我们需要转（说） B，C 放入 c（ A， B， C）即可。下一行仅添加 A ，将丢失的值替换为 A ，如果有则拆分为几部分任何，然后再向下索引到列表中。 [[部分就是在管道链中执行 list [[1]] 的方式，这是前缀之所以需要这种形式，是因为 str_split 返回一个列表，而我们只需要向量。

使用此 cols 向量从 row 行中获取所需的值并将其返回，折叠成长度为1个字符的向量！

Pass the dataframe as the list to the .l argument of pmap_chr. Remember that dataframes are lists of columns!
Capture all the ... arguments with c(...). Basically we are calling each row of the dataframe as arguments to the function; now row is a named vector containing the row. Note that if you have list-columns this will break, (but so will a lot of other things here so I assume there aren't any...)
We can get the values of row that we want from row["pastecols"], but we need to turn (say) "B, C" into c("A", "B", "C") to do that. This next line just adds the "A", replaces missing values with "A", splits into pieces if there are any, and then indexes back down into the list. The [[ part is just how you do list[[1]]" in a pipe chain, it's the prefix form of the operator. You need this because str_split returns a list and we just want the vector.
Use this cols vector to get the desired values from row and return it, collapsed into a length 1 character vector!

library(tidyverse)
tbl <- tibble("A" = letters[1:4], "B" = letters[26:23], "C" = letters[c(1,3,5,7)], "D" = letters[c(2,4,6,8)], "pastecols" = c("B, C","B, D", "B, C, D", NA))

tbl %>%
  mutate(result = pmap_chr(
    .l = .,
    .f = function(...){
      row <-  c(...)
      cols <- row["pastecols"] %>% str_c("A, ", .) %>% replace_na("A") %>% str_split(", ") %>% `[[`(1)
      vals <- row[cols] %>% str_c(collapse = ", ")
      return(vals)
    }
  ))
#> # A tibble: 4 x 6
#>   A     B     C     D     pastecols result    
#>   <chr> <chr> <chr> <chr> <chr>     <chr>     
#> 1 a     z     a     b     B, C      a, z, a   
#> 2 b     y     c     d     B, D      b, y, d   
#> 3 c     x     e     f     B, C, D   c, x, e, f
#> 4 d     w     g     h     <NA>      d

由 reprex软件包（v0.2.0）。

这篇关于dplyr通过评估查找单元格值来突变特定的列的文章就介绍到这了，希望我们推荐的答案对大家有所帮助，也希望大家多多支持IT屋！

查看全文

dplyr通过评估查找单元格值来突变特定的列 [英] dplyr mutate specific columns by evaluating lookup cell value

问题描述

推荐答案

相关文章

其他开发最新文章

热门教程

热门工具

登录关闭

dplyr通过评估查找单元格值来突变特定的列 [英] dplyr mutate specific columns by evaluating lookup cell value

问题描述

推荐答案

相关文章

其他开发最新文章

热门教程

热门工具

登录 关闭

登录关闭