Utilizing functions within across() in dplyr to work with paired columns

Views: 58 · Published: 2021/5/2 20:52:32 · Tags: r, function, dplyr, across

Question

set.seed(3)
library(dplyr)
x <- tibble(Measure = c("Height","Weight","Width","Length"),
        AD1_1= rpois(4,10),
        AD1_2= rpois(4,9),
        AD2_1= rpois(4,10),
        AD2_2= rpois(4,9),
        AD3_1= rpois(4,10),
        AD3_2= rpois(4,9))

Suppose I have data that looks like this. I want to run a function on each AD pair, that is, on the two columns that share an AD prefix and differ only in the number after the underscore, producing AD1fun, AD2fun, AD3fun.

Instead of writing each pair out by hand,

fun <- function(x,y){x-y}
x %>%
mutate(AD1fun = fun(AD1_1,AD1_2),
       AD2fun = fun(AD2_1,AD2_2),
...)

an earlier question on finding the difference between paired columns with dplyr shows that

x_minus <- x %>%
  mutate(fun(across(ends_with("_1"), .names = "{col}_minus"), across(ends_with("_2")))) %>%
  rename_with(~ sub("_\\d+", "", .), ends_with("_minus"))

can be used to produce

# A tibble: 4 x 10
  Measure AD1_1 AD1_2 AD2_1 AD2_2 AD3_1 AD3_2 AD1_minus AD2_minus AD3_minus
  <chr>   <int> <int> <int> <int> <int> <int>     <int>     <int>     <int>
1 Height      6    10    10     3    12     8        -4         7         4
2 Weight      8     9    13     6    14     7        -1         7         7
3 Width      10     9    11     5    12     8         1         6         4
4 Length      8     9     8     7     8    13        -1         1        -5

However, if fun is something other than a simple arithmetic operator, for example,

fun <- function(x,y){
  case <- case_when(
    x == y ~ "Agree",
    x == 0 & y != 0 ~ "Disagreement",
    x != 0 & y == 0 ~ "Disagreement",
    x-y <= 1 & x-y >= -1 ~ "Agree",
    TRUE ~ "Disagree"
  )
  return(case)
}

x_case <- x %>%
  mutate(fun(across(ends_with("_1"), .names = "{col}_case"), across(ends_with("_2")))) %>%
  rename_with(~ sub("_\\d+", "", .), ends_with("_case"))

it produces an error because, to quote,

"This procedure essentially means that you compare two datasets: one with variables ending with _1 and one with _2. It is, thus, the same as dat %>% select(ends_with("_1")) - dat %>% select(ends_with("_2")). And as these are lists, you cannot compare them that way."

If so, what can be done to apply a function like this using across()?
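The quoted explanation can be checked in base R. The sketch below uses small hypothetical data frames (first and second are illustrative names, not part of the question) standing in for the "_1" and "_2" halves of x. It shows that subtraction is defined for whole data frames, while a comparison returns a logical matrix rather than a vector, which is presumably why a function built on vector conditions does not carry over directly.

```r
# Hypothetical toy data standing in for the "_1" and "_2" columns of x
first  <- data.frame(AD1_1 = c(6, 8),  AD2_1 = c(10, 13))
second <- data.frame(AD1_2 = c(10, 9), AD2_2 = c(3, 6))

# Arithmetic works on whole data frames; names come from the left operand
diff_df <- first - second
is.data.frame(diff_df)

# A comparison between data frames yields a logical matrix, not a vector
eq <- first == second
is.matrix(eq)

# A data frame is a list of columns
is.list(first)
```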

Recommended answer

We could loop across the columns whose names end with "_1", use cur_column() to get the current column name, replace the "_1" suffix with "_2", get() the column with that name, and pass it to fun as the second argument alongside the current column.

library(dplyr)
library(stringr)
x %>% 
   mutate(across(ends_with("_1"), ~
     fun(., get(str_replace(cur_column(), "_1$", "_2"))), .names = "{.col}_case"))

-output

# A tibble: 4 x 10
#  Measure AD1_1 AD1_2 AD2_1 AD2_2 AD3_1 AD3_2 AD1_1_case AD2_1_case AD3_1_case
#  <chr>   <int> <int> <int> <int> <int> <int> <chr>      <chr>      <chr>     
#1 Height      6    10    10     3    12     8 Disagree   Disagree   Disagree  
#2 Weight      8     9    13     6    14     7 Agree      Disagree   Disagree  
#3 Width      10     9    11     5    12     8 Agree      Disagree   Disagree  
#4 Length      8     9     8     7     8    13 Agree      Agree      Disagree
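The new columns above keep the "_1" in their names (AD1_1_case and so on). If names like AD1_case are preferred, one possible follow-up is the rename_with() step already used in the question. The sketch below assumes a small hand-typed tibble and a hypothetical stand-in function pair_diff() in place of fun; it is an illustration, not part of the original answer.

```r
library(dplyr)

# Hypothetical toy data with two AD pairs
x_small <- tibble(AD1_1 = c(6L, 8L),   AD1_2 = c(10L, 9L),
                  AD2_1 = c(10L, 13L), AD2_2 = c(3L, 6L))

# Hypothetical stand-in for fun: any function of two vectors would do
pair_diff <- function(a, b) a - b

res <- x_small %>%
  # For each "_1" column, look up its "_2" partner by name
  mutate(across(ends_with("_1"),
                ~ pair_diff(.x, get(sub("_1$", "_2", cur_column()))),
                .names = "{.col}_case")) %>%
  # Drop the "_1" from the new names: AD1_1_case becomes AD1_case
  rename_with(~ sub("_1_case$", "_case", .x), ends_with("_case"))

names(res)
```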

Or another option is split.default/map. Here, we split the dataset into a list of data.frames, each holding the columns that share the same name prefix, then apply fun to each list element with map/reduce and bind the output back to the original dataset with bind_cols

library(purrr)
x %>% 
  select(-Measure) %>% 
  split.default(str_remove(names(.), "_\\d+$")) %>%
  map_dfr(reduce, fun) %>% 
  rename_all(~ str_c(., "_case")) %>%
  bind_cols(x, .)

-output

# A tibble: 4 x 10
#  Measure AD1_1 AD1_2 AD2_1 AD2_2 AD3_1 AD3_2 AD1_case AD2_case AD3_case
#  <chr>   <int> <int> <int> <int> <int> <int> <chr>    <chr>    <chr>   
#1 Height      6    10    10     3    12     8 Disagree Disagree Disagree
#2 Weight      8     9    13     6    14     7 Agree    Disagree Disagree
#3 Width      10     9    11     5    12     8 Agree    Disagree Disagree
#4 Length      8     9     8     7     8    13 Agree    Agree    Disagree
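To see what the split.default step does on its own, here is a minimal base R sketch. The data frame df and the use of `-` in place of fun are assumptions made for illustration. Because a data frame is a list of columns, split.default() groups the columns (not the rows) by the supplied prefix, and Reduce() then combines the two columns within each group.

```r
# Hypothetical toy data with two AD pairs
df <- data.frame(AD1_1 = c(6, 8),   AD1_2 = c(10, 9),
                 AD2_1 = c(10, 13), AD2_2 = c(3, 6))

# Strip the trailing "_<number>" to get one prefix per column
prefix <- sub("_\\d+$", "", names(df))

# One data frame per prefix, each holding that prefix's columns
pairs <- split.default(df, prefix)
names(pairs)
names(pairs$AD1)

# Combine the two columns of each pair; `-` stands in for fun here
diffs <- lapply(pairs, function(p) Reduce(`-`, p))
```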

Regarding the OP's approach, fun is not vectorized over the columns of a data frame. If we wrap it in Vectorize(), it is applied to each pair of columns in turn

x %>%
  mutate(Vectorize(fun)(across(ends_with("_1"), 
         .names = "{col}_minus"), across(ends_with("_2"))))%>%
   do.call(data.frame, .) %>% 
   rename_at(vars(contains('minus')),
         ~ str_extract(., 'AD\\d+_\\d+_minus'))
#  Measure AD1_1 AD1_2 AD2_1 AD2_2 AD3_1 AD3_2 AD1_1_minus AD2_1_minus AD3_1_minus
#1  Height     6    10    10     3    12     8    Disagree    Disagree    Disagree
#2  Weight     8     9    13     6    14     7       Agree    Disagree    Disagree
#3   Width    10     9    11     5    12     8       Agree    Disagree    Disagree
#4  Length     8     9     8     7     8    13       Agree       Agree    Disagree
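Vectorize() is built on mapply(), so the same column-by-column pairing can be sketched directly in base R with Map(). The data frame df and the simplified agree() function below are hypothetical stand-ins (agree() only covers the "within 1" rule of fun), so treat this as an illustration of the idea rather than a replacement for the code above.

```r
# Hypothetical toy data with two AD pairs
df <- data.frame(AD1_1 = c(6, 8),  AD1_2 = c(10, 9),
                 AD2_1 = c(10, 8), AD2_2 = c(3, 7))

# Simplified stand-in for fun: agree when the values differ by at most 1
agree <- function(a, b) ifelse(abs(a - b) <= 1, "Agree", "Disagree")

first  <- df[grepl("_1$", names(df))]
second <- df[grepl("_2$", names(df))]

# Map() walks both data frames column by column, pairing them by position
res <- Map(agree, first, second)
names(res) <- sub("_1$", "_case", names(res))

out <- cbind(df, as.data.frame(res))
```

Pairing by position relies on the "_1" and "_2" columns appearing in the same order, which holds for the data in the question.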
