Add column to df that's the output of a function that uses different column values combined to be a vector input
Problem description
This is a very simplified version of my actual problem.
My real df has many columns, and I need to perform this action using a select from a character vector of column names.
library(tidyverse)
df <- data.frame(a1 = c(1:5),
b1 = c(3,1,3,4,6),
c1 = c(10:14),
a2 = c(9:13),
b2 = c(3:7),
c2 = c(15:19))
df
a1 b1 c1 a2 b2 c2
1 1 3 10 9 3 15
2 2 1 11 10 4 16
3 3 3 12 11 5 17
4 4 4 13 12 6 18
5 5 6 14 13 7 19
Let's say I wanted to get the cor for each row across selected columns using mutate. I tried:
df %>%
  mutate(my_cor = cor(x = c(a1, b1, c1), y = c(a2, b2, c2)))
But this doesn't work, because each column name refers to the full column of data rather than to the value in the current row.
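To make the failure concrete, here is a minimal base R sketch of what the expression inside mutate sees. The names x_all, y_all and single_value are illustrative only and do not appear in the original post.

```r
# Same data as in the question
df <- data.frame(a1 = c(1:5),
                 b1 = c(3, 1, 3, 4, 6),
                 c1 = c(10:14),
                 a2 = c(9:13),
                 b2 = c(3:7),
                 c2 = c(15:19))

# Inside mutate(), a bare column name is the whole column,
# so c(a1, b1, c1) concatenates three full columns (15 values)
x_all <- with(df, c(a1, b1, c1))
y_all <- with(df, c(a2, b2, c2))
length(x_all)

# cor() then returns a single number for the whole data frame,
# which mutate() recycles into every row of the new column
single_value <- cor(x_all, y_all)
length(single_value)
```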
The first row of the my_cor column of the output df from above should be the calculation:
cor(x = c(1,3,10), y = c(9,3,15))
The next row should be:
cor(x = c(2,1,11), y = c(10,4,16))
and so on. The actual function I'm using is more complex, but it takes two vector inputs just as cor does, so I figured this would be a good proxy.
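The row-by-row calculation described above can be spelled out in base R as a reference for checking any tidyverse solution. This is only a sketch: the names x_cols, y_cols and expected are made up for illustration.

```r
df <- data.frame(a1 = c(1:5),
                 b1 = c(3, 1, 3, 4, 6),
                 c1 = c(10:14),
                 a2 = c(9:13),
                 b2 = c(3:7),
                 c2 = c(15:19))

x_cols <- c("a1", "b1", "c1")
y_cols <- c("a2", "b2", "c2")

# For each row index, pull the selected cells out as plain vectors
# and correlate them; sapply() collects one number per row
expected <- sapply(seq_len(nrow(df)), function(i) {
  cor(x = unlist(df[i, x_cols]), y = unlist(df[i, y_cols]))
})

# One correlation per row of df
length(expected)
```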
I have a feeling I should be using purrr for this (similar to this post), but I haven't gotten it to work.
Bonus: The function in my actual problem uses many different columns, so I'd like to be able to select them from a character vector like my_list_of_cols <- c("a1", "b1", "c1") (my true list is much longer).
I suspect I'd be using pmap_dbl like the post I linked to, but I can't get it to work. I tried something like:
mutate(my_col = pmap_dbl(select(., var = my_list_of_cols), somefunction))
(Note that somefunction in the snippet above takes two vector inputs, but one of them is static and predefined. You can assume the vector c(a2, b2, c2) is the static, predefined one, like:
somefunction <- function(a1, b1, c1) {
  # static, predefined second vector
  a2 <- 1
  b2 <- 4
  c2 <- 5
  my_vec <- c(a2, b2, c2)
  cor(x = c(a1, b1, c1), y = my_vec)
}
)
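For the static-vector case, one way the pieces might fit together is sketched below. It assumes dplyr 1.0.0 or later (for all_of) and relies on pmap_dbl matching the selected columns to the function's arguments by name, so the column names must match a1, b1 and c1. The name result is illustrative only.

```r
library(dplyr)
library(purrr)

df <- data.frame(a1 = c(1:5),
                 b1 = c(3, 1, 3, 4, 6),
                 c1 = c(10:14),
                 a2 = c(9:13),
                 b2 = c(3:7),
                 c2 = c(15:19))

# Takes one row's values as scalars; the second vector is fixed
somefunction <- function(a1, b1, c1) {
  my_vec <- c(1, 4, 5)
  cor(x = c(a1, b1, c1), y = my_vec)
}

my_list_of_cols <- c("a1", "b1", "c1")

# pmap_dbl() walks the selected columns in parallel, one row at a time,
# and passes each row's values to somefunction() by column name
result <- df %>%
  mutate(my_col = pmap_dbl(select(df, all_of(my_list_of_cols)), somefunction))
```

With a much longer column list, writing the function to accept ... and collecting the values with c(...) would avoid naming every argument.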
I'm still learning how to use purrr, so any help would be greatly appreciated!
Recommended answer
Here is one option: pass both vectors of column names to select, then inside pmap_dbl collect each row's values into a named vector and split it by name.
library(tidyverse)
my_list_of_cols <- c("a1", "b1", "c1")
another_list_cols <- c("a2", "b2", "c2")
df %>%
  mutate(my_cor = pmap_dbl(
    # all_of() makes it explicit that these are external character vectors
    select(., all_of(c(my_list_of_cols, another_list_cols))),
    # c(...) gathers one row's values into a named vector
    ~ c(...) %>%
      {cor(.[my_list_of_cols], .[another_list_cols])}
  ))
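As an alternative to pmap_dbl (not part of the answer above), the same row-wise result can probably be obtained with rowwise() and c_across(). This sketch assumes dplyr 1.0.0 or later, where both functions are available. The name result_rowwise is illustrative.

```r
library(dplyr)

df <- data.frame(a1 = c(1:5),
                 b1 = c(3, 1, 3, 4, 6),
                 c1 = c(10:14),
                 a2 = c(9:13),
                 b2 = c(3:7),
                 c2 = c(15:19))

my_list_of_cols <- c("a1", "b1", "c1")
another_list_cols <- c("a2", "b2", "c2")

result_rowwise <- df %>%
  rowwise() %>%   # make each row its own group
  mutate(my_cor = cor(
    c_across(all_of(my_list_of_cols)),    # this row's x values as a vector
    c_across(all_of(another_list_cols))   # this row's y values as a vector
  )) %>%
  ungroup()       # drop the row-wise grouping afterwards
```

This reads closer to the original mutate attempt, though on large data frames it may be slower than the pmap_dbl version.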