Using functions of multiple columns in a dplyr mutate_at call
Question
I'd like to use dplyr's mutate_at function to apply a function to several columns in a dataframe, where the function takes as input both the column to which it is directly applied and another column in the dataframe.
As a concrete example, I'd like to mutate the following dataframe
# Example input dataframe
# stringsAsFactors = FALSE keeps y and z as character columns on R < 4.0
df <- data.frame(
  x = c(TRUE, TRUE, FALSE),
  y = c("Hello", "Hola", "Ciao"),
  z = c("World", "ao", "HaOlam"),
  stringsAsFactors = FALSE
)
with a mutate_at call similar to this
# Intended call (pseudocode; this does not work as written)
df %>%
  mutate_at(.vars = vars(y, z),
            .funs = ifelse(x, ., NA))
to return a dataframe that looks something like this
# Desired output dataframe
df2 <- data.frame(x = c(TRUE, TRUE, FALSE),
                  y_1 = c("Hello", "Hola", NA),
                  z_1 = c("World", "ao", NA))
The desired mutate_at call would be similar to the following call to mutate:
df %>%
  mutate(y_1 = ifelse(x, y, NA),
         z_1 = ifelse(x, z, NA))
I know that this can be done in base R in several ways, but I would specifically like to accomplish this goal using dplyr's mutate_at function for the sake of readability, interfacing with databases, etc.
Below are some similar questions asked on Stack Overflow which do not address the question I posed here:
Using columns inside the sum() function with dplyr's mutate() function
Recommended answer
@eipi10 answered this in a comment on the question, but I'm writing it up here for posterity.
The solution here is to use:
df %>%
  mutate_at(.vars = vars(y, z),
            .funs = list(~ ifelse(x, ., NA)))
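Note that this call overwrites y and z in place rather than creating the y_1 and z_1 columns from the question. One way to get suffixed columns is to name the element of the list passed to .funs, since dplyr appends that name to each selected column. The following is a minimal sketch under that assumption (dplyr 0.8.0 or later); the list name `1` and the object name out are chosen here only for illustration.

```r
library(dplyr)

# Same example data as in the question
df <- data.frame(
  x = c(TRUE, TRUE, FALSE),
  y = c("Hello", "Hola", "Ciao"),
  z = c("World", "ao", "HaOlam"),
  stringsAsFactors = FALSE
)

# Naming the lambda `1` makes dplyr add y_1 and z_1
# alongside the original y and z columns
out <- df %>%
  mutate_at(.vars = vars(y, z),
            .funs = list(`1` = ~ ifelse(x, ., NA)))

names(out)
```

The original y and z columns are kept; drop them with select(-y, -z) if only the new columns are wanted.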
You can also use the across() function (introduced in dplyr 1.0.0) with mutate(), like so:
df %>%
  mutate(across(c(y, z), ~ ifelse(x, ., NA)))
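As with mutate_at(), this across() call replaces y and z in place. If the y_1 and z_1 names from the question are wanted, the .names argument of across() can build them from a glue-style template. This is a small sketch assuming dplyr 1.0.0 or later; the template "{.col}_1" and the object name out are illustrative choices, not part of the original answer.

```r
library(dplyr)

# Same example data as in the question
df <- data.frame(
  x = c(TRUE, TRUE, FALSE),
  y = c("Hello", "Hola", "Ciao"),
  z = c("World", "ao", "HaOlam"),
  stringsAsFactors = FALSE
)

# "{.col}_1" expands to y_1 and z_1, so the originals are left untouched
out <- df %>%
  mutate(across(c(y, z), ~ ifelse(x, ., NA), .names = "{.col}_1"))

names(out)
```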
The formula operator (as in ~ ifelse(...)) here indicates that ifelse(x, ., NA) is an anonymous function defined within the call to mutate_at(), with . standing for the column currently being processed.
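To make the "formula as anonymous function" idea concrete, the sketch below converts a formula into an ordinary function outside of dplyr and compares it with an explicitly written one. It uses rlang::as_function() only to make the conversion visible; dplyr handles this step internally, and the names flag, input, lambda_fn and explicit_fn are hypothetical.

```r
library(rlang)

flag  <- c(TRUE, TRUE, FALSE)
input <- c("Hello", "Hola", "Ciao")

# The formula becomes a function whose first argument is available as `.`
lambda_fn <- as_function(~ ifelse(flag, ., NA))

# The same logic written as an ordinary function
explicit_fn <- function(col) ifelse(flag, col, NA)

# Both return the input with the third element replaced by NA
same_result <- identical(lambda_fn(input), explicit_fn(input))
same_result
```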
This works similarly to defining the function outside of the call to mutate_at(), except that a function defined outside cannot see the column x by name and has to look it up explicitly (here as df[["x"]]), like so:
temp_fn <- function(input) ifelse(test = df[["x"]],
                                  yes = input,
                                  no = NA)
df %>%
  mutate_at(.vars = vars(y, z),
            .funs = temp_fn)
Note on syntax changes in dplyr: Prior to dplyr version 0.8.0, you would simply write .funs = funs(ifelse(x, ., NA)), but the funs() function was deprecated in that release and is slated for removal from dplyr.