Using functions of multiple columns in a dplyr mutate_at call
Question
I'd like to use dplyr's mutate_at function to apply a function to several columns in a data frame, where the function takes as input both the column it is directly applied to and another column in the data frame.
As a concrete example, I'd like to mutate the following data frame:
# Example input data frame
library(dplyr)

df <- data.frame(
  x = c(TRUE, TRUE, FALSE),
  y = c("Hello", "Hola", "Ciao"),
  z = c("World", "ao", "HaOlam"),
  stringsAsFactors = FALSE  # keep y and z as character vectors on R < 4.0
)
with a mutate_at call that looks similar to this:
# Attempted call (not valid as written)
df %>%
  mutate_at(.vars = vars(y, z),
            .funs = ifelse(x, ., NA))
to return a data frame that looks something like this:
# Desired output dataframe
df2 <- data.frame(x = c(TRUE, TRUE, FALSE),
y_1 = c("Hello", "Hola", NA),
z_1 = c("World", "ao", NA))
The desired mutate_at call would be similar to the following call to mutate:
df %>%
  mutate(y_1 = ifelse(x, y, NA),
         z_1 = ifelse(x, z, NA))
I know that this can be done in base R in several ways, but I would specifically like to accomplish it with dplyr's mutate_at function for the sake of readability, interfacing with databases, etc.
Below is a similar question asked on Stack Overflow that does not address the question I posed here:
Using a column inside the sum() function with dplyr's mutate() function
Recommended answer
This was answered by @eipi10 in a comment on the question, but I'm writing it here for posterity.
The solution here is to use:
df %>%
  mutate_at(.vars = vars(y, z),
            .funs = list(~ ifelse(x, ., NA)))
You can also use the newer across() function with mutate(), like so:
df %>%
  mutate(across(c(y, z), ~ ifelse(x, ., NA)))
The formula operator (as in ~ ifelse(...)) indicates that ifelse(x, ., NA) is an anonymous function defined within the call to mutate_at(), with . standing for the column currently being processed.
This works similarly to defining the function outside of the call to mutate_at(), like so:
temp_fn <- function(input) ifelse(test = df[["x"]],
                                  yes = input,
                                  no = NA)
df %>%
  mutate_at(.vars = vars(y, z),
            .funs = temp_fn)
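One caveat with temp_fn is that it reads x from the global df rather than from whatever data is piped in. As a rough sketch (assuming a dplyr version that provides across(), and with col as a purely illustrative argument name), the formula can instead be spelled out as an ordinary anonymous function written inside the call, where it can still see the column x:

```r
library(dplyr)

df <- data.frame(
  x = c(TRUE, TRUE, FALSE),
  y = c("Hello", "Hola", "Ciao"),
  z = c("World", "ao", "HaOlam"),
  stringsAsFactors = FALSE
)

# Formula shorthand, as in the answer
res_formula <- df %>%
  mutate(across(c(y, z), ~ ifelse(x, ., NA)))

# The same idea as an explicit anonymous function defined inside the call;
# `col` is a hypothetical argument name chosen for this example
res_function <- df %>%
  mutate(across(c(y, z), function(col) ifelse(x, col, NA)))

# Compare the two results
identical(res_formula, res_function)
```

Because the function is created inside mutate(), it resolves x against the data being mutated, so it should keep working if the pipeline is pointed at a different data frame with the same column names.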
Note on syntax changes in dplyr: Prior to dplyr version 0.8.0, you would simply write .funs = funs(ifelse(x, ., NA)), but funs() has since been deprecated and is slated for removal from dplyr.
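The calls in this answer overwrite y and z in place, whereas the desired output in the question uses the names y_1 and z_1. Here is a minimal sketch of one way to get those names, assuming a recent dplyr release in which across() accepts a .names glue specification with the {.col} placeholder. The trailing select() is an addition for this example, not part of the original answer:

```r
library(dplyr)

df <- data.frame(
  x = c(TRUE, TRUE, FALSE),
  y = c("Hello", "Hola", "Ciao"),
  z = c("World", "ao", "HaOlam"),
  stringsAsFactors = FALSE
)

res <- df %>%
  # Write the results to new columns named <column>_1 instead of overwriting
  mutate(across(c(y, z), ~ ifelse(x, ., NA), .names = "{.col}_1")) %>%
  # Drop the original y and z so the columns match the desired output
  select(x, y_1, z_1)

res
```

If the original columns should be kept alongside the new ones, the select() step can simply be omitted.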