Using functions of multiple columns in a dplyr mutate_at call

Question

I'd like to use dplyr's mutate_at function to apply a function to several columns in a dataframe, where the function takes as input both the column it is applied to and another column in the same dataframe.

As a concrete example, I'd like to mutate the following dataframe

# Example input dataframe
df <- data.frame(
    x = c(TRUE, TRUE, FALSE),
    y = c("Hello", "Hola", "Ciao"),
    z = c("World", "ao", "HaOlam")
)

with a mutate_at call that looks similar to this

df %>%
mutate_at(.vars = vars(y, z),
          .funs = ifelse(x, ., NA))

to return a dataframe that looks something like this

# Desired output dataframe
df2 <- data.frame(x = c(TRUE, TRUE, FALSE),
                  y_1 = c("Hello", "Hola", NA),
                  z_1 = c("World", "ao", NA))

The desired mutate_at call would be equivalent to the following call to mutate:

df %>%
   mutate(y_1 = ifelse(x, y, NA),
          z_1 = ifelse(x, z, NA))

I know that this can be done in base R in several ways, but I would specifically like to accomplish it with dplyr's mutate_at function, for the sake of readability, interfacing with databases, etc.

Below are some similar questions asked on Stack Overflow which do not address the question I posed here:

Adding multiple columns in a dplyr mutate call

dplyr::mutate to add multiple values

Using a column inside the sum() function with dplyr's mutate() function

Recommended answer

This was answered by @eipi10 in a comment on the question, but I'm writing it here for posterity.

The solution here is to use:

df %>%
   mutate_at(.vars = vars(y, z),
             .funs = list(~ ifelse(x, ., NA)))
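The call above overwrites y and z in place. To get the suffixed columns from the question (y_1, z_1), one option is to name the list element, since mutate_at() appends the function's name to each column name. The following is a minimal sketch under that assumption; the name `1` is only chosen to match the desired output, and stringsAsFactors = FALSE is added so the example behaves the same on R versions before 4.0.

```r
library(dplyr)

# Same example data as in the question; strings kept as character
df <- data.frame(
    x = c(TRUE, TRUE, FALSE),
    y = c("Hello", "Hola", "Ciao"),
    z = c("World", "ao", "HaOlam"),
    stringsAsFactors = FALSE
)

# Naming the list element `1` makes mutate_at() create y_1 and z_1
# instead of overwriting y and z
result <- df %>%
    mutate_at(.vars = vars(y, z),
              .funs = list(`1` = ~ ifelse(x, ., NA)))
```

The original y and z columns are kept alongside the new ones; a select(x, y_1, z_1) afterwards would reduce the result to the shape shown in the question.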

You can also use the newer across() function (available since dplyr 1.0.0) with mutate(), like so:

df %>%
   mutate(across(c(y, z), ~ ifelse(x, ., NA)))
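Like the mutate_at() version, this across() call replaces y and z rather than adding new columns. As a rough sketch of how the exact output from the question could be reproduced, the .names argument of across() accepts a glue-style template in which {.col} stands for the input column name. The select() step at the end is an addition made here to drop the original columns.

```r
library(dplyr)

# Same example data as in the question; strings kept as character
df <- data.frame(
    x = c(TRUE, TRUE, FALSE),
    y = c("Hello", "Hola", "Ciao"),
    z = c("World", "ao", "HaOlam"),
    stringsAsFactors = FALSE
)

df2 <- df %>%
    # Write results to y_1 and z_1 instead of overwriting y and z
    mutate(across(c(y, z), ~ ifelse(x, .x, NA), .names = "{.col}_1")) %>%
    # Keep only the columns shown in the desired output
    select(x, y_1, z_1)
```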

The formula operator (as in ~ ifelse(...)) indicates that ifelse(x, ., NA) is an anonymous function defined within the call to mutate_at(). Inside it, . stands for the column currently being processed, while x is looked up as a column of the dataframe.

This works similarly to defining the function outside of the call to mutate_at(), like so:

temp_fn <- function(input) ifelse(test = df[["x"]],
                                  yes = input,
                                  no = NA)

df %>%
   mutate_at(.vars = vars(y, z),
             .funs = temp_fn)
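One caveat with temp_fn is that it reads x from the global df rather than from the data being piped in, so it may give wrong results if the dataframe is filtered, grouped, or renamed first. A possible alternative, sketched here with a hypothetical helper name (mask_unless is not part of dplyr), is to pass the condition as an explicit argument so the column is resolved from whatever data reaches mutate():

```r
library(dplyr)

# Same example data as in the question; strings kept as character
df <- data.frame(
    x = c(TRUE, TRUE, FALSE),
    y = c("Hello", "Hola", "Ciao"),
    z = c("World", "ao", "HaOlam"),
    stringsAsFactors = FALSE
)

# Hypothetical helper: keep `input` where `keep` is TRUE, otherwise NA
mask_unless <- function(input, keep) ifelse(test = keep, yes = input, no = NA)

# x is taken from the filtered data, not from the global df
filtered <- df %>%
    filter(y != "Hola") %>%
    mutate(across(c(y, z), ~ mask_unless(.x, x)))
```

Because the helper no longer refers to df directly, the same call can be reused on other dataframes that have a logical column named x.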

Note on syntax changes in dplyr: Prior to dplyr version 0.8.0, you would simply write .funs = funs(ifelse(x, . , NA)), but funs() was deprecated in dplyr 0.8.0 in favor of a list of functions or formulas, as used above.
