Using functions of multiple columns in a dplyr mutate_at call


Question

I'd like to use dplyr's mutate_at function to apply a function to several columns in a dataframe, where the function inputs the column to which it is directly applied as well as another column in the dataframe.

As a concrete example, I'd like to mutate the following dataframe

library(dplyr)

# Example input dataframe
df <- data.frame(
    x = c(TRUE, TRUE, FALSE),
    y = c("Hello", "Hola", "Ciao"),
    z = c("World", "ao", "HaOlam"),
    stringsAsFactors = FALSE  # keep y and z as character vectors on R < 4.0
)

with a mutate_at call similar to this

df %>%
mutate_at(.vars = vars(y, z),
          .funs = ifelse(x, ., NA))

to return a dataframe that looks something like this

# Desired output dataframe
df2 <- data.frame(x = c(TRUE, TRUE, FALSE),
                  y_1 = c("Hello", "Hola", NA),
                  z_1 = c("World", "ao", NA))

The desired mutate_at call would be similar to the following call to mutate:

df %>%
   mutate(y_1 = ifelse(x, y, NA),
          z_1 = ifelse(x, z, NA))

I know that this can be done in base R in several ways, but I would specifically like to accomplish this goal using dplyr's mutate_at function for the sake of readability, interfacing with databases, etc.

Below are some similar questions asked on stackoverflow which do not address the question I posed here:

Adding multiple columns in a dplyr mutate call

dplyr::mutate to add multiple values

Use of column inside sum() function using dplyr's mutate() function

Recommended answer

This was answered by @eipi10 in a comment on the question, but I'm writing it here for posterity.

The solution here is to use:

df %>%
   mutate_at(.vars = vars(y, z),
             .funs = list(~ ifelse(x, ., NA)))
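
Note that an unnamed list like the one above overwrites y and z in place. To get the suffixed columns from the question (y_1, z_1) while keeping the originals, one option is to name the list element, since mutate_at() appends that name to each column name. The name `1` is just an illustrative choice to match the desired output. A minimal sketch:

```r
library(dplyr)

df <- data.frame(
    x = c(TRUE, TRUE, FALSE),
    y = c("Hello", "Hola", "Ciao"),
    z = c("World", "ao", "HaOlam"),
    stringsAsFactors = FALSE
)

# Naming the lambda "1" makes mutate_at() write to y_1 and z_1
# instead of overwriting y and z.
out <- df %>%
    mutate_at(.vars = vars(y, z),
              .funs = list(`1` = ~ ifelse(x, ., NA)))

print(out)
```

The original y and z columns are kept; drop them with select() afterwards if only the masked versions are wanted.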

You can also use the newer across() function (available since dplyr 1.0.0) with mutate(), like so:

df %>%
   mutate(across(c(y, z), ~ ifelse(x, ., NA)))
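
As with mutate_at(), the across() call above replaces y and z in place. If the suffixed names from the question are wanted, across() takes a .names argument with a glue-style template. The sketch below assumes dplyr 1.0.0 or later, and the "_1" suffix is chosen only to match the desired output:

```r
library(dplyr)

df <- data.frame(
    x = c(TRUE, TRUE, FALSE),
    y = c("Hello", "Hola", "Ciao"),
    z = c("World", "ao", "HaOlam"),
    stringsAsFactors = FALSE
)

# {.col} is replaced by the name of each selected column,
# so the results land in y_1 and z_1.
out <- df %>%
    mutate(across(c(y, z), ~ ifelse(x, .x, NA), .names = "{.col}_1"))

print(out)
```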

The use of the formula operator (as in ~ ifelse(...)) here indicates that ifelse(x, ., NA) is an anonymous function that is being defined within the call to mutate_at().
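
To see what the formula shorthand stands for, the sketch below uses rlang::as_function() to turn a similar formula into an ordinary function, outside of any data frame. This is only an illustration of the idea; it is an assumption that it mirrors what mutate_at() does internally. The name keep_if is hypothetical, and .x and .y are rlang's names for the first and second arguments of such a lambda.

```r
library(rlang)

# .x stands for the column being transformed,
# .y plays the role of the logical column x.
keep_if <- as_function(~ ifelse(.y, .x, NA))

res <- keep_if(c("Hello", "Hola", "Ciao"), c(TRUE, TRUE, FALSE))
print(res)
```

Inside mutate_at(), the second argument is not needed because x is looked up among the columns of the data frame.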

This works similarly to defining the function outside of the call to mutate_at(), like so:

temp_fn <- function(input) ifelse(test = df[["x"]],
                                  yes = input,
                                  no = NA)

df %>%
   mutate_at(.vars = vars(y, z),
             .funs = temp_fn)
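
One caveat with the version above: temp_fn reads df[["x"]] from the global data frame rather than from the data flowing through the pipe, so it can fail or give misaligned results if rows are filtered or reordered first. A sketch of an alternative, using a hypothetical helper mask_unless that takes the condition as an explicit argument:

```r
library(dplyr)

df <- data.frame(
    x = c(TRUE, TRUE, FALSE),
    y = c("Hello", "Hola", "Ciao"),
    z = c("World", "ao", "HaOlam"),
    stringsAsFactors = FALSE
)

# Hypothetical helper: keep `input` where `keep` is TRUE, otherwise NA.
mask_unless <- function(input, keep) ifelse(test = keep, yes = input, no = NA)

# x is resolved against the current (already filtered) data,
# so the condition always lines up with the rows being mutated.
out <- df %>%
    filter(y != "Hola") %>%
    mutate_at(.vars = vars(y, z),
              .funs = list(~ mask_unless(., x)))

print(out)
```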

Note on syntax changes in dplyr: Prior to dplyr version 0.8.0, you would simply write .funs = funs(ifelse(x, . , NA)), but funs() was deprecated in dplyr 0.8.0 in favor of a list of formulas or functions, so it should be avoided in new code.
