Use dynamic variable names in `dplyr`

Question

I want to use dplyr::mutate() to create multiple new columns in a data frame. The column names and their contents should be dynamically generated.

Example data from iris:

library(dplyr)
iris <- as_tibble(iris)  # tbl_df() is deprecated in current dplyr

I've created a function to mutate my new columns from the Petal.Width variable:

multipetal <- function(df, n) {
    varname <- paste("petal", n , sep=".")
    df <- mutate(df, varname = Petal.Width * n)  ## problem arises here
    df
}

Now I create a loop to build my columns:

for(i in 2:5) {
    iris <- multipetal(df=iris, n=i)
}

However, since mutate thinks varname is a literal variable name, the loop only creates one new variable (called varname) instead of four (called petal.2 - petal.5).
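
The behaviour is easy to reproduce in isolation. A minimal sketch, assuming a current dplyr; the small data frame `toy` is made up for this illustration and is not part of the question:

```r
library(dplyr)

# Hypothetical toy data, used only to illustrate the problem
toy <- data.frame(Petal.Width = c(0.2, 0.4))

varname <- "petal.2"

# mutate() takes the argument name literally, so the string stored
# in `varname` is never consulted
res <- mutate(toy, varname = Petal.Width * 2)
names(res)
# "Petal.Width" "varname"
```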

How can I get mutate() to use my dynamic name as variable name?

Recommended answer

Since you are dynamically building a variable name as a character value, it makes more sense to do assignment using standard data.frame indexing which allows for character values for column names. For example:

multipetal <- function(df, n) {
    varname <- paste("petal", n , sep=".")
    df[[varname]] <- with(df, Petal.Width * n)
    df
}
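
As a quick check, the loop from the question can be run against this version. This is a sketch that needs only base R; the function is repeated (with `df$Petal.Width` in place of `with()`) so the snippet runs on its own, and `flowers` is a name chosen here to avoid overwriting the built-in `iris`:

```r
# Base-R version of the function, repeated so this snippet is self-contained
multipetal <- function(df, n) {
    varname <- paste("petal", n, sep = ".")
    df[[varname]] <- df$Petal.Width * n
    df
}

flowers <- iris  # work on a copy of the built-in data set
for (i in 2:5) {
    flowers <- multipetal(flowers, n = i)
}

# The four new columns petal.2 to petal.5 are appended after the originals
names(flowers)
```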

The mutate function makes it very easy to name new columns via named parameters. But that assumes you know the name when you type the command. If you want to dynamically specify the column name, then you need to also build the named argument.

Since version 0.7, dplyr supports this with :=, which lets you assign parameter names dynamically; the !! operator unquotes the string so that its value is used as the name. You can write your function as:

# --- dplyr version 0.7+---
multipetal <- function(df, n) {
    varname <- paste("petal", n , sep=".")
    mutate(df, !!varname := Petal.Width * n)
}

For more information, see the documentation available from vignette("programming", "dplyr").
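
Newer releases also accept a glue-style string on the left-hand side of :=, which avoids building the name separately. The following is a sketch that assumes dplyr 1.0 or later together with rlang 0.4.3 or later (neither version is mentioned in the original answer); `multipetal_glue` and `flowers` are names invented for this example:

```r
library(dplyr)

multipetal_glue <- function(df, n) {
    # "{n}" is interpolated into the column name, e.g. petal.2 for n = 2
    mutate(df, "petal.{n}" := Petal.Width * n)
}

flowers <- as_tibble(iris)
for (i in 2:5) {
    flowers <- multipetal_glue(flowers, n = i)
}

# Lists the original columns followed by petal.2 to petal.5
names(flowers)
```

Whether to use `!!varname :=` or the glue form is mostly a matter of taste; the glue form keeps the name template next to the expression that fills the column.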

Slightly earlier versions of dplyr (>= 0.3, < 0.7) encouraged the use of "standard evaluation" alternatives to many of the functions. See the Non-standard evaluation vignette, vignette("nse"), for more information.

So here, the answer is to use mutate_() rather than mutate() and do:

# --- dplyr version 0.3-0.5---
multipetal <- function(df, n) {
    varname <- paste("petal", n , sep=".")
    varval <- lazyeval::interp(~Petal.Width * n, n=n)
    mutate_(df, .dots= setNames(list(varval), varname))
}


dplyr < 0.3

Note that this is also possible in older versions of dplyr that existed when the question was originally posed. It requires careful use of quote and setNames:

# --- dplyr versions < 0.3 ---
multipetal <- function(df, n) {
    varname <- paste("petal", n , sep=".")
    pp <- c(quote(df), setNames(list(quote(Petal.Width * n)), varname))
    do.call("mutate", pp)
}
