R - dplyr - mutate - use dynamic variable names


Problem description


I want to use dplyr's mutate() to create multiple new columns in a data frame. The column names and their contents should be dynamically generated.

Example data from iris:

require(dplyr)
data(iris)
iris <- tbl_df(iris)

I've created a function to mutate my new columns from the Petal.Width variable:

multipetal <- function(df, n) {
    varname <- paste("petal", n , sep=".")
    df <- mutate(df, varname = Petal.Width * n)  ## problem arises here
    df
}

Now I create a loop to build my columns:

for(i in 2:5) {
    iris <- multipetal(df=iris, n=i)
}

However, since mutate thinks varname is a literal variable name, the loop only creates one new variable (called varname) instead of four (called petal.2 - petal.5).

How can I get mutate() to use my dynamic name as variable name?

Solution

Since you are dynamically building a variable name as a character value, it makes more sense to do the assignment with standard data.frame indexing, which accepts character values for column names. For example:

multipetal <- function(df, n) {
    varname <- paste("petal", n , sep=".")
    df[[varname]] <- with(df, Petal.Width * n)
    df
}
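As a rough usage sketch of the indexing approach above, the loop from the question can be run against a copy of iris. The name `petals` is only an illustrative choice, and `df$Petal.Width` is used here in place of `with()` purely for brevity; no packages beyond base R are assumed.

```r
# Base R only: assign the new column by its character name.
multipetal <- function(df, n) {
  varname <- paste("petal", n, sep = ".")
  df[[varname]] <- df$Petal.Width * n
  df
}

# Work on a copy so the built-in iris data set is left untouched.
petals <- iris
for (i in 2:5) {
  petals <- multipetal(df = petals, n = i)
}

tail(names(petals), 4)
# "petal.2" "petal.3" "petal.4" "petal.5"
```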

The mutate function makes it very easy to name new columns via named parameters. But that assumes you know the name when you type the command. If you want to dynamically specify the column name, then you need to also build the named argument.

Since version 0.7, dplyr supports this by using := to assign parameter names dynamically, with !! unquoting the character value that holds the name. You can write your function as

# --- dplyr version 0.7+---
multipetal <- function(df, n) {
    varname <- paste("petal", n , sep=".")
    mutate(df, !!varname := Petal.Width * n)
}
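A minimal sketch of how this function behaves in the loop from the question is shown here, together with a variant that writes the name as a glue-style string on the left of :=. The variant `multipetal_glue` is a hypothetical name introduced only for illustration, and it assumes a reasonably recent installation (rlang 0.4.3 or later, dplyr 1.0.0 or later).

```r
library(dplyr)

# The function from the answer, repeated so the sketch is self-contained.
multipetal <- function(df, n) {
  varname <- paste("petal", n, sep = ".")
  mutate(df, !!varname := Petal.Width * n)
}

# Hypothetical variant: the column name is interpolated from a glue string.
# Assumes rlang >= 0.4.3 / dplyr >= 1.0.0.
multipetal_glue <- function(df, n) {
  mutate(df, "petal.{n}" := Petal.Width * n)
}

a <- iris
b <- iris
for (i in 2:5) {
  a <- multipetal(a, i)
  b <- multipetal_glue(b, i)
}

# Both versions should produce the same four new column names.
identical(names(a), names(b))
```

Which form to prefer is largely a matter of taste: the !! form keeps the name in an ordinary variable, while the glue form keeps the naming pattern visible at the point of assignment.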

For more information, see the documentation available from vignette("programming", "dplyr").

Slightly earlier versions of dplyr (>= 0.3, < 0.7) encouraged the use of "standard evaluation" alternatives to many of the functions; these underscore-suffixed functions were deprecated as of dplyr 0.7. See the Non-standard evaluation vignette for more information (vignette("nse")).

So here, the answer is to use mutate_() rather than mutate() and do

# --- dplyr version 0.3-0.5---
multipetal <- function(df, n) {
    varname <- paste("petal", n , sep=".")
    varval <- lazyeval::interp(~Petal.Width * n, n=n)
    mutate_(df, .dots= setNames(list(varval), varname))
}

Older versions of dplyr

Note that this is also possible in the older versions of dplyr that existed when the question was originally posed. It requires careful use of quote and setNames:

# --- dplyr versions < 0.3 ---
multipetal <- function(df, n) {
    varname <- paste("petal", n , sep=".")
    pp <- c(quote(df), setNames(list(quote(Petal.Width * n)), varname))
    do.call("mutate", pp)
}
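The idea behind the quote/setNames pattern, building a named list of expressions and handing it to mutate, has a rough counterpart in current dplyr. The sketch below is one possible way to do it, not part of the original answer: it assumes dplyr 0.7 or later with rlang installed, and the names `ns`, `new_cols` and `result` are illustrative. It creates all four columns in a single mutate() call instead of a loop.

```r
library(dplyr)

ns <- 2:5

# Build a named list of unevaluated expressions; !!n inlines each multiplier.
new_cols <- setNames(
  lapply(ns, function(n) rlang::expr(Petal.Width * !!n)),
  paste("petal", ns, sep = ".")
)

# !!! splices the named list into mutate() as if each entry were typed by hand.
result <- mutate(iris, !!!new_cols)

tail(names(result), 4)
```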
