Transform data frame string variable names


Problem description

I have a data frame that contains dates and IDs. I need to add multiple columns to this data frame, computed separately for each date. I use ddply to do this as follows:

ddply(df, "dt", transform, new_column1 = myfun(column_name_1))

However, I have a bunch of column names and would like to add multiple new columns. Is there a way to pass a string to transform instead of the literal name new_column1? For example, I tried:

ddply(df, "dt", transform, get("some_column_name")=myfun(column_name_1))

but this does not work. Additionally, if I pass column_name_1 to myfun as a string, can I just use get("column_name_1") within myfun to refer to the column?

UPDATE (not sure how to format this better):

input:
id    date    val
id1   d1      1
id2   d1      2
id3   d1      3
id4   d1      4
id1   d2      10
id2   d2      20
id3   d2      30
id4   d2      40

output (for 2 buckets, for example):

id    date    val     bucket
id1   d1      1         1
id2   d1      2         1
id3   d1      3         2
id4   d1      4         2
id1   d2      10        1
id2   d2      20        1
id3   d2      30        2
id4   d2      40        2

Solution

Doing it with transform is slick, but why not something more basic like

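# new_column_name_1, column_name_1, etc. are assumed to be variables holding column names as strings
tmpf <- function(x) {
   x[[new_column_name_1]] <- myfun(x[[column_name_1]])
   x[[new_column_name_2]] <- myfun(x[[column_name_2]])
   # ... (and so on for the remaining columns)
   x
}
ddply(df,"dt",tmpf)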
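
The approach above relies on `[[` accepting a column name stored in a character variable, both for reading a column and for assigning a new one. A minimal sketch in base R; the data frame and the names `src` and `dst` are made up for illustration and are not from the question:

```r
# Toy data frame, invented for this sketch.
df <- data.frame(val = c(1, 2, 3, 4))

# Column names held as strings (hypothetical names).
src <- "val"
dst <- "val_doubled"

# `[[` looks the column up by the string's value, so no get() is needed.
df[[dst]] <- df[[src]] * 2

print(names(df))
```

Because the names are ordinary character values, they can come from a vector, a config file, or `paste()`, which is what makes the loop-based variant possible.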

Or you can have a vector of column names to modify, or do it on the fly:

tmpf <- function(x,cols=c("column_name_1","column_name_2")) {
   newcols <- paste("new",cols,sep="_")
   for (i in seq_along(cols)) {
      x[[newcols[i]]] <- myfun(x[[cols[i]]])
   }
   x  # return the modified data frame; without this the function returns NULL
}

There's probably something even cleverer with assign in the appropriate environment.

If I had a reproducible example I could test this.
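
For what it's worth, here is a sketch of a reproducible example built from the input table in the question. It assumes the plyr package is installed. The bucketing function `myfun` is a guess at what was intended (equal-sized buckets by rank within each date), and the helper name `add_cols` and its `newcols` argument are hypothetical:

```r
library(plyr)  # provides ddply

# Input data copied from the question.
df <- data.frame(
  id   = rep(c("id1", "id2", "id3", "id4"), times = 2),
  date = rep(c("d1", "d2"), each = 4),
  val  = c(1, 2, 3, 4, 10, 20, 30, 40),
  stringsAsFactors = FALSE
)

# Hypothetical myfun: put each value into one of n equal-sized buckets by rank.
myfun <- function(v, n = 2) ceiling(rank(v) * n / length(v))

# Add one new column per source column; both sets of names are plain strings.
add_cols <- function(x, cols = "val", newcols = "bucket") {
  for (i in seq_along(cols)) {
    x[[newcols[i]]] <- myfun(x[[cols[i]]])
  }
  x  # return the modified piece so ddply can recombine the groups
}

# Apply add_cols to each date group separately.
out <- ddply(df, "date", add_cols)
print(out)
```

Under these assumptions the `bucket` column should match the desired output in the question. To process several columns, pass vectors of equal length, e.g. `ddply(df, "date", add_cols, cols = c("a", "b"), newcols = c("a_bucket", "b_bucket"))`, where `a` and `b` stand in for real column names.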
