Transform data frame string variable names
Question
I have a data frame that contains dates and IDs. I need to add multiple columns to this data frame based on each date. I use ddply to do this as follows:
ddply(df, "dt", transform, new_column1 = myfun(column_name_1))
However, I have a bunch of column names and would like to add multiple new columns. Is there a way to pass a string to transform instead of the literal name new_column1? For example, I tried:
ddply(df, "dt", transform, get("some_column_name")=myfun(column_name_1))
but this does not work. Additionally, if I pass column_name_1 to myfun as a string, can I just use get("column_name_1") within myfun to refer to the column?
UPDATE (not sure how to format this better):
Input:
id date val
id1 d1 1
id2 d1 2
id3 d1 3
id4 d1 4
id1 d2 10
id2 d2 20
id3 d2 30
id4 d2 40
Output (for 2 buckets, for example):
id date val bucket
id1 d1 1 1
id2 d1 2 1
id3 d1 3 2
id4 d1 4 2
id1 d2 10 1
id2 d2 20 1
id3 d2 30 2
id4 d2 40 2
Recommended answer
Doing it with transform is slick, but why not something more basic like
tmpf <- function(x) {
  x[[new_column_name_1]] <- myfun(x[[column_name_1]])
  x[[new_column_name_2]] <- myfun(x[[column_name_2]])
  ...
  x
}
ddply(df, "dt", tmpf)
Or you can keep a vector of the column names to modify and build the new names on the fly:
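tmpf <- function(x, cols = c("column_name_1", "column_name_2")) {
  # build the new column names, e.g. "new_column_name_1"
  newcols <- paste("new", cols, sep = "_")
  for (i in seq_along(cols)) {
    x[[newcols[i]]] <- myfun(x[[cols[i]]])
  }
  x  # return the modified piece; the for loop by itself evaluates to NULL
}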
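For what it's worth, here is a minimal sketch of the same idea run on the sample data from the question. The bucketing rule in myfun is only my guess at what was intended (equal-sized rank buckets), and add_cols and the "bucket" prefix are made-up names. I use base R split/lapply/rbind in place of ddply so it runs without plyr; ddply(df, "date", add_cols, cols = "val") should give roughly the same result.

```r
# Sample data from the question.
df <- data.frame(
  id   = rep(c("id1", "id2", "id3", "id4"), times = 2),
  date = rep(c("d1", "d2"), each = 4),
  val  = c(1, 2, 3, 4, 10, 20, 30, 40),
  stringsAsFactors = FALSE
)

# Hypothetical stand-in for myfun: put values into n equal-sized rank buckets.
myfun <- function(v, n = 2) ceiling(rank(v) * n / length(v))

# Add one new column per entry of `cols`; column names are passed as strings.
add_cols <- function(x, cols, prefix = "bucket") {
  newcols <- paste(prefix, cols, sep = "_")
  for (i in seq_along(cols)) {
    x[[newcols[i]]] <- myfun(x[[cols[i]]])
  }
  x
}

# Apply per date and stitch the pieces back together (base R stand-in for ddply).
pieces <- lapply(split(df, df$date), add_cols, cols = "val")
out <- do.call(rbind, pieces)
rownames(out) <- NULL
print(out)
```

Because the columns are looked up with x[[name]], there is no need for get() inside myfun; the function only ever sees the vector it is handed.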
There's probably something even cleverer using assign in the appropriate environment.
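I have not worked out the assign version, but if the goal is specifically to hand transform a column name held in a string, one option (a different technique from assign) is to build the argument list with setNames and call transform through do.call. This is a small sketch on toy data; new_name and the doubling step are placeholders for your own name and myfun.

```r
# Toy data; `new_name` holds the desired column name as a string.
df <- data.frame(id = c("id1", "id2"), val = c(1, 2))
new_name <- "val_doubled"

# Build list(df, val_doubled = <values>) and pass it to transform via do.call.
args <- c(list(df), setNames(list(df[["val"]] * 2), new_name))
out <- do.call(transform, args)
print(out)
```

The same call can sit inside the per-date function, with x in place of df.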
If I had a reproducible example I could test this.