Dynamically create new columns in data.table
Problem description
I have a data.table in R and want to create a new column. Let's say that I have the date column name saved as a variable, date_col, and want the new column's name to be that name with _year appended. I can do this the usual way by typing the new name out, but how can I build the new column name from the date_col variable?
Here is what I've tried. The last two statements, which are the ones I want, don't work.
library(data.table)
dat = data.table(one = 1:5, two = 1:5,
                 order_date = lubridate::ymd("2015-01-01","2015-02-01","2015-03-01",
                                             "2015-04-01","2015-05-01"))
dat
date_col = "order_date"
dat[,`:=`(OrderDate_year = substr(get(date_col)[!is.na(get(date_col))],1,4))][]
dat[,`:=`(new = substr(noquote(get(date_col))[!is.na(noquote(get(date_col)))],1,4))][]
dat[,`:=`(paste0(date_col, "_year", sep="") = substr(noquote(get(date_col))[!is.na(noquote(get(date_col)))],1,4))][]
dat[,`:=`(noquote(paste0(date_col, "_year", sep="")) = substr(noquote(get(date_col))[!is.na(noquote(get(date_col)))],1,4))][]
Recommended answer
The last two statements return an error message:
dat[,`:=`(paste0(date_col, "_year", sep="") = substr(noquote(get(date_col))[!is.na(noquote(get(date_col)))],1,4))][]
Error: unexpected '=' in "dat[,`:=`(paste0(date_col, "_year", sep="") ="
dat[,`:=`(noquote(paste0(date_col, "_year", sep="")) = substr(noquote(get(date_col))[!is.na(noquote(get(date_col)))],1,4))][]
Error: unexpected '=' in "dat[,`:=`(noquote(paste0(date_col, "_year", sep="")) ="
The correct syntax for calling the `:=`() function is:
dat[, `:=`(paste0(date_col, "_year", sep = ""),
substr(noquote(get(date_col))[!is.na(noquote(get(date_col)))], 1, 4))][]
dat[, `:=`(noquote(paste0(date_col, "_year", sep = "")),
substr(noquote(get(date_col))[!is.na(noquote(get(date_col)))], 1, 4))][]
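i.e., replace the = after the column-name expression with a comma, so that the name and the value are passed as two separate arguments.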
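The parse error can be reproduced without data.table at all. The following is a minimal sketch in base R, using a made-up list() call purely for illustration: inside a function call, the left side of = must be a literal name or string, so an expression such as paste0(...) is rejected before any code runs.

```r
# An argument name must be a literal symbol or string, so this text
# cannot even be parsed; try() captures the "unexpected '='" error.
res <- try(parse(text = 'list(paste0("a", "_year") = 1)'), silent = TRUE)
inherits(res, "try-error")

# A name computed at run time has to be attached some other way,
# for example with setNames().
date_col <- "order_date"
x <- setNames(list(2015L), paste0(date_col, "_year"))
names(x)
```

Passing the name as a separate argument, as in the corrected calls, sidesteps the problem because the name expression is then evaluated like any other argument.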
However, both the assignment syntax and the right-hand side are far more complicated than they need to be.
The order_date column is already of class Date:
str(dat)
Classes ‘data.table’ and 'data.frame': 5 obs. of 3 variables:
$ one : int 1 2 3 4 5
$ two : int 1 2 3 4 5
$ order_date: Date, format: "2015-01-01" "2015-02-01" ...
- attr(*, ".internal.selfref")=<externalptr>
To extract the year, the year() function can be used (from either the data.table package or the lubridate package, whichever was loaded last), so there is no need to convert the dates back to character and cut out the year substring:
date_col = "order_date"
dat[, paste0(date_col, "_year") := lapply(.SD, year), .SDcols = date_col][]
   one two order_date order_date_year
1:   1   1 2015-01-01            2015
2:   2   2 2015-02-01            2015
3:   3   3 2015-03-01            2015
4:   4   4 2015-04-01            2015
5:   5   5 2015-05-01            2015
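The lapply(.SD, ...) form extends naturally to several date columns at once. The sketch below assumes a second, hypothetical ship_date column and a dat2 table that are not part of the original question, and it calls data.table::year explicitly so the result does not depend on package load order.

```r
library(data.table)

# Hypothetical example data with two Date columns (ship_date is invented here)
dat2 <- data.table(
  id         = 1:3,
  order_date = as.Date(c("2015-01-01", "2015-02-01", "2016-03-01")),
  ship_date  = as.Date(c("2015-01-05", "2015-02-07", "2016-03-09"))
)

date_cols <- c("order_date", "ship_date")

# Creates one "<name>_year" column for each entry in date_cols
dat2[, paste0(date_cols, "_year") := lapply(.SD, data.table::year),
     .SDcols = date_cols]
dat2[]
```

The vector of new names on the left must have the same length as the list returned by lapply() on the right.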
Alternatively,
dat[, paste0(date_col, "_year") := year(get(date_col))][]
dat[, `:=`(paste0(date_col, "_year"), year(get(date_col)))][]
work as well.
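If the new name is already stored in a variable, data.table also accepts that variable wrapped in parentheses on the left of :=, and set() takes the column name as an ordinary character argument. This is a minimal sketch with a reduced copy of the example data; new_col is a helper variable introduced here for illustration.

```r
library(data.table)

dat <- data.table(
  one        = 1:5,
  order_date = as.Date(c("2015-01-01", "2015-02-01", "2015-03-01",
                         "2015-04-01", "2015-05-01"))
)

date_col <- "order_date"
new_col  <- paste0(date_col, "_year")   # hypothetical helper variable

# The parentheses make data.table evaluate new_col instead of
# creating a column literally named "new_col".
dat[, (new_col) := data.table::year(get(date_col))]

# set() does the same assignment by reference, without [.data.table
set(dat, j = new_col, value = data.table::year(dat[[date_col]]))
dat[]
```

Both calls modify dat in place, so no reassignment is needed.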