Dynamically create new columns in data.table


Question

I have a data.table in R and want to create a new column. Let's say that I have the name of the date column saved in a variable, date_col, and want the new column's name to be that name with _year appended. I can do this the normal way by writing the new name out literally, but how can I build the new column name from the date_col variable?

Here is what I've tried. The last two, which are the ones I want, don't work.

library(data.table)

dat = data.table(one = 1:5, two = 1:5,
                 order_date = lubridate::ymd("2015-01-01","2015-02-01","2015-03-01",
                           "2015-04-01","2015-05-01"))
dat
date_col = "order_date"
dat[,`:=`(OrderDate_year = substr(get(date_col)[!is.na(get(date_col))],1,4))][]
dat[,`:=`(new = substr(noquote(get(date_col))[!is.na(noquote(get(date_col)))],1,4))][]
dat[,`:=`(paste0(date_col, "_year", sep="") = substr(noquote(get(date_col))[!is.na(noquote(get(date_col)))],1,4))][]
dat[,`:=`(noquote(paste0(date_col, "_year", sep="")) = substr(noquote(get(date_col))[!is.na(noquote(get(date_col)))],1,4))][]

Recommended answer

The last two statements return an error message:

dat[,`:=`(paste0(date_col, "_year", sep="") = substr(noquote(get(date_col))[!is.na(noquote(get(date_col)))],1,4))][]

Error: unexpected '=' in "dat[,`:=`(paste0(date_col, "_year", sep="") ="

dat[,`:=`(noquote(paste0(date_col, "_year", sep="")) = substr(noquote(get(date_col))[!is.na(noquote(get(date_col)))],1,4))][]

Error: unexpected '=' in "dat[,`:=`(noquote(paste0(date_col, "_year", sep="")) ="
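
Both messages come from R's parser, before data.table ever evaluates the call: inside a function call, whatever stands to the left of = must be a literal argument name, not an expression. The following is a minimal sketch of that point using base R only; the function name f and the variables bad, ok, bad_result and ok_result are made up for illustration and are not part of the original question.

```r
# Hypothetical call strings; f is never evaluated, the strings are only parsed.
bad <- 'f(paste0("a", "b") = 1)'   # expression used as an argument name
ok  <- 'f(paste0("a", "b"), 1)'    # same expression passed as a positional argument

# parse() raises the syntax error for the first string, so capture its message.
bad_result <- tryCatch(parse(text = bad), error = function(e) conditionMessage(e))

# The second string parses without complaint and yields an expression object.
ok_result <- parse(text = ok)

cat(bad_result, "\n")
print(class(ok_result))
```

Since the failure is purely syntactic, no choice of helper such as noquote() on the left of = can make the named-argument form work.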

The correct syntax for calling the :=() function is:

dat[, `:=`(paste0(date_col, "_year", sep = ""), 
           substr(noquote(get(date_col))[!is.na(noquote(get(date_col)))], 1, 4))][]
dat[, `:=`(noquote(paste0(date_col, "_year", sep = "")), 
           substr(noquote(get(date_col))[!is.na(noquote(get(date_col)))], 1, 4))][]

that is, replace the = after the column-name expression with a comma.

However, both the assignment syntax and the right-hand side are far more complicated than they need to be.

The order_date column is already of class Date:

str(dat)

Classes ‘data.table’ and 'data.frame':    5 obs. of  3 variables:
 $ one       : int  1 2 3 4 5
 $ two       : int  1 2 3 4 5
 $ order_date: Date, format: "2015-01-01" "2015-02-01" ...
 - attr(*, ".internal.selfref")=<externalptr>

To extract the year, the year() function can be used (from either the data.table package or the lubridate package, whichever was loaded last), so there is no need to convert back to character and extract the year as a substring:

date_col = "order_date"
dat[, paste0(date_col, "_year") := lapply(.SD, year), .SDcols = date_col][]

   one two order_date order_date_year
1:   1   1 2015-01-01            2015
2:   2   2 2015-02-01            2015
3:   3   3 2015-03-01            2015
4:   4   4 2015-04-01            2015
5:   5   5 2015-05-01            2015
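
The lapply(.SD, ...) form pays off when more than one date column is involved, because the left-hand side can be a character vector of new names. Here is a small sketch under stated assumptions: the ship_date column and the date_cols variable are hypothetical additions for illustration, and year() is the one exported by data.table.

```r
library(data.table)

dat <- data.table(
  one        = 1:3,
  order_date = as.Date(c("2015-01-01", "2016-02-01", "2017-03-01")),
  ship_date  = as.Date(c("2015-01-05", "2016-02-07", "2018-01-02"))  # hypothetical second date column
)

# Names of the source columns, held in a variable as in the question.
date_cols <- c("order_date", "ship_date")

# One new "<name>_year" column per source column, added by reference.
dat[, paste0(date_cols, "_year") := lapply(.SD, year), .SDcols = date_cols]

print(names(dat))
```

The i-th new name is paired with the i-th element of the list returned by lapply(), so the order of date_cols determines which year lands in which column.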

Alternatively,

dat[, paste0(date_col, "_year") := year(get(date_col))][]
dat[, `:=`(paste0(date_col, "_year"), year(get(date_col)))][]

work as well.
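
Two further spellings are commonly seen and are sketched here for comparison; they are not part of the original answer. The first wraps a variable holding the new name in parentheses on the left of :=, the second uses data.table's set() function. The variable new_col and the tables dat_a and dat_b are introduced only for this illustration.

```r
library(data.table)

dat <- data.table(
  one        = 1:5,
  order_date = seq(as.Date("2015-01-01"), by = "month", length.out = 5)
)
date_col <- "order_date"
new_col  <- paste0(date_col, "_year")   # illustrative helper variable

# Variant 1: parentheses make data.table evaluate new_col and use its value
# as the column name, instead of creating a column literally called "new_col".
dat_a <- copy(dat)
dat_a[, (new_col) := year(get(date_col))]

# Variant 2: set() takes the column name as a plain character string.
dat_b <- copy(dat)
set(dat_b, j = new_col, value = year(dat_b[[date_col]]))

print(dat_a)
```

Both variants modify the table by reference, which is why each one works on its own copy() here.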
