R data.table compute new column, but insert at beginning


Question

In R data.table, I can use this syntax to add a new column:

> dt <- data.table(a=c(1,2), b=c(3,4))
> dt[, c := a + b]
> dt
   a b c
1: 1 3 4
2: 2 4 6

But how would I insert c at the front of dt, like so:

   c a b
1: 4 1 3
2: 6 2 4

I looked on SO and found some people suggesting cbind for data.frames, but it's more convenient for me to use the := syntax here, so I was wondering whether there is a data.table-sanctioned way of doing this. My data.table has around 100 columns, so I don't want to list them all out.

Recommended answer


Update: This feature has now been merged into the CRAN version of data.table (starting with v1.11.0), so installing the development version is no longer necessary to use it. From the release notes:


  1. setcolorder() now accepts fewer than ncol(DT) columns, which are moved to the front, #592. Thanks @MichaelChirico for the PR.


The development version of data.table at the time of writing (v1.10.5) updated setcolorder() to accept a partial list of columns, which makes this much more convenient. The columns provided are placed first, and all unspecified columns follow in their existing order.

Installation instructions for the development branch are available here.

Note regarding development branch stability: I've been running it for several months now to use the multi-threaded fread() in v1.10.5 (that alone is worth the update if you deal with multi-GB .csv files), and I have not noticed any bugs or regressions in my usage.

library(data.table)
DT <- as.data.table(mtcars)
DT[1:5]

gives

    mpg cyl disp  hp drat    wt  qsec vs am gear carb
1: 21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
2: 21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
3: 22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
4: 21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
5: 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2

Re-order columns based on a partial list:

setcolorder(DT,c("gear","carb"))
DT[1:5]

now gives

   gear carb  mpg cyl disp  hp drat    wt  qsec vs am
1:    4    4 21.0   6  160 110 3.90 2.620 16.46  0  1
2:    4    4 21.0   6  160 110 3.90 2.875 17.02  0  1
3:    4    1 22.8   4  108  93 3.85 2.320 18.61  1  1
4:    3    1 21.4   6  258 110 3.08 3.215 19.44  1  0
5:    3    2 18.7   8  360 175 3.15 3.440 17.02  0  0
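
Applied to the example from the question, the same call might look like the sketch below. It assumes data.table v1.11.0 or later, where setcolorder() accepts a partial list of columns; the object names dt, a, b and c are taken from the question.

```r
library(data.table)

dt <- data.table(a = c(1, 2), b = c(3, 4))

# Add the new column by reference; := appends it as the last column.
dt[, c := a + b]

# Move c to the front by reference; a and b keep their existing order.
# Assumes data.table >= 1.11.0 (partial column lists in setcolorder()).
setcolorder(dt, "c")

names(dt)
# [1] "c" "a" "b"
```

Both steps modify dt in place, so no copy of the roughly 100-column table is made and none of the other column names has to be spelled out.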




If for any reason you don't want to update to the development branch, the following works in previous (and current CRAN) versions.

newCols <- c("gear","carb")
setcolorder(DT, c(newCols, setdiff(colnames(DT), newCols))) ## (Per Frank's advice in comments)

## the long way I'd always done before seeing setdiff()
## setcolorder(DT,c(newCols,colnames(DT)[which(!colnames(DT) %in% newCols)]))
