R data.table: compute a new column, but insert it at the beginning

Question

In R data.tables, I can use this syntax to add a new column:

> dt <- data.table(a=c(1,2), b=c(3,4))
> dt[, c := a + b]
> dt
   a b c
1: 1 3 4
2: 2 4 6

But how would I insert c at the front of dt, like so:

   c a b
1: 4 1 3
2: 6 2 4

I looked on SO and found some people suggesting cbind for data.frames, but it's more convenient for me to use the := syntax here, so I was wondering whether there is a data.table-sanctioned way of doing this. My data.table has around 100 columns, so I don't want to list them all out.

Recommended answer

Update: This feature has now been merged into the CRAN release of data.table (starting with v1.11.0), so installing the development version is no longer necessary to use it. From the release notes:

  1. setcolorder() now accepts fewer than ncol(DT) columns to be moved to the front, #592. Thanks @MichaelChirico for the PR.

The original answer, written against the then-current development version of data.table (v1.10.5): setcolorder() has been updated to accept a partial list of columns, which makes this much more convenient. The columns provided are placed first, and all non-specified columns follow in their existing order.

Installation instructions for the development branch here.

Note regarding development branch stability: I've been running it for several months now to use the multi-threaded fread() in v1.10.5 (that alone is worth the update if you deal with multi-GB .csv files), and I have not noticed any bugs or regressions in my usage.

library(data.table)
DT <- as.data.table(mtcars)
DT[1:5]

gives

    mpg cyl disp  hp drat    wt  qsec vs am gear carb
1: 21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
2: 21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
3: 22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
4: 21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
5: 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2

Reorder columns based on a partial list:

setcolorder(DT,c("gear","carb"))
DT[1:5]

now gives

   gear carb  mpg cyl disp  hp drat    wt  qsec vs am
1:    4    4 21.0   6  160 110 3.90 2.620 16.46  0  1
2:    4    4 21.0   6  160 110 3.90 2.875 17.02  0  1
3:    4    1 22.8   4  108  93 3.85 2.320 18.61  1  1
4:    3    1 21.4   6  258 110 3.08 3.215 19.44  1  0
5:    3    2 18.7   8  360 175 3.15 3.440 17.02  0  0
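
Applied to the table from the question, the same idea can be sketched as follows. This is a minimal illustration rather than the only approach, and it assumes data.table v1.11.0 or later so that setcolorder() accepts a partial column list:

```r
library(data.table)

# The table from the question
dt <- data.table(a = c(1, 2), b = c(3, 4))

# Add the new column by reference; := appends it as the last column
dt[, c := a + b]

# Move "c" to the front; the unspecified columns keep their existing order
# (assumes data.table >= 1.11.0, where partial column lists are accepted)
setcolorder(dt, "c")

names(dt)
```

Both steps modify dt by reference, so no copy of the table is made and none of the other columns need to be named.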


If for any reason you don't want to update to the development branch, the following works in previous (and current CRAN) versions.

newCols <- c("gear","carb")
setcolorder(DT, c(newCols, setdiff(colnames(DT), newCols))) ## (Per Frank's advice in comments)

## the long way I'd always done before seeing setdiff()
## setcolorder(DT,c(newCols,colnames(DT)[which(!colnames(DT) %in% newCols)]))
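
For versions without partial-list support, the setdiff() approach could be wrapped in a small helper. The function name move_to_front is hypothetical (it is not part of data.table), and this is only a sketch of one way to package the idea:

```r
library(data.table)

# Hypothetical helper, not part of data.table: move `cols` to the front of DT
move_to_front <- function(DT, cols) {
  # Fail early if a requested column does not exist
  stopifnot(all(cols %in% names(DT)))
  # Full ordering: requested columns first, then the rest in existing order
  setcolorder(DT, c(cols, setdiff(names(DT), cols)))
  invisible(DT)
}

# The table from the question
dt <- data.table(a = c(1, 2), b = c(3, 4))
dt[, c := a + b]

move_to_front(dt, "c")
names(dt)
```

Because setcolorder() reorders by reference, the caller's table is changed in place even though the helper returns it invisibly.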
