What does .SD stand for in data.table in R
Question
.SD looks useful, but I do not really know what I am doing with it. What does it stand for? Why is there a preceding period (full stop)? What is happening when I use it?
I read:
".SD is a data.table containing the subset of x's data for each group, excluding the group column(s). It can be used when grouping by i, when grouping by by, keyed by, and ad hoc by."
Does that mean that the daughter data.tables are held in memory for the next operation?
Recommended answer
.SD stands for something like "Subset of Data.table". There's no significance to the initial ".", except that it makes it even more unlikely that there will be a clash with a user-defined column name.
If this is your data.table:
library(data.table)
DT = data.table(x=rep(c("a","b","c"),each=2), y=c(1,3), v=1:6)
setkey(DT, y)
DT
# y x v
# [1,] 1 a 1
# [2,] 1 b 3
# [3,] 1 c 5
# [4,] 3 a 2
# [5,] 3 b 4
# [6,] 3 c 6
Doing this may help you see what .SD is:
DT[, .SD[,paste(x,v, sep="", collapse="_")], by=y]
# y V1
# [1,] 1 a1_b3_c5
# [2,] 3 a2_b4_c6
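To make the grouped call above more concrete, here is a minimal sketch that compares it with a hand-rolled loop over the values of y. The names `grouped`, `manual`, `val` and `sub` are made up for illustration, and the loop only mimics the result; it is not how data.table works internally.

```r
library(data.table)

DT <- data.table(x = rep(c("a", "b", "c"), each = 2), y = c(1, 3), v = 1:6)
setkey(DT, y)

# Grouped version from the answer: .SD is the subset of rows for each y
grouped <- DT[, .SD[, paste(x, v, sep = "", collapse = "_")], by = y]

# Hand-rolled equivalent (illustrative only): subset manually for each y
manual <- sapply(unique(DT$y), function(val) {
  sub <- DT[y == val, .(x, v)]  # roughly what .SD holds for this group
  paste(sub$x, sub$v, sep = "", collapse = "_")
})

print(grouped$V1)
print(manual)
```

Both vectors should contain the same two strings, one per value of y.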
Basically, the by=y statement breaks the original data.table into these two sub-data.tables
DT[,print(.SD),by=y]
x v # 1st sub-data.table, called '.SD' while it's being operated on
[1,] a 1
[2,] b 3
[3,] c 5
x v # 2nd sub-data.table, ALSO called '.SD' while it's being operated on
[1,] a 2
[2,] b 4
[3,] c 6
and operates on them in turn.
While it is operating on either one, it lets you refer to the current sub-data.table by using the nick-name/handle/symbol .SD. That's very handy, as you can access and operate on the columns just as if you were sitting at the command line working with a single data.table called .SD ... except that here, data.table will carry out those operations on every single sub-data.table defined by combinations of the key, "pasting" them back together and returning the results in a single data.table!
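A few more small examples in the same spirit may help; this is only a sketch using the same DT, and the result names (`sd_cols`, `first_rows`, `sums`) are invented here. It checks which columns .SD exposes, takes the first row of each sub-data.table, and sums v per group using .SDcols to restrict .SD to that column.

```r
library(data.table)

DT <- data.table(x = rep(c("a", "b", "c"), each = 2), y = c(1, 3), v = 1:6)
setkey(DT, y)

# Column names visible inside .SD: the grouping column y is left out
sd_cols <- DT[, .(cols = paste(names(.SD), collapse = ",")), by = y]

# First row of each sub-data.table, stitched back into one data.table
first_rows <- DT[, .SD[1], by = y]

# Sum of v within each group; .SDcols restricts .SD to the named columns
sums <- DT[, lapply(.SD, sum), by = y, .SDcols = "v"]

print(sd_cols)
print(first_rows)
print(sums)
```

In each case the expression in j is evaluated once per group, and the per-group results come back as a single data.table with one block of rows per value of y. For this DT the sums of v are 9 for y = 1 and 12 for y = 3.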