What does .SD stand for in data.table in R


Question

.SD looks useful, but I do not really know what I am doing with it. What does it stand for? Why is there a preceding period (full stop)? What is happening when I use it?

I read:

.SD is a data.table containing the subset of x's data for each group, excluding the group column(s). It can be used when grouping by i, when grouping by by, keyed by, and ad hoc by

Does that mean that the daughter data.tables are held in memory for the next operation?

Recommended answer

.SD stands for something like "Subset of Data.table". There is no significance to the initial ".", except that it makes a clash with a user-defined column name even less likely.
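To illustrate the naming point, here is a minimal sketch in which a table has a user-defined column called SD. The table `sales` and its columns `g` and `SD` are hypothetical names made up for this example. Inside `j`, `SD` refers to the column, while `.SD` still refers to the per-group subset, so the two do not collide.

```r
library(data.table)

# Hypothetical table with a user-defined column that happens to be named SD
sales <- data.table(g = c("a", "a", "b"), SD = c(10, 20, 30))

# SD is the user's column; .SD is the current group's subset (here holding only SD)
res <- sales[, .(total = sum(SD), n = nrow(.SD)), by = g]
res
```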

If this is your data.table:

library(data.table)
DT = data.table(x=rep(c("a","b","c"),each=2), y=c(1,3), v=1:6)
setkey(DT, y)
DT
#      y x v
# [1,] 1 a 1
# [2,] 1 b 3
# [3,] 1 c 5
# [4,] 3 a 2
# [5,] 3 b 4
# [6,] 3 c 6

Doing this may help you see what .SD is:

DT[, .SD[,paste(x,v, sep="", collapse="_")], by=y]
#      y       V1
# [1,] 1 a1_b3_c5
# [2,] 3 a2_b4_c6
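As a rough check of what the call above is doing, the sketch below compares it with the same expression written directly in `j`. It rebuilds the example table so that it runs on its own; the names `via_sd` and `direct` are introduced here for illustration only. The point is that `.SD[, ...]` is an ordinary data.table query run on each group's subset, so both forms should produce the same strings.

```r
library(data.table)

DT <- data.table(x = rep(c("a", "b", "c"), each = 2), y = c(1, 3), v = 1:6)
setkey(DT, y)

# Query the per-group subset explicitly through .SD
via_sd <- DT[, .SD[, paste(x, v, sep = "", collapse = "_")], by = y]

# Same expression evaluated directly in j, where x and v already refer to the group's rows
direct <- DT[, paste(x, v, sep = "", collapse = "_"), by = y]

identical(via_sd$V1, direct$V1)
```

When the columns are used directly like this, going through `.SD` is not needed; `.SD` becomes useful when the whole subset has to be handled as a table.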

Basically, the by=y statement breaks the original data.table into these two sub-data.tables:

DT[,print(.SD),by=y]
     x v   # 1st sub-data.table, called '.SD' while it's being operated on
[1,] a 1
[2,] b 3
[3,] c 5
     x v   # 2nd sub-data.table, ALSO called '.SD' while it's being operated on
[1,] a 2
[2,] b 4
[3,] c 6

and operates on them in turn.

While it is operating on either one, it lets you refer to the current sub-data.table by the nickname/handle/symbol .SD. That is very handy, because you can access and operate on the columns just as if you were sitting at the command line working with a single data.table called .SD ... except that here, data.table will carry out those operations on every single sub-data.table defined by combinations of the key, "pasting" them back together and returning the results in a single data.table!
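A short sketch of a few per-group operations on .SD, using the same example table (rebuilt here so the block runs on its own). The result names `top`, `sums` and `cols` are introduced for illustration, and `.SDcols` is a standard data.table argument that the answer above does not mention.

```r
library(data.table)

DT <- data.table(x = rep(c("a", "b", "c"), each = 2), y = c(1, 3), v = 1:6)

# Within each group, keep the row of .SD that has the largest v
top <- DT[, .SD[which.max(v)], by = y]

# Apply a function to each column of .SD; .SDcols restricts .SD to the named columns
sums <- DT[, lapply(.SD, sum), by = y, .SDcols = "v"]

# List the columns .SD contains in each group; the grouping column y is excluded
cols <- DT[, .(cols = paste(names(.SD), collapse = ",")), by = y]
```

In each case the expression in `j` is evaluated once per group, and the per-group results are bound back together into one data.table.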
