What does .SD stand for in data.table in R?


Question

.SD looks useful, but I do not really know what I am doing with it. What does it stand for? Why is there a preceding period (full stop)? What happens when I use it?

I read: .SD is a data.table containing the subset of x's data for each group, excluding the group column(s). It can be used when grouping by i, when grouping by by, keyed by, and _ad hoc_ by.

Does that mean that the daughter data.tables are held in memory for the next operation?

Recommended answer

.SD stands for something like "Subset of Data.table". The initial "." has no special meaning, except that it makes a clash with a user-defined column name even less likely.
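To illustrate the point about name clashes, here is a minimal sketch. The table `DT2` and its columns `g` and `SD` are made up for this example and are not part of the original answer. It shows a user column named `SD` coexisting with the special symbol `.SD` in the same expression.

```r
library(data.table)

# Hypothetical table with a user-defined column that happens to be called SD
DT2 <- data.table(g = c(1, 1, 2), SD = c(10, 20, 5))

# SD refers to the user's column; .SD refers to the per-group subset,
# which here contains only the SD column because g is the grouping column.
res <- DT2[, .(total = sum(SD), ncols = ncol(.SD)), by = g]
res
```

Because of the leading dot, the two names never collide, so neither lookup is ambiguous.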

If this is your data.table:

library(data.table)
DT = data.table(x=rep(c("a","b","c"),each=2), y=c(1,3), v=1:6)
setkey(DT, y)
DT
#    x y v
# 1: a 1 1
# 2: b 1 3
# 3: c 1 5
# 4: a 3 2
# 5: b 3 4
# 6: c 3 6

Doing this may help you see what .SD is:

DT[ , .SD[ , paste(x, v, sep="", collapse="_")], by=y]
#    y       V1
# 1: 1 a1_b3_c5
# 2: 3 a2_b4_c6

Basically, the by=y statement breaks the original data.table into these two sub-data.tables

DT[ , print(.SD), by=y]
# <1st sub-data.table, called '.SD' while it's being operated on>
#    x v
# 1: a 1
# 2: b 3
# 3: c 5
# <2nd sub-data.table, ALSO called '.SD' while it's being operated on>
#    x v
# 1: a 2
# 2: b 4
# 3: c 6
# <final output, since print() doesn't return anything>
# Empty data.table (0 rows) of 1 col: y

and operates on them in turn.
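The split-and-operate idea can be sketched by hand. This is only an illustration of the concept, not how data.table is implemented internally, and the names `manual` and `grouped` are introduced here for the comparison.

```r
library(data.table)

DT <- data.table(x = rep(c("a", "b", "c"), each = 2), y = c(1, 3), v = 1:6)
setkey(DT, y)

# Manual version: for each value of y, take the matching rows without the
# grouping column and run the same paste() on that piece.
manual <- vapply(unique(DT$y), function(g) {
  piece <- DT[y == g, .(x, v)]   # plays the role of .SD for this group
  paste(piece$x, piece$v, sep = "", collapse = "_")
}, character(1))

# Grouped version: data.table builds each piece and exposes it as .SD.
grouped <- DT[, .SD[, paste(x, v, sep = "", collapse = "_")], by = y]

identical(manual, grouped$V1)
```

Both routes produce the same strings. The grouped form just saves you from writing the loop and from reassembling the results yourself.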

While it is operating on either one, it lets you refer to the current sub-data.table by the nickname/handle/symbol .SD. That's very handy, because you can access and operate on the columns just as if you were sitting at the command line working with a single data.table called .SD ... except that here, data.table carries out those operations on every sub-data.table defined by the grouping in by, "pastes" the results back together, and returns them in a single data.table!
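A few further expressions, sketched on the same example table, show .SD being treated like an ordinary data.table inside a grouped call. The result names `sd_cols`, `top` and `maxima` are chosen here for illustration; `.SDcols` is the data.table argument that limits which columns appear in .SD.

```r
library(data.table)

DT <- data.table(x = rep(c("a", "b", "c"), each = 2), y = c(1, 3), v = 1:6)
setkey(DT, y)

# .SD holds every column except the grouping column y.
sd_cols <- DT[, .(cols = paste(names(.SD), collapse = ",")), by = y]

# Ordinary row subsetting on .SD: keep the row with the largest v in each group.
top <- DT[, .SD[which.max(v)], by = y]

# Apply a function to each column of .SD; .SDcols restricts .SD to the listed columns.
maxima <- DT[, lapply(.SD, max), by = y, .SDcols = "v"]
```

In each case the expression in j is written once, as if for a single table, and data.table evaluates it per group and stacks the pieces into one result.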
