Combining .SD with a renamed variable messes with the names of .SD columns
Question
Within my code, I would like to select some variables programmatically, and select and rename some others in a hard-coded way. I know that I could achieve this in two steps with setnames(), yet I am curious how to do it in a single step.
I think I am quite close to it via .SDcols. However, when I try to combine .SD with the renamed column, the .SDcols columns are prefixed with ".SD.". How can the prefix be avoided?
library(data.table)
dt <- as.data.table(mtcars)[1:5]
dt
#> mpg cyl disp hp drat wt qsec vs am gear carb
#> 1: 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4
#> 2: 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4
#> 3: 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
#> 4: 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1
#> 5: 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2
my_vars <- c("cyl", "vs")
# with .SDcols
dt[, .(.SD, z = gear), .SDcols = my_vars]
#> .SD.cyl .SD.vs z # Note the prefix that has been added to the .SDcols columns
#> 1: 6 0 4
#> 2: 6 0 4
#> 3: 4 1 4
#> 4: 6 1 3
#> 5: 8 0 3
# with named vector
all_vars <- c(my_vars, z = "gear")
dt[, ..all_vars]
#> cyl vs gear
#> 1: 6 0 4
#> 2: 6 0 4
#> 3: 4 1 4
#> 4: 6 1 3
#> 5: 8 0 3
Recommended answer
I assume this is because you wrap .SD in list (.()). list(.SD) generates a list containing the .SD data.table as a single element, instead of only .SD itself. This then messes with the naming.
Check str of .SD wrapped in list:
# dt[, str(.(.SD)), .SDcols = my_vars]
# List of 1
# $ :Classes ‘data.table’ and 'data.frame': 5 obs. of 2 variables:
# ..$ cyl: num [1:5] 6 6 4 6 8
# ..$ vs : num [1:5] 0 0 1 1 0
The corresponding output has the ".SD." prefix:
dt[, .(.SD), .SDcols = my_vars]
# .SD.cyl .SD.vs
# 1: 6 0
# 2: 6 0
# 3: 4 1
# 4: 6 1
# 5: 8 0
Check str of .SD only:
# dt[, str(.SD), .SDcols = my_vars]
# Classes ‘data.table’ and 'data.frame': 5 obs. of 2 variables:
# $ cyl: num 6 6 4 6 8
# $ vs : num 0 0 1 1 0
A basic property of j is that "as long as j returns a list, each element of the list becomes a column in the resulting data.table". Since .SD already is a list (check dt[ , is.list(.SD)]), we can use c to combine .SD with other list elements, e.g. your renamed column wrapped in list:
dt[, c(.SD, .(z = gear)), .SDcols = my_vars]
# cyl vs z
# 1: 6 0 4
# 2: 6 0 4
# 3: 4 1 4
# 4: 6 1 3
# 5: 8 0 3
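The difference between wrapping with list and combining with c can also be seen without data.table. The following is a minimal base R sketch, not taken from the original answer: sd_like is a hypothetical stand-in for .SD (a plain data.frame, which is likewise a list of columns), and gear is a made-up vector.

```r
# Base R only. sd_like is a hypothetical stand-in for .SD:
# a data.frame is, like .SD, a list of equal-length columns.
sd_like <- data.frame(cyl = c(6, 4), vs = c(0, 1))
gear <- c(4, 3)

# Analogous to .(.SD, z = gear): the whole table becomes ONE list element.
nested <- list(sd_like, z = gear)

# Analogous to c(.SD, .(z = gear)): the columns are spliced in one by one.
flat <- c(sd_like, list(z = gear))

length(nested)  # 2: the data.frame as a single element, then z
names(flat)     # "cyl" "vs" "z"
```

Because flat is already a flat list of columns, no outer name is left over to be prepended to the inner column names.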