Combining .SD with renamed variable messes with names of .SD columns


Problem description

Within my code, I would like to select some variables programmatically and to select and rename some others in a hard-coded way. I know that I could achieve this in two steps with setnames(), yet I am curious how to do it in a single step.

I think I am quite close to it via .SDcols. However, when I try to combine .SD with the renamed column, the columns selected through .SDcols are prefixed with ".SD.". How can the prefix be avoided?

library(data.table)
dt <- as.data.table(mtcars)[1:5]
dt
#>     mpg cyl disp  hp drat    wt  qsec vs am gear carb
#> 1: 21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
#> 2: 21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
#> 3: 22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
#> 4: 21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
#> 5: 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2

my_vars <- c("cyl", "vs")
# with .SDcol
dt[, .(.SD, z = gear), .SDcol = my_vars]
#>    .SD.cyl .SD.vs z    # Note the prefix that had been added to the .SDcols
#> 1:       6      0 4
#> 2:       6      0 4
#> 3:       4      1 4
#> 4:       6      1 3
#> 5:       8      0 3

# with named vector
all_vars <- c(my_vars, z = "gear")
dt[, ..all_vars]
#>    cyl vs gear
#> 1:   6  0    4
#> 2:   6  0    4
#> 3:   4  1    4
#> 4:   6  1    3
#> 5:   8  0    3


Recommended answer

I assume this is because you wrap .SD in list (.()). list(.SD) generates a list that contains the .SD data.table as a single element, instead of the columns of .SD themselves. This then messes with the naming.

Check str of .SD wrapped in a list:

# dt[, str(.(.SD)), .SDcol = my_vars]
# List of 1
# $ :Classes ‘data.table’ and 'data.frame': 5 obs. of  2 variables:
#   ..$ cyl: num [1:5] 6 6 4 6 8
#   ..$ vs : num [1:5] 0 0 1 1 0

The corresponding output has the .SD. prefix:

dt[ , .(.SD), .SDcol = my_vars]
#    .SD.cyl .SD.vs
# 1:       6      0
# 2:       6      0
# 3:       4      1
# 4:       6      1
# 5:       8      0

Check str of .SD only:

# dt[, str(.SD), .SDcol = my_vars]
# Classes ‘data.table’ and 'data.frame':    5 obs. of  2 variables:
#   $ cyl: num  6 6 4 6 8
#   $ vs : num  0 0 1 1 0
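
The difference between the two str() outputs can be reproduced with plain lists. The following is a minimal base R sketch, not a description of data.table internals; the names inner and wrapped are made up for illustration, with inner standing in for .SD.

```r
# Hypothetical stand-in for .SD: a named list with one element per column
inner <- list(cyl = c(6, 4), vs = c(0, 1))

# Wrapping it in list() nests it one level deeper, like .(.SD)
wrapped <- list(inner)

length(inner)        # 2: one element per column
length(wrapped)      # 1: the whole list is now a single element
names(wrapped)       # NULL: the column names sit one level down
names(wrapped[[1]])  # "cyl" "vs"
```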

Given the basic property of j - "As long as j returns a list, each element of the list becomes a column in the resulting data.table" - and that .SD already is a list (check dt[ , is.list(.SD)]), we can use c to combine .SD with other list elements, e.g. your renamed column wrapped in list:

dt[, c(.SD, .(z = gear)), .SDcol = my_vars]

#    cyl vs z
# 1:   6  0 4
# 2:   6  0 4
# 3:   4  1 4
# 4:   6  1 3
# 5:   8  0 3
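
The same pattern should extend to more than one renamed column. The sketch below assumes the same dt and my_vars as above, adds a second, arbitrarily chosen renamed column (w = carb), and compares the result with the two-step setnames() route mentioned in the question. The names one_step and two_step are made up for illustration.

```r
library(data.table)

dt <- as.data.table(mtcars)[1:5]
my_vars <- c("cyl", "vs")

# One step: splice .SD together with a list of renamed columns
one_step <- dt[, c(.SD, .(z = gear, w = carb)), .SDcols = my_vars]
names(one_step)  # "cyl" "vs" "z" "w"

# Two steps: select by name first, then rename by reference with setnames()
two_step <- dt[, c(my_vars, "gear", "carb"), with = FALSE]
setnames(two_step, c("gear", "carb"), c("z", "w"))

# Both routes are expected to give the same columns and values
all.equal(one_step, two_step)
```

The one-step form keeps the programmatic selection in .SDcols and the hard-coded renames in the list passed to c(), so the two can be changed independently.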
