Select multiple ranges of columns using column names in data.table
Question
Suppose I have a data table:
library(data.table)
dt <- data.table(matrix(1:50, nrow = 5))
setnames(dt, letters[1:10])
> dt
a b c d e f g h i j
1: 1 6 11 16 21 26 31 36 41 46
2: 2 7 12 17 22 27 32 37 42 47
3: 3 8 13 18 23 28 33 38 43 48
4: 4 9 14 19 24 29 34 39 44 49
5: 5 10 15 20 25 30 35 40 45 50
I want to select several discontinuous ranges of columns, such as a, c:d, f:h and j. This can be done easily with dplyr's select():
library(dplyr)
dt %>% select(a, c:d, f:h, j)
I am looking for a data.table way of achieving the same.
Right now, I can either select columns individually in any order, as in dt[, .(a, c)], or give just one sequence of column names of the form startcol:endcol:
dt[, c:f]
However, I can't combine these two methods to select several column ranges in one shot in .SDcols, the way I did with dplyr::select.
Recommended answer
We can put the range in .SDcols and then add the other column by concatenating it with .SD:
dt[, c(list(a = a), .SD), .SDcols = c:d]
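The answer does not show the result of this call, so here is a minimal sketch of it on the example table from the question. The name res_one_range is only an illustrative choice, not part of the original answer.

```r
library(data.table)

dt <- data.table(matrix(1:50, nrow = 5))
setnames(dt, letters[1:10])

# .SD holds only the columns named in .SDcols (here the range c:d);
# column a is prepended by concatenating it as a one-element list.
res_one_range <- dt[, c(list(a = a), .SD), .SDcols = c:d]

# The columns appear in the order they were concatenated: a first, then c and d.
names(res_one_range)
```

This handles one range plus individually named columns; for several ranges, the column names need to be built up first.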
If there are multiple ranges, we use match to find the start and end positions of each range, expand each pair into a sequence of positions, and then get the corresponding column names:
i1 <- match(c("c", "f"), names(dt))
j1 <- match(c("d", "h"), names(dt))
nm1 <- c("a", names(dt)[unlist(Map(`:`, i1, j1))], "j")
dt[, ..nm1]
# a c d f g h j
#1: 1 11 16 26 31 36 46
#2: 2 12 17 27 32 37 47
#3: 3 13 18 28 33 38 48
#4: 4 14 19 29 34 39 49
#5: 5 15 20 30 35 40 50
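If this pattern comes up often, the match step above could be wrapped in a small helper. The following is only a sketch: expand_cols is a hypothetical function, not part of data.table, and it assumes each spec is either a single column name or a "start:end" string whose endpoints both exist in the table.

```r
library(data.table)

dt <- data.table(matrix(1:50, nrow = 5))
setnames(dt, letters[1:10])

# Hypothetical helper (not a data.table function): expands specs such as
# "a" or "c:d" into the full vector of column names they cover.
expand_cols <- function(dt, specs) {
  nms <- names(dt)
  unlist(lapply(specs, function(s) {
    ends <- strsplit(s, ":", fixed = TRUE)[[1]]
    idx <- match(ends, nms)                  # positions of the endpoint(s)
    if (anyNA(idx)) stop("unknown column in spec: ", s)
    nms[idx[1]:idx[length(idx)]]             # a single name yields itself
  }))
}

cols <- expand_cols(dt, c("a", "c:d", "f:h", "j"))
res <- dt[, ..cols]   # the .. prefix looks up cols in the calling scope
```

Taking the specs as strings keeps the helper free of non-standard evaluation, at the cost of a little extra typing compared with dplyr's select().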
In addition, with dplyr loaded, the dplyr method can be used inside data.table by applying select to .SD:
dt[, select(.SD, a, c:d, f:h, j)]
# a c d f g h j
#1: 1 11 16 26 31 36 46
#2: 2 12 17 27 32 37 47
#3: 3 13 18 28 33 38 48
#4: 4 14 19 29 34 39 49
#5: 5 15 20 30 35 40 50