Replace sets of rows in a data.table with a single row
Question
I have the following data.table:
library(data.table)
DT <- data.table(id = c(1,1,1,1,2,2,2,2),
                 place = c("a","b","c","d","a","b","d","e"),
                 seq = c(1,2,3,4,1,2,3,4))
setkey(DT, id)
The data.table is ordered by id and seq:
setorder(DT,id,seq)
For every id, I want to look for the sequence b, c, d. If it occurs, I want to replace the b and c rows with a single row, say z, keeping the other columns' values as in the b row (this is what the expected output below shows).
So in this case the new data.table should be
DT.tobe <- data.table(id = c(1,1,1,2,2,2,2),
                      place = c("a","z","d","a","b","d","e"),
                      seq = c(1,2,4,1,2,3,4))
> DT.tobe
   id place seq
1:  1     a   1
2:  1     z   2
3:  1     d   4
4:  2     a   1
5:  2     b   2
6:  2     d   3
7:  2     e   4
I have to say that I have no idea what to try. I could accept answers with data.frame solutions too!
Recommended answer
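res = setkey(DT[, {
  # w: row positions within this id where place reads b, c, d on consecutive rows
  w = setDT(shift(place, 0:2, type="lead"))[.("b","c","d"), on=.(V1,V2,V3), which=TRUE, nomatch=0]
  if (length(w)){
    # w2: the b and c rows that get replaced
    w2 = c(w, w + 1L)
    rbind(
      .SD[-w2],
      # a copy of the b row, relabelled as z
      copy(.SD[w])[, place := "z"]
    )
  } else .SD
}, by=id], id, seq)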
which gives
   id place seq
1:  1     a   1
2:  1     z   2
3:  1     d   4
4:  2     a   1
5:  2     b   2
6:  2     d   3
7:  2     e   4
Positions w are found using a join against the sequence b, c, d. From there, we identify which rows to drop (w and the one after it, i.e. the b and c rows), which row to add back (a copy of row w), and what to modify in it (place := "z"). The final setkey re-sorts by id and seq, so the new z row ends up where the b row was.
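To make the join step easier to follow, here is a minimal sketch that runs it outside the grouped call, on the place values of a single id. The names windows, place_id1, place_id2, w_id1 and w_id2 are introduced here for illustration only; the calls themselves are the ones used in the answer.

```r
library(data.table)

# place values per id, copied from the question's example data
place_id1 <- c("a", "b", "c", "d")
place_id2 <- c("a", "b", "d", "e")

# shift() with n = 0:2 and type = "lead" returns a list of three vectors:
# the value itself, the next value, and the value two rows ahead.
# setDT() turns that unnamed list into a data.table with columns V1, V2, V3.
windows <- setDT(shift(place_id1, 0:2, type = "lead"))
print(windows)

# Join against the one-row pattern (b, c, d); which = TRUE returns the
# matching row numbers instead of the joined rows.
w_id1 <- windows[.("b", "c", "d"), on = .(V1, V2, V3), which = TRUE, nomatch = 0]
print(w_id1)  # 2: the pattern starts at the second row of id 1

# For id 2 the pattern never occurs, so the result has length zero
w_id2 <- setDT(shift(place_id2, 0:2, type = "lead"))[
  .("b", "c", "d"), on = .(V1, V2, V3), which = TRUE, nomatch = 0]
print(length(w_id2))
```

Each row of windows is a three-row look-ahead, which is why a single equi-join is enough to locate every start position of the pattern within a group.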
There are too many different directions in which this might be generalized, so it is probably better to post a new question if a more complicated case comes up.
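Since the question also allows data.frame solutions, the same idea can be sketched in base R. This is not the answer's method but a plain split/apply/combine alternative; collapse_bcd, DF and res_df are hypothetical names, and the sketch assumes the rows are already ordered by id and seq.

```r
# Example data from the question, as a plain data.frame
DF <- data.frame(
  id = c(1, 1, 1, 1, 2, 2, 2, 2),
  place = c("a", "b", "c", "d", "a", "b", "d", "e"),
  seq = c(1, 2, 3, 4, 1, 2, 3, 4),
  stringsAsFactors = FALSE
)

# Hypothetical helper: handles the rows of one id at a time
collapse_bcd <- function(d) {
  n <- nrow(d)
  if (n < 3) return(d)
  i <- seq_len(n - 2)
  # start positions where place reads b, c, d on consecutive rows
  w <- i[d$place[i] == "b" & d$place[i + 1] == "c" & d$place[i + 2] == "d"]
  if (length(w) == 0) return(d)
  d$place[w] <- "z"             # the b row becomes z and keeps its other columns
  d[-(w + 1), , drop = FALSE]   # drop the c row that followed it
}

# Apply per id, then stack the pieces back together
res_df <- do.call(rbind, lapply(split(DF, DF$id), collapse_bcd))
rownames(res_df) <- NULL
print(res_df)
```

Because the b row is modified in place rather than re-appended, no re-sorting is needed afterwards.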