Replace sets of rows in a data.table with a single row


Question

I have the following data.table:

library(data.table)

DT <- data.table(id = c(1,1,1,1,2,2,2,2),
                 place = c("a","b","c","d","a","b","d","e"),
                 seq = c(1,2,3,4,1,2,3,4))
setkey(DT, id)

The data.table is ordered by id and seq:

setorder(DT,id,seq)

For every id, I want to look for the sequence b, c, d in place. Where it occurs, I want to replace the b and c rows with a single row, say z, keeping the values of the other columns from the b row (as the expected output below shows, where z has seq 2).

So in this case the new data.table should be:

DT.tobe<- data.table(id=c(1,1,1,2,2,2,2),
                     place = c("a","z","d","a","b","d","e"),
                     seq = c(1,2,4,1,2,3,4))
> DT.tobe
   id place seq
1:  1     a   1
2:  1     z   2
3:  1     d   4
4:  2     a   1
5:  2     b   2
6:  2     d   3
7:  2     e   4

I have to say that I have no idea what to try. I would accept answers with data.frame solutions too!

Recommended answer

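res = setkey(DT[, {
  # build windows of the current value and the next two, then join on b, c, d
  # to get the row numbers w where such a run starts within this id
  w = setDT(shift(place, 0:2, type="lead"))[.("b","c","d"), on=.(V1,V2,V3), which=TRUE, nomatch=0]
  if (length(w)){
    w2 = c(w, w + 1L)                  # the b and c rows to drop
    rbind(
      .SD[-w2],                        # all other rows, unchanged
      copy(.SD[w])[, place := "z"]     # a copy of the b row, relabelled as z
    )
  } else .SD                           # no b, c, d run in this id: keep the group as is
}, by=id], id, seq)                    # re-sort, since rbind puts the z rows last in each group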

This gives:

   id place seq
1:  1     a   1
2:  1     z   2
3:  1     d   4
4:  2     a   1
5:  2     b   2
6:  2     d   3
7:  2     e   4

The positions w, where a run of b, c, d starts, are found with a join against the sequence b, c, d. From there, we drop rows w and w + 1 (the b and c rows) and add back a copy of row w with place := "z", so the new row keeps the other columns of the b row.
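To make the join step easier to follow, here is a minimal sketch of it in isolation, run on the place values of each id from the question. The helper name find_start is hypothetical and is not part of the answer; it only wraps the expression that computes w above.

```r
library(data.table)

# Hypothetical helper: returns the row numbers at which a run b, c, d starts.
find_start <- function(place) {
  # Each row of `windows` holds the current value and the next two values;
  # setDT() names the columns of the unnamed list V1, V2, V3.
  windows <- setDT(shift(place, 0:2, type = "lead"))
  # Join against the single pattern row (b, c, d); which = TRUE returns the
  # matching row numbers, and nomatch = 0 drops the result when nothing matches.
  windows[.("b", "c", "d"), on = .(V1, V2, V3), which = TRUE, nomatch = 0]
}

w1 <- find_start(c("a", "b", "c", "d"))  # id 1: the run starts at row 2
w2 <- find_start(c("a", "b", "d", "e"))  # id 2: no run, so an empty result
print(w1)
print(w2)
```

For id 1 the window starting at row 2 is (b, c, d), so w is 2 and rows 2 and 3 are the ones replaced. For id 2 no window matches, so length(w) is 0 and the group is returned unchanged.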

There are too many directions in which this might be generalized, so it is probably better to post a new question if a more complicated case comes up.
