Get the last row of a previous group in data.table


Question

This is what my data table looks like:

library(data.table)
dt <- fread('
    Product  Group    LastProductOfPriorGroup
    A          1          NA
    B          1          NA
    C          2          B
    D          2          B
    E          2          B
    F          3          E
    G          3          E
')

The LastProductOfPriorGroup column is my desired output. I am trying to fetch the product from the last row of the prior group. The first two rows have no prior group, so the value is NA. In the third row, the last row of the prior group (group 1) has product B. I am trying to accomplish this with

dt[,LastGroupProduct:= shift(Product,1), by=shift(Group,1)]

to no avail.

Recommended answer

You could do

dt[, newcol := shift(dt[, last(Product), by = Group]$V1)[.GRP], by = Group]

This results in the following updated dt, where newcol matches your desired column with the unnecessarily long name. ;)

   Product Group LastProductOfPriorGroup newcol
1:       A     1                      NA     NA
2:       B     1                      NA     NA
3:       C     2                       B      B
4:       D     2                       B      B
5:       E     2                       B      B
6:       F     3                       E      E
7:       G     3                       E      E

Let's break the code down from the inside out. I will use ... to denote the accumulated code:


  • dt[, last(Product), by = Group]$V1 gets the last Product of each group as a character vector, one element per group.
  • shift(...) lags that character vector by one position, so element i holds the last product of group i - 1 and the first element is NA.
  • dt[, newcol := ...[.GRP], by = Group] groups by Group and uses the internal group counter .GRP to index into the shifted vector.
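
To make the three steps concrete, here is a minimal sketch that runs them one at a time on the same toy data. The intermediate names last_per_group and shifted are only illustrative, and the table is rebuilt with data.table() rather than fread() purely for brevity.

```r
library(data.table)

# Same toy data as in the question, without the desired-answer column.
dt <- data.table(Product = c("A", "B", "C", "D", "E", "F", "G"),
                 Group   = c(1, 1, 2, 2, 2, 3, 3))

# Step 1: last Product of each group; the unnamed result column is called V1.
last_per_group <- dt[, last(Product), by = Group]$V1
print(last_per_group)   # "B" "E" "G"

# Step 2: lag by one position, so element i holds the last product of group i - 1.
shifted <- shift(last_per_group)
print(shifted)          # NA "B" "E"

# Step 3: .GRP is the running group counter (1, 2, 3, ...), used here as an index.
dt[, newcol := shifted[.GRP], by = Group]
print(dt)
```

Note that this relies on both grouped calls visiting the groups in the same order, which holds here because by = Group keeps groups in order of first appearance.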

Update: Frank brings up a good point about my code above calculating the shift for every group over and over again. To avoid that, we can use either

shifted <- shift(dt[, last(Product), Group]$V1)
dt[, newcol := shifted[.GRP], by = Group]

so that we don't calculate the shift for every group. Or, we can take Frank's nice suggestion in the comments and do the following.

dt[dt[, last(Product), by = Group][, v := shift(V1)], on="Group", newcol := i.v] 
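
The one-liner above packs two operations together, so it may help to see it unpacked. The sketch below first builds a small per-group lookup table and then performs an update join; the name lookup is hypothetical and introduced only for illustration.

```r
library(data.table)

dt <- data.table(Product = c("A", "B", "C", "D", "E", "F", "G"),
                 Group   = c(1, 1, 2, 2, 2, 3, 3))

# One row per group: V1 is the group's last Product,
# v is the previous group's V1 (NA for the first group).
lookup <- dt[, last(Product), by = Group][, v := shift(V1)]
print(lookup)

# Update join: for every row of dt that matches lookup on Group,
# copy lookup's v into newcol. The i. prefix refers to columns of lookup.
dt[lookup, on = "Group", newcol := i.v]
print(dt)
```

Because the shift is computed once on the small per-group table and the result is attached by key, this variant does not depend on .GRP positions lining up between two separate grouped calls.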
