具有先前结果的列 [英] column with previous result

查看:74
本文介绍了具有先前结果的列的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在使用R

我所拥有的:

      ID_1     ID_2      Date        x_1        y_2     
1      12       3     2011-12-21       15        10     
2      12       13    2011-12-22       50        40     
3      3        12    2011-12-22       20        30     
4      15       13    2011-12-23       30        20     
...
and so on

目标:

      ID_1     ID_2      Date        x_1        y_2     XX_1      YY_2
1      12       3     2011-12-21       15        10      0         0
2      12       13    2011-12-22       50        40      15        0
3      3        12    2011-12-22       20        30      10        50
4      15       13    2011-12-23       30        20      0         40
...
and so on

我想看看在 XX_1 YY_2 中,列 x中的值_1 y_2 对应于以前的 ID_1 的值ID1_2 0 中的ID1_2(如果在该日期之前没有可用的值)。我不知道如何处理 ID_1 ID_2 中可能存在不同值的事实(例如ID 3和ID在示例中为12)。

I want to see in XX_1 and in YY_2 the values from the columns x_1 and y_2 corresponding to the previous values of ID_1 and ID1_2 in or "0" in case of no value is available before that date. I don't know how to handle the fact that different values could be in ID_1 and ID_2 (like IDs 3 and 12 in the example).

@Ekatef
ID1和ID2(即使ID顺序切换,也可以找到整个ID行的匹配项):

@Ekatef ID1 AND ID2 (find match of the whole ID row, even if the order of IDs is switched):

      ID_1     ID_2      Date        x_1        y_2     XX_1      YY_2
1      12       3     2011-12-21       15        10      0         0
2      12       13    2011-12-22       50        40      0         0
3      3        12    2011-12-22       20        30      10        15
4      15       13    2011-12-23       30        20      0         0
5      12       13    2011-12-23       10        5       50        40


推荐答案

OP已请求将 ID 的先前值(如果有)复制到相应的新列中。

The OP has requested to copy the previous value for an ID (if any) to the appropriate new column.

这可以通过同时重塑多列,从宽到长g格式,通过移动/滞后找到先前的值,然后重整为宽格式:

This can solved by reshaping multiple columns simultaneously from wide to long format, finding the previous value by shifting / lagging, and reshaping back to wide format:

library(data.table)
setDT(DF)[, rn := .I]
long <- melt(DF, id.vars = c("rn", "Date"), measure.vars = patterns("^ID", "^x|y"),
             value.name = c("ID", "value"))
long[order(Date), previous := shift(value, fill = 0), by = ID]
dcast(long, rn + Date ~ variable, value.var = c("ID", "value", "previous"))



   rn       Date ID_1 ID_2 value_1 value_2 previous_1 previous_2
1:  1 2011-12-21   12    3      15      10          0          0
2:  2 2011-12-22   12   13      50      40         15          0
3:  3 2011-12-22    3   12      20      30         10         50
4:  4 2011-12-23   15   13      30      20          0         40


Alternativ ely,对 dcast()的最终调用可以在加入时用 update代替

Alternatively, the final call to dcast() can be replaced by an update while joining:

DF[long, on = .(rn), 
   c("XX_1", "YY_2") := .(previous[variable == 1L], previous[variable == 2L])][
     , rn := NULL]
DF



   ID_1 ID_2       Date x_1 y_2 XX_1 YY_2
1:   12    3 2011-12-21  15  10    0    0
2:   12   13 2011-12-22  50  40   15    0
3:    3   12 2011-12-22  20  30   10   50
4:   15   13 2011-12-23  30  20    0   40


可以精确地再现OP的预期结果。

which reproduces exactly OP's expected result.

library(data.table)
DF <- fread(
  "i      ID_1     ID_2      Date        x_1        y_2     
  1      12       3     2011-12-21       15        10     
  2      12       13    2011-12-22       50        40     
  3      3        12    2011-12-22       20        30     
  4      15       13    2011-12-23       30        20  ",
  drop = 1L
)

这篇关于具有先前结果的列的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆