具有先前结果的列 [英] column with previous result
问题描述
我正在使用R
我所拥有的:
ID_1 ID_2 Date x_1 y_2
1 12 3 2011-12-21 15 10
2 12 13 2011-12-22 50 40
3 3 12 2011-12-22 20 30
4 15 13 2011-12-23 30 20
...
and so on
目标:
ID_1 ID_2 Date x_1 y_2 XX_1 YY_2
1 12 3 2011-12-21 15 10 0 0
2 12 13 2011-12-22 50 40 15 0
3 3 12 2011-12-22 20 30 10 50
4 15 13 2011-12-23 30 20 0 40
...
and so on
我想看看在 XX_1
和 YY_2
中,列 x中的值_1
和 y_2
对应于以前的 ID_1
和的值ID1_2
或 0
中的ID1_2(如果在该日期之前没有可用的值)。我不知道如何处理 ID_1
和 ID_2
中可能存在不同值的事实(例如ID 3和ID在示例中为12)。
I want to see in XX_1
and in YY_2
the values from the columns x_1
and y_2
corresponding to the previous values of ID_1
and ID1_2
in or "0"
in case of no value is available before that date. I don't know how to handle the fact that different values could be in ID_1
and ID_2
(like IDs 3 and 12 in the example).
@Ekatef
ID1和ID2(即使ID顺序切换,也可以找到整个ID行的匹配项):
@Ekatef ID1 AND ID2 (find match of the whole ID row, even if the order of IDs is switched):
ID_1 ID_2 Date x_1 y_2 XX_1 YY_2
1 12 3 2011-12-21 15 10 0 0
2 12 13 2011-12-22 50 40 0 0
3 3 12 2011-12-22 20 30 10 15
4 15 13 2011-12-23 30 20 0 0
5 12 13 2011-12-23 10 5 50 40
推荐答案
OP已请求将 ID
的先前值(如果有)复制到相应的新列中。
The OP has requested to copy the previous value for an ID
(if any) to the appropriate new column.
这可以通过同时重塑多列,从宽到长g格式,通过移动/滞后找到先前的值,然后重整为宽格式:
This can solved by reshaping multiple columns simultaneously from wide to long format, finding the previous value by shifting / lagging, and reshaping back to wide format:
library(data.table)
setDT(DF)[, rn := .I]
long <- melt(DF, id.vars = c("rn", "Date"), measure.vars = patterns("^ID", "^x|y"),
value.name = c("ID", "value"))
long[order(Date), previous := shift(value, fill = 0), by = ID]
dcast(long, rn + Date ~ variable, value.var = c("ID", "value", "previous"))
rn Date ID_1 ID_2 value_1 value_2 previous_1 previous_2
1: 1 2011-12-21 12 3 15 10 0 0
2: 2 2011-12-22 12 13 50 40 15 0
3: 3 2011-12-22 3 12 20 30 10 50
4: 4 2011-12-23 15 13 30 20 0 40
Alternativ ely,对 dcast()
的最终调用可以在加入时用 update代替:
Alternatively, the final call to dcast()
can be replaced by an update while joining:
DF[long, on = .(rn),
c("XX_1", "YY_2") := .(previous[variable == 1L], previous[variable == 2L])][
, rn := NULL]
DF
ID_1 ID_2 Date x_1 y_2 XX_1 YY_2
1: 12 3 2011-12-21 15 10 0 0
2: 12 13 2011-12-22 50 40 15 0
3: 3 12 2011-12-22 20 30 10 50
4: 15 13 2011-12-23 30 20 0 40
可以精确地再现OP的预期结果。
which reproduces exactly OP's expected result.
library(data.table)
DF <- fread(
"i ID_1 ID_2 Date x_1 y_2
1 12 3 2011-12-21 15 10
2 12 13 2011-12-22 50 40
3 3 12 2011-12-22 20 30
4 15 13 2011-12-23 30 20 ",
drop = 1L
)
这篇关于具有先前结果的列的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!