在data.table中添加缺少值的行 [英] Adding row for missing value in data.table
问题描述
我的问题是以某种方式与最快的方式来添加行的缺失值在data.frame?但有点更强硬我想。
这里是我的data.table看起来像:
ida idb value date
区别是每个ida都有自己的日期,并且至少有一行在每个日期出现ida,但不一定是所有日期'idb'。我想插入每一个丢失的('ida','idb')对象与对应的日期和0作为一个值。
1:A 2 26600 2004-12-31
2:A 3 19600 2005-03-31
3:B 3 18200 2005-06-30
4:B 4 1230 2005-09-30
5:C 2 8700 2005-12-31
此外,没有周期性
您希望如何处理?
所需输出:
b $ bida idb value date
1:A 2 26600 2004-12-31
1:A 2 0 2005-03-31
2:A 3 19600 2005-03-31
2:A 3 0 2004-12-31
3:B 3 18200 2005-06-30
4:B 3 0 2005- 09-30
5:B 4 1230 2005-09-30
4:B 4 0 2005-06-30
6:C 2 8700 2005-12-31
顺序无关紧要。
解决方案每个<$> c $ c> ida :
setkey(dt,idb,date)
b $ b dt [,.SD [CJ(unique(idb),unique(date))],by = ida] [is.na(value),value:= 0] []
#ida idb value日期
#1:A 2 26600 2004-12-31
#2:A 2 0 2005-03-31
#3:A 3 0 2004-12-31
#4:A 3 19600 2005-03-31
#5:C 2 8700 2005-12-31
#6:B 3 18200 2005-06-30
#7:B 3 0 2005-09-30
#8:B 4 0 2005-06-30
#9:B 4 1230 2005-09-30
My question is somehow related to Fastest way to add rows for missing values in a data.frame? but a bit tougher I think. And I can't figure out how to adapt this solution to my problem.
Here is what my data.table looks like :
ida idb value date 1: A 2 26600 2004-12-31 2: A 3 19600 2005-03-31 3: B 3 18200 2005-06-30 4: B 4 1230 2005-09-30 5: C 2 8700 2005-12-31
The difference is that every 'ida' has his own dates and there is at least one row where 'ida' appears with each date but not necessarily for all 'idb'. I want to insert every missing ('ida','idb') couple missing with the corresponding date and 0 as a value.
Moreover, there is no periodicity for the dates.
How would you do this ?
Desired output :
ida idb value date 1: A 2 26600 2004-12-31 1: A 2 0 2005-03-31 2: A 3 19600 2005-03-31 2: A 3 0 2004-12-31 3: B 3 18200 2005-06-30 4: B 3 0 2005-09-30 5: B 4 1230 2005-09-30 4: B 4 0 2005-06-30 6: C 2 8700 2005-12-31
The order doesn't matter. Every date missing is filled with a 0 value.
解决方案You just do the same thing as in your linked question by each
ida
:setkey(dt, idb, date) dt[, .SD[CJ(unique(idb), unique(date))], by = ida][is.na(value), value := 0][] # ida idb value date #1: A 2 26600 2004-12-31 #2: A 2 0 2005-03-31 #3: A 3 0 2004-12-31 #4: A 3 19600 2005-03-31 #5: C 2 8700 2005-12-31 #6: B 3 18200 2005-06-30 #7: B 3 0 2005-09-30 #8: B 4 0 2005-06-30 #9: B 4 1230 2005-09-30
这篇关于在data.table中添加缺少值的行的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!