r - insert row for missing monthly data and interpolate
Problem description
I have a data frame, shown below, with 5000+ rows. I am trying to insert a row wherever a month is missing (e.g. month 6 of 2003 below) and then use linear interpolation to calculate the 'TWS' value. Ideally the decimal date (DecimDate) would be filled in appropriately too, but I can sort that out afterwards if not. The data frame covers months 1:12 for 10 years (2003-2012), and this sequence repeats for multiple grid squares.
I have found lots of other similar questions, but none relating to a repeating 1:12 monthly sequence.
> head(ts.data,20)
GridNo GridIndex Lon Lat DecimDate Year Month TWS
1 GR72 72 35.5 -4.5 2003.000 2003 01 14.2566781
2 GR72 72 35.5 -4.5 2003.083 2003 02 5.0413706
3 GR72 72 35.5 -4.5 2003.167 2003 03 3.8192721
4 GR72 72 35.5 -4.5 2003.250 2003 04 5.8706026
5 GR72 72 35.5 -4.5 2003.333 2003 05 7.8461188
6 GR72 72 35.5 -4.5 2003.500 2003 07 2.3821844
7 GR72 72 35.5 -4.5 2003.583 2003 08 0.1995629
8 GR72 72 35.5 -4.5 2003.667 2003 09 -1.8353604
9 GR72 72 35.5 -4.5 2003.750 2003 10 -2.0410653
10 GR72 72 35.5 -4.5 2003.833 2003 11 -1.4029813
11 GR72 72 35.5 -4.5 2003.917 2003 12 -0.2206872
12 GR72 72 35.5 -4.5 2004.000 2004 01 -0.5090872
13 GR72 72 35.5 -4.5 2004.083 2004 02 -0.4887118
14 GR72 72 35.5 -4.5 2004.167 2004 03 -0.7725966
15 GR72 72 35.5 -4.5 2004.250 2004 04 4.1831581
16 GR72 72 35.5 -4.5 2004.333 2004 05 2.5651040
17 GR72 72 35.5 -4.5 2004.417 2004 06 -2.2511409
18 GR72 72 35.5 -4.5 2004.500 2004 07 -1.6484375
19 GR72 72 35.5 -4.5 2004.583 2004 08 -4.6508982
20 GR72 72 35.5 -4.5 2004.667 2004 09 -5.0053745
Any help appreciated!
Recommended answer
Using the data.table and zoo packages, you can easily expand your data set and interpolate, as long as you don't have NAs at either end of the series (na.approx does not extrapolate by default, so leading or trailing NAs are left as NA).
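To illustrate the caveat about NAs at the ends, here is a minimal sketch on a made-up vector (the values are hypothetical, not from the question's data). It also shows one possible workaround, assuming that carrying the nearest observed value outwards is acceptable for your data: `rule = 2` is passed through `na.approx` to `approx`.

```r
library(zoo)

# Hypothetical toy vector with gaps at the start, in the middle, and at the end
x <- c(NA, 1, NA, 3, NA)

# Only the interior gap is interpolated; the leading and trailing NAs remain
inner_only <- na.approx(x, na.rm = FALSE)

# rule = 2 (an approx() argument) fills the ends with the nearest observed value
extended <- na.approx(x, na.rm = FALSE, rule = 2)
```

Note that `rule = 2` repeats the edge value rather than extrapolating a trend, so whether it is appropriate depends on the series.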
Expand the data set
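library(data.table)
library(zoo)
# df is the data frame from the question (ts.data there).
# For each Year, pick the rows for months 1:12; a missing month yields a row of NAs.
res <- setDT(df)[, .SD[match(1:12, Month)], by = Year]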
Interpolate on whatever columns you want
cols <- c("Month", "DecimDate", "TWS")
res[, (cols) := lapply(.SD, na.approx, na.rm = FALSE), .SDcols = cols]
res
# Year GridNo GridIndex Lon Lat DecimDate Month TWS
# 1: 2003 GR72 72 35.5 -4.5 2003.000 1 14.2566781
# 2: 2003 GR72 72 35.5 -4.5 2003.083 2 5.0413706
# 3: 2003 GR72 72 35.5 -4.5 2003.167 3 3.8192721
# 4: 2003 GR72 72 35.5 -4.5 2003.250 4 5.8706026
# 5: 2003 GR72 72 35.5 -4.5 2003.333 5 7.8461188
# 6: 2003 NA NA NA NA 2003.417 6 5.1141516
# 7: 2003 GR72 72 35.5 -4.5 2003.500 7 2.3821844
# 8: 2003 GR72 72 35.5 -4.5 2003.583 8 0.1995629
# 9: 2003 GR72 72 35.5 -4.5 2003.667 9 -1.8353604
# 10: 2003 GR72 72 35.5 -4.5 2003.750 10 -2.0410653
# 11: 2003 GR72 72 35.5 -4.5 2003.833 11 -1.4029813
# 12: 2003 GR72 72 35.5 -4.5 2003.917 12 -0.2206872
# 13: 2004 GR72 72 35.5 -4.5 2004.000 1 -0.5090872
# 14: 2004 GR72 72 35.5 -4.5 2004.083 2 -0.4887118
# 15: 2004 GR72 72 35.5 -4.5 2004.167 3 -0.7725966
# 16: 2004 GR72 72 35.5 -4.5 2004.250 4 4.1831581
# 17: 2004 GR72 72 35.5 -4.5 2004.333 5 2.5651040
# 18: 2004 GR72 72 35.5 -4.5 2004.417 6 -2.2511409
# 19: 2004 GR72 72 35.5 -4.5 2004.500 7 -1.6484375
# 20: 2004 GR72 72 35.5 -4.5 2004.583 8 -4.6508982
# 21: 2004 GR72 72 35.5 -4.5 2004.667 9 -5.0053745
# 22: 2004 NA NA NA NA NA NA NA
# 23: 2004 NA NA NA NA NA NA NA
# 24: 2004 NA NA NA NA NA NA NA
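The question mentions that the 1:12 sequence repeats for multiple grid squares, while the output above leaves GridNo as NA on the inserted row. One possible variant, sketched here on hypothetical toy data, builds every GridNo/Year/Month combination with `CJ`, joins the observations onto it, and interpolates within each grid square. It assumes Month and Year are stored as integers (a character Month such as "01" would need `as.integer()` first) and that DecimDate follows Year + (Month - 1) / 12, which matches the values shown in the question.

```r
library(data.table)
library(zoo)

# Hypothetical toy data: two grid squares, one year, one missing month in each
months72 <- setdiff(1:12, 6)   # GR72 lacks month 6
months73 <- setdiff(1:12, 3)   # GR73 lacks month 3
toy <- data.table(
  GridNo = rep(c("GR72", "GR73"), times = c(length(months72), length(months73))),
  Year   = 2003L,
  Month  = c(months72, months73),
  TWS    = c(months72 * 2, months73 * 10)
)

# Every combination of grid square, year, and month (sorted by CJ)
full <- CJ(GridNo = unique(toy$GridNo), Year = unique(toy$Year), Month = 1:12)

# Join the observations onto the complete grid; missing months get TWS = NA
res <- toy[full, on = .(GridNo, Year, Month)]

# Interpolate within each grid square so values never leak across squares
res[, TWS := na.approx(TWS, na.rm = FALSE), by = GridNo]

# Recompute the decimal date from Year and Month (assumed convention)
res[, DecimDate := Year + (Month - 1) / 12]
```

Because the identifier columns come from the complete grid, the inserted rows keep their GridNo; other per-square columns such as Lon and Lat could be filled the same way or merged back from a lookup table.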