Calculation of anomalies on time-series
Problem description
I'd like to calculate monthly temperature anomalies on a time series with several stations. By "anomaly" I mean the difference between a single value and a mean calculated over a reference period.
My data frame looks like this (let's call it "data"):
Station Year Month Temp
A 1950 1 15.6
A 1980 1 12.3
A 1990 2 11.4
A 1950 1 15.6
B 1970 1 12.3
B 1977 2 11.4
B 1977 4 18.6
B 1980 1 12.3
B 1990 11 7.4
First, I made a subset containing the years from 1980 to 1990:
data2 <- subset(data, Year >= 1980 & Year <= 1990)
Second, I used plyr to calculate the monthly mean between 1980 and 1990 for each station (let's call this "MeanBase"):
library(plyr)
data3 <- ddply(data2, .(Station, Month), summarise,
               MeanBase = mean(Temp, na.rm = TRUE))
Now, for each row of data, I'd like to calculate the difference between the value of Temp and the corresponding MeanBase, but I'm not sure I'm going about it the right way (I don't see how to use data3).
Recommended answer
You can use ave in base R to get this.
transform(data,
          Demeaned = Temp - ave(replace(Temp, Year < 1980 | Year > 1990, NA),
                                Station, Month,
                                FUN = function(t) mean(t, na.rm = TRUE)))
# Station Year Month Temp Demeaned
# 1 A 1950 1 15.6 3.3
# 2 A 1980 1 12.3 0.0
# 3 A 1990 2 11.4 0.0
# 4 A 1950 1 15.6 3.3
# 5 B 1970 1 12.3 0.0
# 6 B 1977 2 11.4 NaN
# 7 B 1977 4 18.6 NaN
# 8 B 1980 1 12.3 0.0
# 9 B 1990 11 7.4 0.0
The result column will contain NaN for Month-Station combinations that have no years in your specified range, because the mean over an empty set of values is NaN.
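Since the question also asks how data3 could be used, here is a minimal sketch of the join-based route, assuming the example data from the question. To keep it self-contained it rebuilds the baseline table with base R's aggregate instead of plyr; the names base, merged and Anomaly are illustrative and do not come from the original thread.

```r
# Example data from the question
data <- data.frame(
  Station = c("A", "A", "A", "A", "B", "B", "B", "B", "B"),
  Year    = c(1950, 1980, 1990, 1950, 1970, 1977, 1977, 1980, 1990),
  Month   = c(1, 1, 2, 1, 1, 2, 4, 1, 11),
  Temp    = c(15.6, 12.3, 11.4, 15.6, 12.3, 11.4, 18.6, 12.3, 7.4)
)

# Baseline mean per Station/Month over 1980-1990
# (a base-R stand-in for the question's data3)
base <- aggregate(Temp ~ Station + Month,
                  data = subset(data, Year >= 1980 & Year <= 1990),
                  FUN = mean)
names(base)[names(base) == "Temp"] <- "MeanBase"

# Left join: all.x = TRUE keeps every row of data; Station/Month
# combinations with no baseline years get MeanBase = NA
merged <- merge(data, base, by = c("Station", "Month"), all.x = TRUE)
merged$Anomaly <- merged$Temp - merged$MeanBase

# merge() reorders rows by the join keys, so sort again if order matters
merged <- merged[order(merged$Station, merged$Year, merged$Month), ]
```

With this approach the combinations that have no baseline years come out as NA rather than NaN, and the row order differs from the original data frame, which is one reason the ave version above is more convenient when the original row order should be preserved.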