Calculating mean for every second value in a dataframe
Problem description
I would like to average every two consecutive values and continue the same way down a column of the data frame. To be more precise, see the following extract:

   X   Y            Z
1  FRI 200101010000 -6.72
2  FRI 200101010030 -6.30
3  FRI 200101010100 -6.26
4  FRI 200101010130 -5.82
5  FRI 200101010200 -5.64
6  FRI 200101010230 -5.29
7  FRI 200101010300 -5.82
8  FRI 200101010330 -5.83
9  FRI 200101010400 -5.83
10 FRI 200101010430 -6.04
11 FRI 200101010500 -5.80
12 FRI 200101010530 -6.09

I would like to calculate the mean of Z for each pair of Y values ending in 00 and 30, that is, the mean of rows 1+2, rows 3+4, rows 5+6 and so on. Here is what I expect:

  X   Y            Z
1 FRI 200101010100 -6.51
2 FRI 200101010200 -6.04
3 FRI 200101010300 -5.47
...

Explanation: Y is a timestamp in the format YYYYMMDDhhmm, and I would like to average the 30-minute measurements into 1-hour measurements.
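Read literally, the request is to average rows 1+2, 3+4, 5+6 and so on. A minimal base R sketch of that positional pairing follows. The data frame `df` is rebuilt from the extract above, storing Y as character is an assumption, and the names `pair` and `pair_means` are only illustrative.

```r
# Example data copied from the question; Y as character is an assumption.
df <- data.frame(
  X = "FRI",
  Y = c("200101010000", "200101010030", "200101010100", "200101010130",
        "200101010200", "200101010230", "200101010300", "200101010330",
        "200101010400", "200101010430", "200101010500", "200101010530"),
  Z = c(-6.72, -6.30, -6.26, -5.82, -5.64, -5.29,
        -5.82, -5.83, -5.83, -6.04, -5.80, -6.09),
  stringsAsFactors = FALSE
)

# Pair rows by position: rows 1-2 -> group 1, rows 3-4 -> group 2, ...
pair <- (seq_len(nrow(df)) - 1) %/% 2 + 1

# Mean of Z within each pair of rows
pair_means <- tapply(df$Z, pair, mean)
print(pair_means)
```

This only gives hourly means if the rows are sorted by time and no half-hour reading is missing; grouping on the timestamp itself avoids depending on row position.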
Solution
Here's a possible data.table solution:

    library(data.table)
    setDT(df)[, .(Y = Y[1L], Z = mean(Z)), by = .(X, indx = cumsum(substr(Y, 11, 12) == '00'))]
    #      X indx            Y      Z
    # 1: FRI    1 200101010000 -6.510
    # 2: FRI    2 200101010100 -6.040
    # 3: FRI    3 200101010200 -5.465
    # 4: FRI    4 200101010300 -5.825
    # 5: FRI    5 200101010400 -5.935
    # 6: FRI    6 200101010500 -5.945
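The grouping index is the key step: characters 11 and 12 of Y are the minutes, so each "00" marks the start of a new hour, and `cumsum()` turns those markers into a running group number. A small base R sketch on the first four timestamps shows this; the names `is_hour_start` and `indx` here are just for illustration.

```r
# First four timestamps from the question (YYYYMMDDhhmm)
Y <- c("200101010000", "200101010030", "200101010100", "200101010130")

# TRUE where the minutes part (characters 11-12) is "00"
is_hour_start <- substr(Y, 11, 12) == "00"
print(is_hour_start)
# TRUE FALSE TRUE FALSE

# Running count of hour starts: rows in the same hour share an index
indx <- cumsum(is_hour_start)
print(indx)
# 1 1 2 2
```

This assumes the rows are sorted by time and that every hour begins with a reading at minute 00; a missing :00 reading would merge that hour's :30 value into the previous group.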
Or, per @akrun's comment, using aggregate from base R (though the output will probably need some additional tweaking):

    aggregate(Z ~ X + indx, transform(df, indx = cumsum(substr(Y, 11, 12) == '00')), mean)
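The answer does not say what tweaking it has in mind. One guess is that the aggregate output lacks the Y column, so the sketch below adds the first timestamp of each group back with a second `aggregate()` call and `merge()`. The data frame is rebuilt from the question (Y as character is an assumption), and `df2`, `means`, `first_y` and `res` are hypothetical names.

```r
# Example data copied from the question; Y as character is an assumption.
df <- data.frame(
  X = "FRI",
  Y = c("200101010000", "200101010030", "200101010100", "200101010130",
        "200101010200", "200101010230", "200101010300", "200101010330",
        "200101010400", "200101010430", "200101010500", "200101010530"),
  Z = c(-6.72, -6.30, -6.26, -5.82, -5.64, -5.29,
        -5.82, -5.83, -5.83, -6.04, -5.80, -6.09),
  stringsAsFactors = FALSE
)

# Same hourly grouping index as in the answer
df2 <- transform(df, indx = cumsum(substr(Y, 11, 12) == "00"))

# Hourly means of Z, as in the answer
means <- aggregate(Z ~ X + indx, df2, mean)

# First timestamp of each group, using the non-formula interface
first_y <- aggregate(df2["Y"], by = df2[c("X", "indx")], FUN = function(y) y[1])

# Combine both results and restore numeric group order
res <- merge(first_y, means, by = c("X", "indx"))
res <- res[order(res$X, res$indx), ]
print(res)
```

The explicit `order()` call is there because `merge()` on several key columns may sort group numbers as text once there are ten or more groups.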