Calculating mean for every second value in a dataframe

Problem description

I would like to aggregate every two consecutive cell values by their mean and continue the same process down the column of the data frame. To be more precise, see the following data frame extract:

    X         Y             Z
1   FRI 200101010000    -6.72
2   FRI 200101010030    -6.30
3   FRI 200101010100    -6.26
4   FRI 200101010130    -5.82
5   FRI 200101010200    -5.64
6   FRI 200101010230    -5.29
7   FRI 200101010300    -5.82
8   FRI 200101010330    -5.83
9   FRI 200101010400    -5.83
10  FRI 200101010430    -6.04
11  FRI 200101010500    -5.80
12  FRI 200101010530    -6.09

I would like to calculate the mean of Z for each pair of Y values ending in 00 and 30, that is, the mean of rows 1+2, rows 3+4, rows 5+6, and so on. This is what I expect:

    X         Y             Z
1   FRI 200101010100    -6.51
2   FRI 200101010200    -6.04
3   FRI 200101010300    -5.47
...

Explanation: Y is a timestamp in YYYYMMDDhhmm format, and I would like to average the 30-minute measurements into 1-hour measurements.

Solution

Here's a possible data.table solution:

library(data.table)
setDT(df)[, .(Y = Y[1L], Z = mean(Z)), by = .(X, indx = cumsum(substr(Y, 11, 12) == '00'))]
#      X indx            Y      Z
# 1: FRI    1 200101010000 -6.510
# 2: FRI    2 200101010100 -6.040
# 3: FRI    3 200101010200 -5.465
# 4: FRI    4 200101010300 -5.825
# 5: FRI    5 200101010400 -5.935
# 6: FRI    6 200101010500 -5.945
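
To see why the grouping works, it may help to look at the index on its own. The following is a minimal base R sketch, assuming Y is stored as character and that every hour starts with a row whose minutes are 00; the names starts_hour and hourly_mean are illustrative and not part of the original answer.

```r
# Sample data from the question; Y is assumed to be stored as character
df <- data.frame(
  X = "FRI",
  Y = c("200101010000", "200101010030", "200101010100", "200101010130",
        "200101010200", "200101010230", "200101010300", "200101010330",
        "200101010400", "200101010430", "200101010500", "200101010530"),
  Z = c(-6.72, -6.30, -6.26, -5.82, -5.64, -5.29,
        -5.82, -5.83, -5.83, -6.04, -5.80, -6.09),
  stringsAsFactors = FALSE
)

# TRUE wherever the minute part (characters 11-12) is "00",
# i.e. on the first row of each hour
starts_hour <- substr(df$Y, 11, 12) == "00"

# The running sum goes up by one at every hour start, so the
# 00 row and the following 30 row end up with the same index
indx <- cumsum(starts_hour)
indx
# [1] 1 1 2 2 3 3 4 4 5 5 6 6

# Mean of Z within each index value (hypothetical helper name)
hourly_mean <- tapply(df$Z, indx, mean)
```

If a 00 row were missing for some hour, the 30 row of that hour would be counted with the previous group, so this index is only reliable for complete half-hourly data.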

Or, per @akrun's comment, using aggregate from base R (though the output will probably need some additional tweaking):

aggregate(Z ~ X + indx, transform(df, indx = cumsum(substr(Y, 11, 12) == '00')), mean)
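
One possible way to do that tweaking is sketched below, under the same assumptions as before (Y stored as character, complete half-hourly data). It attaches the first Y of each group to the aggregate result and drops the helper index. The names df2, means, firsts and out are illustrative only.

```r
# Sample data from the question; Y is assumed to be stored as character
df <- data.frame(
  X = "FRI",
  Y = c("200101010000", "200101010030", "200101010100", "200101010130",
        "200101010200", "200101010230", "200101010300", "200101010330",
        "200101010400", "200101010430", "200101010500", "200101010530"),
  Z = c(-6.72, -6.30, -6.26, -5.82, -5.64, -5.29,
        -5.82, -5.83, -5.83, -6.04, -5.80, -6.09),
  stringsAsFactors = FALSE
)

# Add the grouping index used in the answer
df2 <- transform(df, indx = cumsum(substr(Y, 11, 12) == "00"))

# Hourly means per X and index (columns: X, indx, Z)
means <- aggregate(Z ~ X + indx, df2, mean)

# First row of each (X, indx) group, which carries the group's first Y
firsts <- df2[!duplicated(df2[c("X", "indx")]), c("X", "indx", "Y")]

# Join the two pieces, order by group, and drop the helper index
out <- merge(firsts, means, by = c("X", "indx"))
out <- out[order(out$X, out$indx), c("X", "Y", "Z")]
out
```

The result has one row per hour with columns X, Y and Z, where Y is the first timestamp of each hour, matching the labelling used in the data.table output.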
