Calculate the mean of every 13 rows in a data frame
Question
I have a data frame df with 2 columns and 3659 rows.
I am trying to reduce the data set by averaging every 10 or 13 rows in this data frame, so I tried the following:
# number of rows per group
n=13
# number of groups
n_grp=nrow(df)/n
round(n_grp,0)
# row indices (one vector per group)
idx_grp <- split(seq(df), rep(seq(n_grp), each = n))
# calculate the col means for all groups
res <- lapply(idx_grp, function(i) {
# subset of the data frame
tmp <- dat[i]
# calculate row means
colMeans(tmp, na.rm = TRUE)
})
# transform list into a data frame
dat2 <- as.data.frame(res)
However, I can't divide my number of rows by 10 or 13, because the data length is not a multiple of the split variable. So I am not sure what to do (I would just like to calculate the mean of the last group as well, even if it has fewer than 10 elements).
I also tried this one, but the results are the same:
df1=split(df, sample(rep(1:301, 10)))
Recommended answer
Here's a solution using aggregate() and rep().
df <- data.frame(a=1:12, b=13:24 );
df;
## a b
## 1 1 13
## 2 2 14
## 3 3 15
## 4 4 16
## 5 5 17
## 6 6 18
## 7 7 19
## 8 8 20
## 9 9 21
## 10 10 22
## 11 11 23
## 12 12 24
n <- 5
aggregate(df, list(rep(1:(nrow(df) %/% n + 1), each = n, len = nrow(df))), mean)[-1]
## a b
## 1 3.0 15.0
## 2 8.0 20.0
## 3 11.5 23.5
The important part of this solution, the part that handles nrow(df) not being divisible by n, is specifying the len parameter of rep() (the full parameter name is length.out), which automatically truncates the group vector to the appropriate length.
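As a rough sketch of how this applies to the dimensions in the question, the snippet below builds the group vector for 3659 rows with n = 13. The data is a hypothetical stand-in (random numbers in columns named a and b), and the grp_alt line is an alternative way to produce the same labels that is not part of the answer above.

```r
# Hypothetical stand-in for the asker's data: 3659 rows, 2 numeric columns.
set.seed(1)
df <- data.frame(a = rnorm(3659), b = rnorm(3659))

n <- 13  # rows per group

# Group labels 1,1,...,2,2,...: each label is repeated n times, and
# length.out truncates the vector to exactly nrow(df) elements.
grp <- rep(seq_len(nrow(df) %/% n + 1), each = n, length.out = nrow(df))

# The same labels computed with integer division (an alternative,
# not taken from the answer).
grp_alt <- (seq_len(nrow(df)) - 1) %/% n + 1

# Column means per group; [-1] drops the grouping column.
res <- aggregate(df, list(group = grp), mean)[-1]

nrow(res)             # 282 groups: 281 full groups plus one partial group
sum(grp == max(grp))  # 6 rows in the last group (3659 - 281 * 13)
```

The last row of res is then the mean of the 6 leftover rows, which is the behaviour the question asks for. With n <- 10 the same code should give 366 groups, the last one containing 9 rows.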