Creating the mean average of every nth object in a specific column of a dataframe
Problem description
I am trying to average every n-th element of a specific column in a data frame using the code below. I understand that a for loop is computationally inefficient in R, so I would like to ask whether there is a more efficient way to compute the mean of every n-th row. My data looks roughly like this:
set.seed(6218)
n <- 8760
s1 <- sample(30000:70000, n)
s2 <- sample(0:10000, n)
inDf <- cbind(s1, s2)
I call h_average like this: h_average(inDf, 24, 1, 1). This means that I average the first point of every 24-point subset, i.e. points 1, 25, 49, 73, and so on. I also only do this for the first column.
Thanks in advance, BenR
#' h_average
#'
#' Computing the average of every first, second, third, ... hour of the day/week
#'
#' @param data merged data
#' @param tstep hour-step representing the number of hours for a day/week
#' @param h hour, which should be averaged. Should be between 1 - 24/1 - 168.
#' @param x column number
#' @return mean average of the specific hour
h_average <- function(data, tstep, h, x) {
  sum_1 <- 0
  sum_2 <- 0
  mean <- 0
  for (i in seq(h, nrow(data), tstep)) {
    if (data[i, x]) {
      sum_1 <- sum_1 + 1
      sum_2 <- sum_2 + data[i, x]
    }
  }
  mean <- sum_2 / sum_1
  return(mean)
}
Recommended answer
Just use a combination of rowMeans and subsetting. So something like:
n = 5
rowMeans(data[seq(1, nrow(data), n),])
Alternatively, you could use apply:
## rowMeans is better, but
## if you wanted to calculate the median (say)
## Just change mean to median below
apply(data[seq(1, nrow(data), n),], 1, mean)
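Note that rowMeans on the subset returns one mean per selected row, taken across the columns. If the goal is instead the single number that h_average(inDf, 24, 1, 1) returns, i.e. the mean of one column over rows 1, 25, 49, ..., the same seq-based subsetting can be passed straight to mean. The following is only a sketch: h_average_vec is a hypothetical name, and the vals != 0 filter is there on the assumption that you want to keep the behaviour of the original if (data[i, x]) check, which skips zeros.

```r
# Sample data as in the question
set.seed(6218)
n <- 8760
s1 <- sample(30000:70000, n)
s2 <- sample(0:10000, n)
inDf <- cbind(s1, s2)

# Hypothetical vectorized counterpart of h_average (not from the original post)
h_average_vec <- function(data, tstep, h, x) {
  idx <- seq(h, nrow(data), by = tstep)  # rows h, h + tstep, h + 2 * tstep, ...
  vals <- data[idx, x]
  # Assumption: drop zeros, mirroring the if (data[i, x]) test in the loop
  mean(vals[vals != 0])
}

first_hour <- h_average_vec(inDf, 24, 1, 1)

# All 24 hourly means of column 1 at once: reshape so each column is one day.
# Assumes nrow(inDf) is a multiple of 24; zeros are NOT dropped here.
hourly <- rowMeans(matrix(inDf[, 1], nrow = 24))
```

Here hourly[h] is the mean for hour h, so hourly[1] agrees with first_hour for this data because column 1 contains no zeros.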