R: search along a vector and calculate the mean

Problem description

My data looks like this:

require(data.table)
DT <- data.table(x=c(19,19,19,21,21,19,19,22,22,22),
             y=c(53,54,55,32,44,45,49,56,57,58))

I would like to search along x and calculate the mean of y. However, when using

DT[, .(my=mean(y)), by=.(x)]

I get the overall mean for each distinct value of x, so the two separate runs of 19 are pooled together. Instead, I would like to search along x and calculate a new mean each time x changes. For the provided example, the output would be:

DTans <- data.table(x=c(19,21,19,22),
             my=c(54,38,47,57))

Recommended answer

We could use rleid to create another grouping variable ('indx') that increases each time 'x' changes, get the mean of 'y' by 'indx' and 'x', and then assign 'indx' to NULL to drop it from the result.

library(data.table) # v 1.9.5+
DT[, .(my = mean(y)), by = .(indx = rleid(x), x)][, indx := NULL]
#    x my
#1: 19 54
#2: 21 38
#3: 19 47
#4: 22 57
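
To see why this works, it can help to look at the run ids themselves. The following is a minimal sketch on the example data from the question; the column name `run` is chosen here only for illustration and is not part of the original answer.

```r
library(data.table)

DT <- data.table(x = c(19, 19, 19, 21, 21, 19, 19, 22, 22, 22),
                 y = c(53, 54, 55, 32, 44, 45, 49, 56, 57, 58))

# rleid() increments its id every time the value of x changes,
# so the two separate runs of 19 get different ids (1 and 3).
run <- rleid(DT$x)
run
# [1] 1 1 1 2 2 3 3 4 4 4

# Grouping on the run id (plus x, to keep x in the output) gives one mean per run.
res <- DT[, .(my = mean(y)), by = .(run = rleid(x), x)]
res$my
# [1] 54 38 47 57
```

Grouping by x alone would merge ids 1 and 3 into a single group, which is the behavior the question is trying to avoid.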

Benchmarks

set.seed(24)
foo <- function(x) sample(x, 1e7L, replace = TRUE)
DT  <- data.table(x = foo(100L), y = foo(10000L))

josilber <- function() {
    new.group <- c(1, diff(DT$x) != 0)
    res <- data.table(x = DT$x[new.group == 1], 
              my = tapply(DT$y, cumsum(new.group), mean))
}

Roland <- function() {
    DT[, .(my = mean(y), x = x[1]), by = cumsum(c(1, diff(x) != 0))]
}

akrun <- function() { 
    DT[, .(my = mean(y)), by = .(indx = rleid(x), x)][,indx := NULL]
}

bgoldst <- function() {
    with(rle(DT$x), data.frame(x = values, 
       my = tapply(DT$y, rep(1:length(lengths), lengths), mean)))
}

system.time(josilber())
#   user  system elapsed 
#159.405   1.759 161.110 

system.time(bgoldst())
#   user  system elapsed 
#162.628   0.782 163.380 

system.time(Roland())
#   user  system elapsed 
# 18.633   0.052  18.678 

system.time(akrun())
#   user  system elapsed 
# 1.242   0.003   1.246 
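
The benchmarked functions build the run ids in different ways: `cumsum(c(1, diff(x) != 0))`, `rle()`, or `rleid()`. As a small sanity check that is not part of the original answer, the sketch below uses the example data from the question to confirm that the two base R constructions produce the same grouping and the expected means. The names `grp_diff` and `grp_rle` are illustrative.

```r
x <- c(19, 19, 19, 21, 21, 19, 19, 22, 22, 22)
y <- c(53, 54, 55, 32, 44, 45, 49, 56, 57, 58)

# Run ids via diff(): start at 1, add 1 wherever x differs from its predecessor.
grp_diff <- cumsum(c(1, diff(x) != 0))

# The same ids via rle(): repeat each run index by that run's length.
r <- rle(x)
grp_rle <- rep(seq_along(r$lengths), r$lengths)

# Both constructions should label the rows identically.
stopifnot(all(grp_diff == grp_rle))

# Mean of y within each run, paired with the value of x for that run.
my <- as.vector(tapply(y, grp_diff, mean))
ans <- data.frame(x = r$values, my = my)
ans
```

Because all of the approaches compute the same groups, the timing differences above come from how the grouping and aggregation are carried out rather than from different results.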
