Cumulative Calculations (e.g. cumulative correlation) with data.table in R

Question

In R, I have a data.table with two measurements red and green and would like to calculate their cumulative correlation.

library(data.table)
DT <- data.table(red   = c(1, 2, 3, 4, 5,  6.5, 7.6, 8.7),
                 green = c(2, 4, 6, 8, 10, 12,  14,  16),
                 id    = 1:8)

How can I get the following output within one data.table command?

...
> DT[1:5, cor(red, green)]
[1] 1                     # should go into row 5
> DT[1:6, cor(red, green)]
[1] 0.9970501             # should go into row 6, and so on ...
> DT[1:7, cor(red, green)]
[1] 0.9976889

Edit:
I am aware that it can be solved by looping, but my data.table has about 1 million rows grouped into smaller chunks, so looping is rather slow and I thought there might be some other possibility.
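For reference, the looping approach mentioned above might look like the following minimal sketch. It uses plain vectors copied from the question's `DT`, and `naive_cum_cor` is a hypothetical helper name, not something from the original post. Each call to `cor` rescans the whole prefix, so the total work grows roughly quadratically with the number of rows, which is why it becomes slow on large tables.

```r
# Toy vectors copied from the question's DT
red   <- c(1, 2, 3, 4, 5, 6.5, 7.6, 8.7)
green <- c(2, 4, 6, 8, 10, 12, 14, 16)

# Hypothetical helper: correlation of the first k observations, for every k.
naive_cum_cor <- function(x, y) {
  vapply(seq_along(x), function(k) {
    if (k < 2) return(NA_real_)  # correlation is undefined for a single point
    cor(x[1:k], y[1:k])          # recomputed from scratch on each prefix
  }, numeric(1))
}

baseline <- naive_cum_cor(red, green)
```

A loop like this is still useful as a slow but simple reference when checking a faster implementation on small inputs.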

Recommended answer

Building on my answer to the similar question here for cumulative variances, you can find cumulative covariances as

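library(dplyr) # for cummean

# Cumulative sample covariance of x and y over the first n observations
cum_cov <- function(x, y) {
  n <- seq_along(x)
  res <- cumsum(x * y) - cummean(x) * cumsum(y) - cummean(y) * cumsum(x) +
    n * cummean(x) * cummean(y)
  res / (n - 1) # n = 1 gives 0/0, i.e. NaN
}

# Cumulative sample variance (copy-pasted from the previous answer)
cum_var <- function(x) {
  n <- seq_along(x)
  (cumsum(x^2) - n * cummean(x)^2) / (n - 1)
}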
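To see where these expressions come from, the sums of centered products can be expanded as follows. The notation is an assumption introduced here for illustration: $\bar{x}_n$ and $\bar{y}_n$ denote the means of the first $n$ observations (what `cummean` returns at position $n$).

```latex
\begin{aligned}
\sum_{i=1}^{n} (x_i - \bar{x}_n)(y_i - \bar{y}_n)
  &= \sum_{i=1}^{n} x_i y_i
   - \bar{x}_n \sum_{i=1}^{n} y_i
   - \bar{y}_n \sum_{i=1}^{n} x_i
   + n\,\bar{x}_n \bar{y}_n, \\
\sum_{i=1}^{n} (x_i - \bar{x}_n)^2
  &= \sum_{i=1}^{n} x_i^2 - n\,\bar{x}_n^{2}.
\end{aligned}
```

Dividing each right-hand side by $n - 1$ gives the sample covariance and sample variance of the first $n$ observations. Every term is a cumulative sum or a cumulative mean, so all $n$ values can be produced in a single vectorised pass. For $n = 1$ both numerator and denominator are zero, so the first value is undefined.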

The cumulative correlation is then

cum_cor <- function(x, y) cum_cov(x, y)/sqrt(cum_var(x)*cum_var(y))
DT[, cumcor := cum_cor(red, green)]
DT
   red green id    cumcor
1: 1.0     2  1       NaN
2: 2.0     4  2 1.0000000
3: 3.0     6  3 1.0000000
4: 4.0     8  4 1.0000000
5: 5.0    10  5 1.0000000
6: 6.5    12  6 0.9970501
7: 7.6    14  7 0.9976889
8: 8.7    16  8 0.9983762
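Since the question mentions rows grouped into smaller chunks, the same idea can probably be applied per group with data.table's `by`. The sketch below is an illustration under stated assumptions: the `grp` column and the helper `cum_cor_compact` are hypothetical, and the helper is a compact base-R restatement using only `cumsum` (so it does not need dplyr), not the answer's own code.

```r
library(data.table)

# Hypothetical compact variant: cumulative Pearson correlation from running sums.
# r_n = (n*Sxy - Sx*Sy) / sqrt((n*Sxx - Sx^2) * (n*Syy - Sy^2))
cum_cor_compact <- function(x, y) {
  n   <- seq_along(x)
  sx  <- cumsum(x)
  sy  <- cumsum(y)
  num <- n * cumsum(x * y) - sx * sy
  den <- sqrt((n * cumsum(x^2) - sx^2) * (n * cumsum(y^2) - sy^2))
  num / den  # first element is 0/0, i.e. NaN
}

# Toy data from the question plus an assumed grouping column `grp`
DT <- data.table(red   = c(1, 2, 3, 4, 5, 6.5, 7.6, 8.7),
                 green = c(2, 4, 6, 8, 10, 12, 14, 16),
                 grp   = rep(c("a", "b"), each = 4))

# The cumulative correlation restarts at the first row of each group
DT[, cumcor := cum_cor_compact(red, green), by = grp]
```

With `by = grp`, the first row of every group is `NaN`, just as the first row is in the ungrouped output.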

I hope that is fast enough:

x <- rnorm(1e6)
y <- rnorm(1e6)+x
system.time(cum_cor(x, y))
#   user  system elapsed 
#  0.319   0.020   0.339 
