R Cumulative Sum with a condition and a reset

Question

I have a signal position indicator vector consisting of -1s and 1s. In addition, I have volume data which I want to sum based on the value of Signal. The basic data table looks like this:

df <- cbind(Signal, Volume)
head(df, 20)

           Signal    Volume
2016-01-04     NA  37912403
2016-01-05     -1  23258238
2016-01-06     -1  25096183
2016-01-07     -1  45172906
2016-01-08     -1  35402298
2016-01-11     -1  29932385
2016-01-12     -1  28395390
2016-01-13     -1  33410553
2016-01-14     -1  48658623
2016-01-15      1  46132781
2016-01-19      1  30998256
2016-01-20     -1  59051429
2016-01-21      1  30518939
2016-01-22      1  30495387
2016-01-25      1  32482015
2016-01-26     -1  26877080
2016-01-27     -1  58699359
2016-01-28      1 107475327
2016-01-29      1  62739548
2016-02-01      1  46132726

What I would like to achieve (without using a for loop) is to produce a vector of cumulative Volume that is reset every time the Signal changes. In addition, the values of Volume should be multiplied by the value of Signal, i.e. when Signal is -1, -Volume should be added to the current cumulative Volume. Based on similar questions on SO, I have tried

ave(df$a, cumsum(c(F, diff(sign(diff(df$a))) != 0)*df$Volume), FUN=seq_along) 

which produces the right grouping of Signal, but the Volume is not included for some reason. Without the reset, the solution is fairly straightforward (posted on SO):

require(data.table)
DT <- data.table(dt)
DT[, Cum.Sum := cumsum(Volume), by=Signal]

Does anyone know a dplyr or data.table kind of solution for both resetting and conditioning a cumulative sum? Thanks.

Recommended answer

This can be achieved by:

library(tidyverse)
library(data.table)

z %>%
  group_by(grp = rleid(Signal)) %>% # run id: advances every time Signal changes; group by it
  mutate(cum = Signal * cumsum(Volume)) %>% # signed cumulative sum within each run
  ungroup() %>% # ungroup so the grouping column can be removed
  select(-grp) # remove the grouping column by name

Or, without data.table, using rle:

z %>%
  mutate(rl = rep(seq_along(rle(Signal)$lengths), times = rle(Signal)$lengths)) %>% # run id built from run lengths
  group_by(rl) %>%
  mutate(cum = Signal * cumsum(Volume)) %>%
  ungroup() %>%
  select(-rl) # remove the helper column by name

#output
   date       Signal    Volume        cum
   <fct>       <int>     <int>      <int>
 1 2016-01-04     NA  37912403         NA
 2 2016-01-05     -1  23258238  -23258238
 3 2016-01-06     -1  25096183  -48354421
 4 2016-01-07     -1  45172906  -93527327
 5 2016-01-08     -1  35402298 -128929625
 6 2016-01-11     -1  29932385 -158862010
 7 2016-01-12     -1  28395390 -187257400
 8 2016-01-13     -1  33410553 -220667953
 9 2016-01-14     -1  48658623 -269326576
10 2016-01-15      1  46132781   46132781
11 2016-01-19      1  30998256   77131037
12 2016-01-20     -1  59051429  -59051429
13 2016-01-21      1  30518939   30518939
14 2016-01-22      1  30495387   61014326
15 2016-01-25      1  32482015   93496341
16 2016-01-26     -1  26877080  -26877080
17 2016-01-27     -1  58699359  -85576439
18 2016-01-28      1 107475327  107475327
19 2016-01-29      1  62739548  170214875
20 2016-02-01      1  46132726  216347601
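
Since the question also asks about data.table, here is a minimal sketch of how the same grouping idea might look in data.table alone. It is not part of the original answer. The object name DT and the eight-row subset of the question's data are assumptions chosen for illustration. Within a run Signal is constant, so cumsum(Signal * Volume) gives the same values as Signal * cumsum(Volume).

```r
library(data.table)

# Illustrative subset of the question's data: one NA row plus several sign changes
DT <- data.table(
  date   = as.Date(c("2016-01-04", "2016-01-13", "2016-01-14", "2016-01-15",
                     "2016-01-19", "2016-01-20", "2016-01-21", "2016-01-22")),
  Signal = c(NA, -1L, -1L, 1L, 1L, -1L, 1L, 1L),
  Volume = c(37912403L, 33410553L, 48658623L, 46132781L,
             30998256L, 59051429L, 30518939L, 30495387L)
)

# rleid() assigns a new id to every run of identical Signal values,
# so the cumulative sum restarts whenever Signal changes
DT[, cum := cumsum(Signal * Volume), by = rleid(Signal)]

print(DT)
```

Grouping directly on rleid(Signal) inside by means no helper column is created, so nothing has to be dropped afterwards.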

Data:

z <- read.table(text =      "date     Signal    Volume
           2016-01-04     NA  37912403
           2016-01-05     -1  23258238
           2016-01-06     -1  25096183
           2016-01-07     -1  45172906
           2016-01-08     -1  35402298
           2016-01-11     -1  29932385
           2016-01-12     -1  28395390
           2016-01-13     -1  33410553
           2016-01-14     -1  48658623
           2016-01-15      1  46132781
           2016-01-19      1  30998256
           2016-01-20     -1  59051429
           2016-01-21      1  30518939
           2016-01-22      1  30495387
           2016-01-25      1  32482015
           2016-01-26     -1  26877080
           2016-01-27     -1  58699359
           2016-01-28      1 107475327
           2016-01-29      1  62739548
           2016-02-01      1  46132726", header = TRUE)
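
For completeness, the ave() attempt from the question can probably be repaired in base R by building a run id from changes in Signal and applying cumsum, rather than seq_along, to the signed volume. The sketch below is an illustration, not the original answer's method. The names z_small, changed and run_id are hypothetical, and treating an NA comparison as a change is an assumption about how missing signals should be handled.

```r
# Illustrative subset of the data above (hypothetical name z_small)
z_small <- data.frame(
  Signal = c(NA, -1, -1, 1, 1, -1, 1, 1),
  Volume = c(37912403, 33410553, 48658623, 46132781,
             30998256, 59051429, 30518939, 30495387)
)

n <- nrow(z_small)

# TRUE wherever Signal differs from the previous row; the first row always starts a run
changed <- c(TRUE, z_small$Signal[-1] != z_small$Signal[-n])

# Comparisons involving NA yield NA; assumption: treat those as the start of a new run
changed[is.na(changed)] <- TRUE

# The running count of changes gives one id per run of equal Signal values
run_id <- cumsum(changed)

# Cumulative sum of the signed volume, restarted within each run
z_small$cum <- ave(z_small$Signal * z_small$Volume, run_id, FUN = cumsum)

print(z_small)
```

The key difference from the attempt in the question is that the grouping vector is kept separate from Volume, and the function applied within each group is cumsum.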
