cumsum() up to and including current date in dplyr


Question

I want to calculate the cumulative sum of values over all dates up to and including the current date. The problem is that I have multiple entries for the same date, so if I use cumsum, rows that share a date get different running totals:

library(dplyr)
tribble(~date, ~value,
        "2017-01-01", 1,
        "2017-01-02", 2,
        "2017-01-02", 3,
        "2017-01-03", 4,
        "2017-01-03", 5,
        "2017-01-04", 6,
        "2017-01-09", 9) %>% 
  arrange(date) %>% 
  mutate(to_date = cumsum(value))

# A tibble: 7 x 3
        date value  to_date
       <chr> <dbl>    <dbl>
1 2017-01-01     1        1
2 2017-01-02     2        3
3 2017-01-02     3        6
4 2017-01-03     4       10
5 2017-01-03     5       15
6 2017-01-04     6       21
7 2017-01-09     9       30

Is there an elegant way to get the following output, where every row of a given date carries that date's final total?

# A tibble: 7 x 3
        date value  to_date
       <chr> <dbl>    <dbl>
1 2017-01-01     1        1
2 2017-01-02     2        6
3 2017-01-02     3        6
4 2017-01-03     4       15
5 2017-01-03     5       15
6 2017-01-04     6       21
7 2017-01-09     9       30

For various reasons (among other things, my table has many more fields) I cannot summarize by date before running the cumulative total. I (likely) need a growing window function that can calculate totals for time intervals.

Recommended answer

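We can group_by 'date' and then take the last 'to_date' within each group. Here df1 is assumed to be the data frame from the question, already arranged by date and with the running to_date column computed by cumsum: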

df1 %>%
    group_by(date) %>%
    mutate(to_date = last(to_date))
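
For completeness, here is a minimal end-to-end sketch that combines the question's data with the answer's approach. The names df1 and result are chosen here for illustration, and the trailing ungroup() is an addition not present in the answer; it only drops the grouping so that later steps operate on the whole table.

```r
library(dplyr)

# Sample data from the question; dates are ISO-formatted strings,
# so sorting them as text also sorts them chronologically.
df1 <- tribble(~date, ~value,
               "2017-01-01", 1,
               "2017-01-02", 2,
               "2017-01-02", 3,
               "2017-01-03", 4,
               "2017-01-03", 5,
               "2017-01-04", 6,
               "2017-01-09", 9) %>%
  arrange(date) %>%
  mutate(to_date = cumsum(value))        # row-by-row running total

result <- df1 %>%
  group_by(date) %>%
  mutate(to_date = last(to_date)) %>%    # spread each date's final running total to all its rows
  ungroup()                              # assumption: drop grouping for downstream steps

print(result)
```

This keeps every original row and column, so no summarizing by date is needed. It relies on the rows being arranged by date before cumsum runs, since last() picks whatever row happens to come last within each date group.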
