每个用户过去7天解决的问题数量的总和 [英] Rolling Sum for number of questions solved for past 7 days by every user

查看:70
本文介绍了每个用户过去7天解决的问题数量的总和的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

给出了 User_Id 解决日期 NumberOfQuestions 的结果对于每个用户,如何获取过去7天(当前日期除外)的滚动总和。我使用for循环和遍历每一行的方法解决了这个问题,但是正在寻找一种替代方式并且易于阅读代码,例如使用dplyr

Given User_Id, Solved_Date, and NumberOfQuestions solved for every User, how to obtain rolling Sum for Past 7 Days excluding the current Date. I solved this using for loop and looping over every Row but looking for an alternative way and easy to read code like using dplyr

Input:
fread("
USER_ID SOLVED_DATE QUESTIONS_SOLVED
          1 2019-05-01        1
          1 2019-05-02        1
          2 2019-05-02        1
          2 2019-05-03        1
          2 2019-05-04        1
          2 2019-05-05        1
          1 2019-05-06        1
          2 2019-05-06        1
          1 2019-05-07        1
          2 2019-05-07        1
          1 2019-05-08        1
          2 2019-05-08        1
          1 2019-05-09        1
          1 2019-05-10        1
          1 2019-05-11        1
          2 2019-05-11        1
          2 2019-05-12        1
          2 2019-05-13        1
          2 2019-05-14        1
          2 2019-05-15        1
          1 2019-05-16        1")

OutPut:
USER_ID SOLVED_DATE NUMBER_QUESTIONS_SOLVED PAST_7DAY_SOLVED_HISTORY
           1 2019-05-01        1                       0
           1 2019-05-02        1                       1
           2 2019-05-02        1                       0
           2 2019-05-03        1                       1
           2 2019-05-04        1                       2
           1 2019-05-05        1                       2
           2 2019-05-05        1                       3
           1 2019-05-06        1                       3
           2 2019-05-06        1                       4
           1 2019-05-07        1                       4
           2 2019-05-07        1                       5
           1 2019-05-08        1                       4
           2 2019-05-08        1                       6
           1 2019-05-09        1                       4
           1 2019-05-10        1                       5
           1 2019-05-11        1                       6
           2 2019-05-11        1                       4
           2 2019-05-12        1                       4
           2 2019-05-13        1                       4
           2 2019-05-14        1                       4
           2 2019-05-15        1                       4
           1 2019-05-16        1                       2


推荐答案

我们可以先将 SOLVED_DATE 转换为实际日期,然后使用 map_dbl purrr 中查找在过去7天内解决了多少个问题。

We could first convert SOLVED_DATE to actual dates and then use map_dbl from purrr to find how many questions are solved in the last 7 days.

library(dplyr)
library(purrr)

df %>%
  mutate(SOLVED_DATE = as.Date(SOLVED_DATE)) %>%
  group_by(USER_ID) %>%
  mutate(Order_History = map_dbl(SOLVED_DATE, 
        ~sum(QUESTIONS_SOLVED[(SOLVED_DATE > . - 7) & (SOLVED_DATE < .)])))

#   USER_ID SOLVED_DATE QUESTIONS_SOLVED Order_History
#     <int> <date>                 <int>         <dbl>
# 1       1 2019-05-01                 1             0
# 2       1 2019-05-02                 1             1
# 3       2 2019-05-02                 1             0
# 4       2 2019-05-03                 1             1
# 5       2 2019-05-04                 1             2
# 6       2 2019-05-05                 1             3
# 7       1 2019-05-06                 1             2
# 8       2 2019-05-06                 1             4
# 9       1 2019-05-07                 1             3
#10       2 2019-05-07                 1             5
# … with 11 more rows

这篇关于每个用户过去7天解决的问题数量的总和的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆