Count distinct by group: moving window

Published: 2020/10/22 18:37:51 | Tags: r, group-by, dplyr, sum, distinct

Question

Let's say I have a dataset containing visits to a hospital. My goal is to generate a variable that counts the number of unique patients the visitor has seen up to and including the date of each visit. I often work with dplyr's group_by, but this seems a little tricky. I guess I would have to use group_by, n_distinct, and sum, or some kind of moving-window command. The "goal" column below is what I need.

visitor  visitdt    patient  goal
125469   1/12/2018  15200    1
125469   1/19/2018  15200    1
125469   2/16/2018  15200    1
125469   2/23/2018  52607    2
125469   3/9/2018   52607    2
125469   3/16/2018  52607    2
125469   3/23/2018  15200    2
125469   3/29/2018  15200    2
125469   3/30/2018  20589    3
125469   4/6/2018   20589    3

Thanks, Marvin

Recommended answer

You can do the following:

with(df, ave(patient, visitor, FUN = function(x) cumsum(!duplicated(x))))

 [1] 1 1 1 2 2 2 2 2 3 3

Essentially, it is a cumulative sum of non-duplicated values per group.
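To see why this works, here is a minimal base R sketch that rebuilds the sample data from the question and splits the one-liner into its two steps. The data frame name `df` matches the answer; the intermediate names `first_seen` and `running_distinct` are only illustrative.

```r
# Rebuild the sample data from the question (one visitor, ten visits).
df <- data.frame(
  visitor = rep(125469, 10),
  visitdt = c("1/12/2018", "1/19/2018", "2/16/2018", "2/23/2018", "3/9/2018",
              "3/16/2018", "3/23/2018", "3/29/2018", "3/30/2018", "4/6/2018"),
  patient = c(15200, 15200, 15200, 52607, 52607,
              52607, 15200, 15200, 20589, 20589)
)

# Step 1: TRUE marks the first row on which each patient id appears.
first_seen <- !duplicated(df$patient)

# Step 2: the running total of first appearances is the distinct count so far.
running_distinct <- cumsum(first_seen)

# ave() applies the same two steps separately within each visitor.
res <- with(df, ave(patient, visitor, FUN = function(x) cumsum(!duplicated(x))))
```

Because the sample contains a single visitor, `running_distinct` and `res` hold the same values here; with several visitors, only the `ave()` version restarts the count for each one.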

And you can also do the same with dplyr:

library(dplyr)

# Within each visitor, keep a running count of first-time patients.
df %>%
  group_by(visitor) %>%
  mutate(res = cumsum(!duplicated(patient)))
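Both versions count in row order, so they match the "seen so far" requirement only when each visitor's rows are already sorted by visit date. The following is a hedged base R sketch of sorting first, assuming `visitdt` is stored as month/day/year text as in the sample. The data frame `visits`, its visitor ids, and its rows are made up for illustration.

```r
# Hypothetical unsorted data with two visitors; dates are m/d/Y text.
visits <- data.frame(
  visitor = c(1, 2, 1, 1, 2, 1),
  visitdt = c("2/23/2018", "1/5/2018", "1/12/2018",
              "3/30/2018", "1/19/2018", "1/19/2018"),
  patient = c(52607, 15200, 15200, 20589, 15200, 15200)
)

# Parse the text dates so that sorting is chronological, not alphabetical.
visits$visitdt <- as.Date(visits$visitdt, format = "%m/%d/%Y")

# Sort by visitor, then by visit date within each visitor.
visits <- visits[order(visits$visitor, visits$visitdt), ]

# Running count of distinct patients within each visitor.
visits$res <- with(visits, ave(patient, visitor,
                               FUN = function(x) cumsum(!duplicated(x))))
```

Sorting before counting keeps the result independent of the order in which the rows were loaded.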
