Cumulative number of unique values in a column up to current row

Question

I have a data frame, donorInfo, with donor information:

id        giftdate     giftamt
002       2001-01-05     25.00
033       2001-05-08     50.00
054       2001-09-22    125.00
125       2001-11-05     40.00
042       2001-12-04     75.00
...           ...         ...

I'd like to create a column that shows the cumulative number of unique donor IDs up to that date. I think it's something like:

donorInfo$numUnique <- apply/lapply (donorInfo, 1, FUN=nrow(unique(donorInfo$id)))

Unfortunately, this isn't working, and I'm wondering how to fix it. Thanks for any suggestions.

Recommended answer

You can do this with duplicated() and cumsum(). duplicated() flags every repeat of an earlier value, so !duplicated() is TRUE only at the first occurrence of each id; cumsum() then coerces that logical vector to 0/1 and keeps a running total:

# Example data.frame with some duplicated ids
df <- read.table(text="
id   giftdate giftamt
 2 2001-01-05      25
33 2001-05-08      50
 2 2001-09-22     125
33 2001-11-05      40
42 2001-12-04      75", header=TRUE)

cumsum(!duplicated(df$id))
# [1] 1 2 2 2 3
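
To get the column the question asks for, the same expression can be assigned back to the data frame. The sketch below is one way to do that, under two assumptions that are not in the original answer: the rows may not already be in date order (so they are sorted by giftdate first), and the sample rows here are made up for illustration, reusing the question's column names.

```r
# Hypothetical sample data using the question's column names;
# rows are deliberately out of date order.
donorInfo <- data.frame(
  id       = c("033", "002", "002", "042", "033"),
  giftdate = as.Date(c("2001-05-08", "2001-01-05", "2001-09-22",
                       "2001-12-04", "2001-11-05")),
  giftamt  = c(50, 25, 125, 75, 40),
  stringsAsFactors = FALSE
)

# Sort chronologically so that "up to the current row" means "up to that date".
donorInfo <- donorInfo[order(donorInfo$giftdate), ]

# !duplicated() is TRUE at the first appearance of each id;
# cumsum() turns that into a running count of distinct ids.
donorInfo$numUnique <- cumsum(!duplicated(donorInfo$id))

print(donorInfo)
```

If the data are already sorted by date, the order() step can be dropped. Note that cumsum(!duplicated(...)) depends on row order, so re-sorting the data frame afterwards means the column should be recomputed.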
