Running count based on field in R
Question
I have a data set in this format:
User
1
2
3
2
3
1
1
Now I want to add a column named "Count" that holds a running count of each user's occurrences. I want the output in the format below.
User Count
1 1
2 1
3 1
2 2
3 2
1 2
1 3
I have a few solutions, but all of them are somewhat slow.
My data.frame currently has 100,000 rows and may soon grow to 1 million, so I need a solution that is also fast.
Recommended answer
You can use getanID from my "splitstackshape" package:
library(splitstackshape)
getanID(mydf, "User")
## User .id
## 1: 1 1
## 2: 2 1
## 3: 3 1
## 4: 2 2
## 5: 3 2
## 6: 1 2
## 7: 1 3
This is essentially an approach with "data.table" that looks something like the following:
library(data.table)
as.data.table(mydf)[, count := seq(.N), by = "User"][]
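If installing a package is not an option, the same per-group running count can be sketched in base R with ave(). This is not part of the original answer, and it is likely slower than the "data.table" approach on very large inputs. The data frame below is reconstructed from the sample in the question; the name `mydf` is assumed to match the one used in the answer's code.

```r
# Sample data reconstructed from the question (assumed to be what `mydf` holds).
mydf <- data.frame(User = c(1, 2, 3, 2, 3, 1, 1))

# ave() groups the row indices by User, applies seq_along() within each group,
# and returns the per-group counters in the original row order.
mydf$Count <- ave(seq_along(mydf$User), mydf$User, FUN = seq_along)

print(mydf)
```

For the sample data, the Count column comes out as 1, 1, 1, 2, 2, 2, 3, matching the output requested in the question.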